Salesforce offers several APIs for accessing and manipulating your cloud data. Two of them matter here: the SOAP API and the REST-based Bulk API. Although you can use both in CloverETL, you will most likely use CloverETL to process large volumes of data, so the Bulk API is usually the better fit.
To see why, let’s look at what the Salesforce documentation says about the Bulk API:
“Bulk API is based on REST principles and is optimized for loading or deleting large sets of data. You can use it to query, insert, update, upsert, or delete a large number of records asynchronously by submitting batches which are processed in the background by Salesforce.”
The documentation then contrasts this with the SOAP API:
“SOAP API, in contrast, is optimized for real-time client applications that update small numbers of records at a time. Although SOAP API can also be used for processing large numbers of records, when the data sets contain hundreds of thousands of records, it becomes less practical. Bulk API is designed to make it simple to process data from a few thousand to millions of records.”
In my experience, the excerpt holds up in practice. Using SOAP calls to insert contacts can be convenient for interactive applications; however, calling insert once for every record in a batch yields only a few calls per second (usually around 10), which is not an option for large loads. The same holds for the SOAP version of the query() call. It is powerful, but designed for highly specific queries returning perhaps hundreds of results, which makes it a great option for searches. On top of the performance issues, there is a hard limit of 2,000 items returned per call. In practice the number is typically much lower, usually between 50 and 1,500 per call depending on record size. Although repeating the call with a paging offset is possible, it’s an unnecessary hassle.
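To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes one SOAP call per inserted record at the roughly 10 calls per second mentioned above, and a result set of one million records; the function names and the one-million figure are illustrative assumptions, not measurements.

```python
import math


def per_record_insert_hours(record_count, calls_per_second=10.0):
    """Hours needed when every record costs one SOAP insert call."""
    return record_count / calls_per_second / 3600.0


def query_calls_needed(result_count, rows_per_call):
    """Number of query round trips needed to page through a result set."""
    return math.ceil(result_count / rows_per_call)


# Assumed workload: one million records.
print(round(per_record_insert_hours(1_000_000), 1))  # 27.8 (hours)
print(query_calls_needed(1_000_000, 2_000))          # 500 calls at the hard limit
print(query_calls_needed(1_000_000, 50))             # 20000 calls at 50 rows per call
```

Even in the best case, paging through a million rows means hundreds of round trips, and a per-record insert loop runs for more than a day.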
On the other hand, the Bulk API works on batches of either CSV or XML data and is extremely efficient for both inserts and queries. It has limits too, but it is much better suited to batch processing. Most notably, a query can return up to 15 files of 1 GB each, for a total result set of 15 GB. Refer to the Bulk API limits page for further details, including the limitations on SOQL queries used in the Bulk API.
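As a minimal sketch of what "working on batches of CSV data" means on the client side, the following Python snippet splits a list of records into CSV payloads of bounded size. It does not call Salesforce at all. The helper name to_csv_batches, the sample field names, and the default batch size of 10,000 are assumptions for illustration; check the Bulk API limits page for the actual per-batch limits.

```python
import csv
import io


def to_csv_batches(records, fieldnames, batch_size=10_000):
    """Yield CSV strings, each with a header row and at most batch_size records.

    batch_size is an assumed default; verify it against the Bulk API limits.
    """
    for start in range(0, len(records), batch_size):
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records[start:start + batch_size])
        yield buffer.getvalue()


# Hypothetical sample data: five contacts split into batches of two.
contacts = [{"FirstName": f"Test{i}", "LastName": "User"} for i in range(5)]
batches = list(to_csv_batches(contacts, ["FirstName", "LastName"], batch_size=2))

print(len(batches))  # 3
print(batches[0])
```

Each yielded string would then be submitted as one batch, so the number of round trips grows with the number of batches rather than with the number of records.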
November 29, 2013