UPDATE: With CloverETL 4.3 released in summer 2016, we introduced Reader and Writer components for Salesforce Bulk API. You no longer need to follow the instructions below in order to read, insert, update or delete data in Salesforce. Read Salesforce Connector in CloverETL to get more details.
Salesforce offers two APIs to access and manipulate your cloud data: the SOAP API and the REST-based Bulk API. Although you can use both in CloverETL, you will most likely use CloverETL to move large volumes of data, so the Bulk API is probably your preferred approach.
To see why, let’s take a look at what the Salesforce documentation has to say about the Bulk API:
“Bulk API is based on REST principles and is optimized for loading or deleting large sets of data. You can use it to query, insert, update, upsert, or delete a large number of records asynchronously by submitting batches which are processed in the background by Salesforce.”
The documentation continues with suitable use cases:
“SOAP API, in contrast, is optimized for real-time client applications that update small numbers of records at a time. Although SOAP API can also be used for processing large numbers of records, when the data sets contain hundreds of thousands of records, it becomes less practical. Bulk API is designed to make it simple to process data from a few thousand to millions of records.”
SOAP Salesforce API – A Bad Idea for Batch Processing
In my experience, the above excerpt proves to be more than true. Using SOAP calls to insert contacts into the database can be convenient for interactive applications; however, when you call insert for every record in a batch, throughput drops to only a few calls per second (usually around 10), which is clearly not viable for large data sets. The same holds for the SOAP version of the query() call. It is powerful, but designed for highly specific queries returning perhaps hundreds of results, which makes it great for searches. On top of the performance issues, there is a hard limit of 2,000 items returned per call. In practice the number is typically much lower, usually between 50 and 1,500 per call depending on record size. Although repeating the call with a paging offset is possible, it’s an unnecessary hassle.
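To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only restates the figures quoted above (about 10 calls per second, at most 2,000 rows per query call); the function names and the one-million-record data set are made up for illustration.

```python
import math


def per_record_insert_hours(record_count, calls_per_second=10.0):
    """Hours needed when every record costs one SOAP insert call."""
    return record_count / calls_per_second / 3600.0


def query_calls_needed(result_count, rows_per_call=2000):
    """Round trips needed to page through a result set of the given size."""
    return math.ceil(result_count / rows_per_call)


if __name__ == "__main__":
    records = 1_000_000  # hypothetical data set size
    # One insert call per record at roughly 10 calls per second.
    print(round(per_record_insert_hours(records), 1), "hours of inserts")
    # Best case: the hard limit of 2,000 rows per call.
    print(query_calls_needed(records), "query calls at 2,000 rows each")
    # Pessimistic case: only 50 rows per call for large records.
    print(query_calls_needed(records, 50), "query calls at 50 rows each")
```

Under these assumptions, a million per-record inserts would take more than a day, and reading the same number of rows would need somewhere between 500 and 20,000 paged calls.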
Bulk Salesforce API
On the other hand, Bulk API works on batches of either CSV or XML data and is extremely efficient for both inserts and queries. It has limits too, but it is much better suited to batch processing. Most notably, a query can return up to 15 files of 1 GB each, for a total result set of 15 GB. Refer to the Bulk API limits page for further details, including the limitations of SOQL queries used in Bulk API.
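Since Bulk API consumes data in batches, the client has to cut its records into batch-sized chunks before submitting them. The following Python sketch shows one way to do that for CSV. The helper name csv_batches and the default of 10,000 records per batch are assumptions made for this example, not values taken from this article; check the Bulk API limits page for the limits that actually apply, and note that submitting the batches to Salesforce is not shown.

```python
import csv
import io


def csv_batches(header, rows, max_records=10000):
    """Yield CSV strings, each with the header and at most max_records rows.

    max_records is an assumed per-batch record limit; adjust it to the
    limits documented by Salesforce.
    """
    for start in range(0, len(rows), max_records):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(header)  # every batch repeats the header line
        writer.writerows(rows[start:start + max_records])
        yield buf.getvalue()


if __name__ == "__main__":
    # Hypothetical contact records, split into batches of two for the demo.
    contacts = [["Ada", "Lovelace"], ["Alan", "Turing"], ["Grace", "Hopper"]]
    for number, batch in enumerate(csv_batches(["FirstName", "LastName"], contacts, 2)):
        print("batch", number)
        print(batch)
```

Each yielded string is a complete CSV document, so the batches can be handed to whatever upload step you use independently of one another.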
November 29, 2013