Having started to read this blog, you might be thinking that the primary use case of the Data Replication APIs is to provide replication. Well, yes and no! No in the sense that it won’t replicate your data for you; in fact it won’t even return your data. What it will do is tell you which record Ids have been updated or deleted within a certain time frame. What happens next is up to you. You don’t even have to do any replication if you don’t want to!
As someone who loves to keep things as native as possible, I found it quite cool, when answering this StackExchange post, to discover that this API is actually available to Apex developers! The API consists of two Database class methods, getUpdated and getDeleted. The former returns the Ids of new and updated records; it’s up to you to decide whether a record is new or existing, depending on whether you’ve seen its Id before.
Note: The process around these APIs is explained in more detail in the SOAP Developers Guide, so it’s worth reading this along with the Apex Developers Guide references. The general polling process is described here and limits and considerations here.
The APIs could not be easier to use. The following shows how to pull a list of record Ids for updates made to records of a given object in the last hour.
Database.GetUpdatedResult r = Database.getUpdated( 'MyObject__c', Datetime.now().addHours(-1), Datetime.now());
Neither method appears to consume any query or DML governor limits, though as noted above they have some limits of their own. The GetUpdatedResult (described in more detail in the SOAP API documentation) contains a latestDateCovered field. This value should be retained and used as the start date in subsequent getUpdated calls.
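To make the idea of retaining latestDateCovered concrete, here is a minimal sketch. The class name UpdatedPoller and the method poll are hypothetical, and where the returned value gets stored between runs (a custom setting, a custom object, and so on) is left to the caller as an assumption.

```apex
// Hypothetical helper class; the names are illustrative only.
public class UpdatedPoller {
    // Polls for changes since lastCovered and returns the value to
    // retain as the start date for the next call.
    public static Datetime poll(String objectName, Datetime lastCovered) {
        // The start/end dates are subject to the limits described in
        // the SOAP API documentation referenced above.
        Database.GetUpdatedResult r =
            Database.getUpdated(objectName, lastCovered, Datetime.now());

        // Only Ids are returned, not the record data itself.
        for (Id recordId : r.getIds()) {
            System.debug('New or updated record: ' + recordId);
        }

        // Persist this and pass it back in as lastCovered next time.
        return r.getLatestDateCovered();
    }
}
```

A caller would pass in the value saved from the previous run, for example `Datetime next = UpdatedPoller.poll('MyObject__c', previous);`, and store `next` for the following run.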
It’s unclear what Salesforce means by safety in the documentation, though the note relating to long-running batch processes does make sense. This aspect highlights a key difference from attempting to query what’s changed yourself based on the audit fields. The getDeleted method works in the same way.
It’s worth considering these APIs as a possible, lighter alternative to using Apex Triggers or UIs to invoke custom behaviour when records are manipulated. Unlike Triggers, you don’t know the specific changes, though Field Audit information could be queried if needed.
So whether it’s replication to another cloud via an Apex HTTP callout or simply aggregating record activity into some org-wide stats, having Apex support allows you to keep things native, using an Apex Scheduled job to poll this API as often as you wish. With the advent of External Objects, aka Lightning Connect, other native possibilities start to present themselves…
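For comparison, a getDeleted call over the same one-hour window might look like the sketch below. It mirrors the getUpdated example and, like it, uses MyObject__c as a placeholder object name.

```apex
// Ask which MyObject__c records were deleted in the last hour.
Database.GetDeletedResult d = Database.getDeleted(
    'MyObject__c', Datetime.now().addHours(-1), Datetime.now());

// Each entry carries the record Id and the date it was deleted.
for (Database.DeletedRecord dr : d.getDeletedRecords()) {
    System.debug('Deleted ' + dr.getId() + ' on ' + dr.getDeletedDate());
}

// As with getUpdated, retain this for the next call.
Datetime nextStart = d.getLatestDateCovered();
```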
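As a rough sketch of the scheduled polling idea, the class below checks both methods once per run. The class name ReplicationPollJob, the job name and the fixed one-hour window are assumptions for illustration; a real job would more likely start from a retained latestDateCovered value.

```apex
// Hypothetical scheduled job; names and the one-hour window are illustrative.
public class ReplicationPollJob implements Schedulable {
    public void execute(SchedulableContext ctx) {
        Datetime endTime = Datetime.now();
        Datetime startTime = endTime.addHours(-1);

        Database.GetUpdatedResult updated =
            Database.getUpdated('MyObject__c', startTime, endTime);
        Database.GetDeletedResult deleted =
            Database.getDeleted('MyObject__c', startTime, endTime);

        System.debug('Updated: ' + updated.getIds().size()
            + ', deleted: ' + deleted.getDeletedRecords().size());

        // Any HTTP callout would need to be handed off to asynchronous
        // Apex (for example a Queueable or @future(callout=true) method),
        // since scheduled Apex cannot make synchronous callouts.
    }
}
```

It could then be scheduled to run at the top of every hour with `System.schedule('Replication poll', '0 0 * * * ?', new ReplicationPollJob());`.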
March 27, 2016 at 3:27 am
Thanks Andy for pointing out the Replication API methods. Question, if I were just needing to replicate data from Salesforce to another system, wouldn’t querying the object directly save me an extra SOQL query? I’d have the data to then push to wherever.
I’m struggling to think of use cases where I’d want to start with just knowing the IDs that changed. I’m sure there are some use cases else the API wouldn’t exist.
The getDeleted() makes sense to me since those records aren’t included in standard SOQL query without the ALL ROWS keyword. If you’re replicating data you would now know which records to delete in the external system.
If the getUpdated() and getDeleted() methods returned all IDs regardless of object type I think that would be interesting, that certainly would reduce the number of queries needed to identify what’s going on in the system.
March 27, 2016 at 8:39 am
Great question Doug, I’ve updated the blog with some further comments that hopefully answer this. Let me know.
March 27, 2016 at 8:55 pm
Thanks Andy. Sounds like this API may be more robust around the start/end dates to ensure developers know the correct IDs that have changed rather than relying on the audit fields.
If I were to use the Replication API off the platform, say in an ETL tool, then it would be a tad bit more burdensome to chunk up the returned list of IDs and construct a comma delimited string of IDs to include in my WHERE clause that ultimately queries for the record data. I guess that’s then weighing the pro/con of how certain the application needs to be on the record update dates vs. developer convenience in making the queries for the data.
March 27, 2016 at 8:58 pm
Yes, agreed. There is also Heroku Connect; I believe this uses the Streaming API, though until recently this was not production reliable.
March 28, 2016 at 12:39 pm
I had asked this question on StackExchange and superfell provided the excellent answer:
My concern was the scenario of overlapping (concurrent) transactions and keeping a high water mark (datetime) in my external app — I was concerned about missing an in-flight transaction if a shorter, more recent transaction completed first.
The docs you link to are good but it would be nice if SFDC provided a more illustrative example to explain “safety”
March 28, 2016 at 3:38 pm
Yes, agreed. Let me see if I can get some more thoughts on this.