Andy in the Cloud

From BBC Basic to Force.com and beyond…

Generic Native SObject Data Loader


For a while now I’ve wanted to explore using the Apex JSON serialize and deserialize support with SObjects. I wanted to see whether it was possible to build a generic import/export solution natively, one that didn’t just handle a single record but could also follow relationships to other records, such as master-detail and lookup relationships. In effect, a means to extract and import a set of related records as one ‘bundle’ of record sets. With the flexibility of something like this, one could consider a number of uses…

  • Import/Export. Provide Import and Export solutions directly in a native application without having to use Data Loader.
  • Generic Super Clone. Provide a super Clone button that can be applied to any object (without further coding), that not only clones the master record but any related child detail records.
  • Post Install Setup. Populate default data / configuration as part of a Package post installation script.
  • Org 2 Org Data Transfer. Pass data between orgs, perhaps via an attachment on an email that is sent by Apex code in another org and received by an Apex Email Handler?

For me, these are some interesting use cases that touch upon a number of topics I see surfacing now and again on Salesforce StackExchange and in my own job. However, before having a crack at those, I felt I first needed to crack open the JSON support in Apex in a generic way, or at least with some minimal config.

Background and Credit

A fellow developer, Agustina García, had recently done some excellent research into Apex JSON support. She found that JSON.serialize does in fact work with SObjects containing related child records (queried using subqueries), outputting the child records within the JSON as nested JSON objects. Sadly, however, the JSON.deserialize method ignores them. The solution was to utilise some wrapper Apex classes to contain the SObject records and provide the necessary relationship structure. This gave me the basis to develop something more generic that would work with any object.

Introducing SObjectDataLoader

Like the JSON class, the SObjectDataLoader class has two main methods, ‘serialize‘ and ‘deserialize‘. To serialize records, all you need to give it is a set of Ids. It will automatically discover the fields to include and query the records itself, seeking out (based on some default rules) related child and lookup relationship records to ‘follow’ as part of the recursive serialisation process. The output is a serialised RecordsBundle. Due to how the serialisation process executes, the RecordSetBundles it contains are output in dependency order.

Consider the following objects and relationships with a mix of master-detail and lookup relationships as shown using the Schema Builder tool.

Here is a basic example (utilising auto configure mode) that will export records in the above structure, when given the Id of the root object, Object A.

String serialisedData =
  SObjectDataLoader.serialize(
      new Set<Id> { 'a00d0000007kUms' });

The abbreviated resulting JSON looks like this (see full example here)…

{
    "RecordSetBundles": [
        {
            "Records": [
                {
                    "Id": "a00d0000007kUmsAAE",
                    "Name": "Object A Record"
                }
            ],
            "ObjectType": "ObjectA__c"
        },
        {
            "Records": [
                {
                    "Id": "a03d000000EHi6tAAD",
                    "Name": "Object D Record"
                }
            ],
            "ObjectType": "ObjectD__c"
        },
        {
            "Records": [
                {
                    "Id": "a01d0000006JdysAAC",
                    "Name": "Object B Record",
                    "ObjectA__c": "a00d0000007kUmsAAE",
                    "ObjectD__c": "a03d000000EHi6tAAD"
                }
            ],
            "ObjectType": "ObjectB__c"
        },
        {
            "Records": [
                {
                    "Id": "a04d00000035cFAAAY",
                    "Name": "Object E Record"
                }
            ],
            "ObjectType": "ObjectE__c"
        },
        {
            "Records": [
                {
                    "Id": "a02d000000723fvAAA",
                    "Name": "Object C Record",
                    "ObjectB__c": "a01d0000006JdysAAC",
                    "ObjectE__c": "a04d00000035cFAAAY"
                }
            ],
            "ObjectType": "ObjectC__c"
        }
    ]
}
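
The shape of this JSON maps naturally onto a pair of wrapper classes. The following is a minimal sketch of that idea, with class and field names inferred from the JSON keys above. BundleSketch and its wrap method are hypothetical names used for illustration, and the real classes inside SObjectDataLoader may well differ.

```apex
// Hypothetical sketch: wrapper classes mirroring the JSON keys shown above.
// Not the library's actual code.
public class BundleSketch {

    // One set of records, all of a single SObject type
    public class RecordSetBundle {
        public String ObjectType;
        public List<SObject> Records;
    }

    // Top-level container, holding the record sets in dependency order
    public class RecordsBundle {
        public List<RecordSetBundle> RecordSetBundles =
            new List<RecordSetBundle>();
    }

    // Wraps a non-empty list of same-type records into a record set,
    // taking the object name from the first record
    public static RecordSetBundle wrap(List<SObject> records) {
        RecordSetBundle recordSet = new RecordSetBundle();
        recordSet.ObjectType =
            records[0].getSObjectType().getDescribe().getName();
        recordSet.Records = records;
        return recordSet;
    }
}
```

An instance of such a container can be handed to JSON.serialize as a whole, which gives JSON.deserialize a concrete Apex type to target when reading the data back.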

The ‘deserialize‘ method takes a JSON string representing a RecordsBundle (the output from the ‘serialize’ method). It inserts the records in the required order and takes care of recreating the relationships as each record set is processed (as you can see, the original Ids are retained in the JSON to aid this process).

Set<Id> recordAIds =
    SObjectDataLoader.deserialize(serialisedData);
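
The relationship fix-up during import can be pictured as a lookup from old Id to new Id, applied to each reference field before a record set is inserted. Here is a minimal sketch of that step, assuming a hypothetical helper class IdRemapSketch (not the library's actual code) and a caller who supplies both the reference fields and the map of records already inserted.

```apex
// Hypothetical sketch of the old-Id to new-Id fix-up step.
public class IdRemapSketch {

    // Rewrites reference fields that point at an old Id so they point at
    // the corresponding new Id. Returns how many values were rewritten.
    public static Integer remap(List<SObject> records,
                                List<Schema.SObjectField> referenceFields,
                                Map<Id, Id> oldToNewIds) {
        Integer rewritten = 0;
        for (SObject rec : records) {
            for (Schema.SObjectField refField : referenceFields) {
                Id oldReference = (Id) rec.get(refField);
                // Leave the value alone if the parent has not been mapped
                if (oldReference != null
                        && oldToNewIds.containsKey(oldReference)) {
                    rec.put(refField, oldToNewIds.get(oldReference));
                    rewritten++;
                }
            }
        }
        return rewritten;
    }
}
```

Because parents appear before children in the bundle, the map would already hold the parent's new Id by the time a child record set is processed.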

The auto configuration route has its limitations and assumptions, so a manual configuration mode is also available, which can be used if you know more about the objects. You can also combine both configuration modes. The test methods in the class utilise some more advanced usage. For example…

String serializedData =
    SObjectDataLoader.serialize(createOpportunities(),
       new SObjectDataLoader.SerializeConfig().
        // Serialize any related OpportunityLineItem's
        followChild(OpportunityLineItem.OpportunityId).
          // Serialize any related PricebookEntry's
          follow(OpportunityLineItem.PricebookEntryId).
            // Serialize any related Products
            follow(PricebookEntry.Product2Id).
            // Skip UnitPrice in favour of TotalPrice
            omit(OpportunityLineItem.UnitPrice));
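
Chained calls like these work because each configuration method returns the config object itself. A minimal sketch of that pattern follows, using a hypothetical MiniConfig class that only records the fields it is given. It is an illustration of the technique, not the library's SerializeConfig.

```apex
// Hypothetical sketch of a fluent configuration object.
public class MiniConfig {

    public List<Schema.SObjectField> followed =
        new List<Schema.SObjectField>();
    public List<Schema.SObjectField> omitted =
        new List<Schema.SObjectField>();

    // Registers a relationship field to follow, then returns this
    // instance so that further calls can be chained
    public MiniConfig follow(Schema.SObjectField relationshipField) {
        followed.add(relationshipField);
        return this;
    }

    // Registers a field to leave out of the output, then returns this
    public MiniConfig omit(Schema.SObjectField fieldToSkip) {
        omitted.add(fieldToSkip);
        return this;
    }
}
```

With this in place, something like new MiniConfig().follow(Contact.AccountId).omit(Contact.Description) builds the whole configuration in a single expression.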

The deserialize method in this case takes an implementation of SObjectDataLoader.IDeserializerCallback. This allows you to intercept the deserialization process and populate any references before the objects are inserted into the database. Useful if the exported data is incomplete.

Set<Id> resultIds =
   SObjectDataLoader.deserialize(
      serializedData, new ApplyStandardPricebook());

Some interesting aspects of the implementation…

  • It discovers the SObjectType via the new method Id.getSObjectType.
  • It uses the SObject.clone method to remove the old Id (after making a note of it for later reference) from the deserialized SObject allowing it to be inserted.
  • The native JSON.serialize and JSON.deserialize are used only once to serialise and deserialize the internal RecordsBundle object.
  • The SerializeConfig object uses the Fluent API model.
  • JSONLint rocks!
  • Unit tests with over 95% code coverage are included in the same class, so the single class is ready to drop in your projects!
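
The first two points can be sketched in a few lines. This is only an illustration, assuming a hypothetical helper class CloneSketch. It relies on the standard Id.getSObjectType and SObject.clone methods and is not taken from the library source.

```apex
// Hypothetical sketch: discovering the object type from an Id, and
// clearing the Id from a record so it can be inserted as a new row.
public class CloneSketch {

    // Returns the API name of the object the given Id belongs to
    public static String objectNameOf(Id recordId) {
        return recordId.getSObjectType().getDescribe().getName();
    }

    // Returns a copy of the record without its Id. The caller should
    // note the original Id first if it is needed for later reference.
    public static SObject stripId(SObject original) {
        // Arguments: preserveId, isDeepClone,
        // preserveReadonlyTimestamps, preserveAutonumber
        return original.clone(false, true, false, false);
    }
}
```

The original record is left untouched, so its Id remains available for building an old-to-new mapping once the copy has been inserted.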

Summary

I hope this will be of use to people, and I would love to hear about more use cases and/or offers to help extend it (see the TODO’s below if you fancy helping) or fix any bugs, to grow it further. Maybe one day Salesforce might even consider implementing it in the Apex runtime! In the meantime, enjoy! GitHub repo link.

TODO’s

This is an initial release that I think will suit some use cases within the current tolerances of the implementation. As always, life and work commitments prevent me from spending as much time on this as I would really like. So here are a few TODO’s, until next time…

  • Support Recursive Lookup References, e.g. Custom Object A has a lookup to Custom Object A.
  • Scalability. Improve scalability via Batch Apex.
  • More Extensibility and Callbacks. I have added a deserialization Apex callback interface to dynamically resolve required references that have not been included in the serialised output. This can be extended further to customise additional fields or change existing ones.
  • Optimise queries and ‘fields’ usage (for repeat references to the same object further down the follow chain).

19 thoughts on “Generic Native SObject Data Loader”

  1. Looking forward to testing on Bank Format!!

  2. Great work. Thanks for sharing. Is your code only serializing an object’s children, or also all direct and indirect parents? It would be really cool to have a tool where a user just says: I need to export that object, please crawl the relationships for me, find everything that is connected and export it to JSON.

    • Hi Robert, this is indeed exactly what it does. If its auto crawl is not what you want you can then tweak it via the config object. Or if you prefer you can teach it from scratch what to crawl. Thanks.

      • Object D and E are examples of this kind of relationship in fact. 🙂

      • I meant more an object like Object B. How would I have to call your code to get Object B including all (in a way required) children and parents?

      • The auto crawl code will eventually stop crawling the references, to avoid loops and, ultimately, governor errors. However, the object model shown in the post is supported, in that Object B’s children (Object C) will get included by default. Note also that if you want to capture more root level records, and hence their related records, just pass more Ids into the top level call. If you have a diagram, such as the one I show via the Schema Builder, that you want me to take a look at, please feel free to send it and I’ll let you know whether the default crawling will handle it or whether you will need to give it more hints.

      • Hi Andrew, I have a problem with more than 3 child relationships in the data loader class. Please see my question and suggest a solution. Thanks.
        https://salesforce.stackexchange.com/questions/198957/sobjectdataloader-class-is-not-able-to-search-relationship-for-more-than-3-child

      • I would recommend you provide more information. What have you tried? Different numbers of levels? What is returned from the method? Have you debugged the code?

  3. The structure is like this: https://dl.dropbox.com/u/240888/uml.png
    I would like to pass the Id of the top P object and get the whole tree, no matter how deep and no matter whether the children are MDRs or lookups. Is that possible?

    BTW: Thanks for your help and your great code contributions 🙂

    • This does not appear that different from the example I used in the blog post, where I used a three level deep master-detail with some lookup relationships. Have you given it a try? The class is self contained (complete with tests) and requires a single line to invoke it with the top P object Id. Let me know how you get on. If you have any issues, raise them on the GitHub Issues and we can work together to resolve them! https://github.com/afawcett/apex-sobjectdataloader/issues

  4. This looks good, but I get an “Apex governor limit” error when I pass a single Account Id:

    sfcloud1:Number of SOQL queries: 80 out of 100
    sfcloud1:Number of fields describes: 101 out of 100
    sfcloud1:Number of child relationships describes: 67 out of 100

    Am I doing something wrong here? I created an issue on GitHub as well.

    Thank you for your time.

    Mitesh

    • The auto configure option follows relationships from the root object, so I assume you have a lot of those hanging from Account. You will need to manually configure it, or adjust it to omit some relationship paths you don’t want it to follow. Can you dump the config object state via JSON serialize? This should give you an idea of what the auto config mode is trying to do. By the way, can you share the link to the GitHub issue you raised?

  5. Did you ever find the time to improve your fabulous solution regarding scalability (meaning more records using Batch Apex)? I just had a problem where I would have loved to use your code but failed miserably due to Large Data Volumes (10.000 -100.000 related records).

    • Not yet, though we happen to be looking at a project internally that needs this from the library too. Once we get it done I will look into sharing.

  6. Just merged a pull request providing an excellent contribution from Sonal4J, updating the library to support self references. From the developer: “added ‘LastViewedDate’, ‘LastReferencedDate’ to the whitelist. It also allows to add user to serialize List containing Id’s of different objects . While deserializing Self reference fields relationship is maintained. Test methods are added to test class.”

  7. Andrew – That’s indeed great work. From your blog, I came to understand that we need to pass a set of Ids to the serialize method. Something like this:
    Set<Id> clonedMasterIds =
    SObjectDataLoader.deserialize( SObjectDataLoader.serialize(masterIdsToDeepClone) );
    However, when I do that, I get some “implementation error”. Can you tell me how to fix this?
    Also, will it clone the master object, child and grandchild objects? Any chance of hitting the governor limits?

    • Yes, it will support master, child and grandchild. And yes, with a lot of data the governors will kick in. If you’re getting an error, can you please raise a GitHub Issue for it?

  8. Hi Andy,

    Can you help me get the product names of an opportunity through dynamic Apex?
