Andy in the Cloud

From BBC Basic to Force.com and beyond…


49 Comments

Handling Office Files and Zip Files in Apex – Part 2

In part 1 of this blog I talked about using the Salesforce Metadata API, Static Resources the PageReference.getContent method to implement a native unzip. This blog completes the series by introducing two Visualforce Components (and subcomponents) that provide full zip and unzip capabilities. By wrapping the excellent JSZip library behind the scenes. I gave myself three main goals…

  • Easy of use for none JavaScript Developers
  • Abstraction of the storage / handling of file content on the server
  • Avoiding hitting the Viewstate Governor!

While this audience and future developers I hope will be the judge of the first, the rest was a matter of selecting JavaScript Remoting. As the primary means to exchange the files in the zip or to be zipped between the client and server.

Unzip Component

The examples provided in the Github repo utilise a Custom Object called Zip File and Attachments on that record to manage the files within. As mentioned above, this does not have to be the case, you could easily implement something that dynamically handles and/or generates everything if you wanted!

As Open Office files are zip files themselves, the following shows the result of using the Unzip Demo to unzip a xlsx file. The sheet1.xml Attachment is the actual Sheet in the file, in its uncompressed XML form. Thus unlocking access to the data within! From this point you can parse it and update it ready for zipping.

Screen Shot 2012-12-08 at 10.20.47

As a further example, the following screenshot shows the results of unzipping the GitHub repo zip download

Screen Shot 2012-12-08 at 10.10.12

The c:unzipefile component renders a HTML file upload control to capture the file. Once the user selects a file, the processing starts. It then uses the HTML5 file IO support (great blog here) to read the data and pass it to the JSZip library. This is all encapsulated in the component of course! The following shows the component in action on the Unzip Demo page.

<c:unzipfile name="somezipfile" oncomplete="unzipped(state);"
 onreceive=
 "{!$RemoteAction.UnzipDemoController.receiveZipFileEntry}" />

Screen Shot 2012-12-08 at 10.09.02

As mentioned above it uses JavaScript Remoting, however as I wanted to make the component extensible in how the file entries are handled. I have allowed the page to pass in the RemoteAction to call back on. Which should look like this..

@RemoteAction
 public static String receiveZipFileEntry(
    String filename, String path, String data, String state)

In addition to the obvious parameters, the ‘state’ parameter allows for some basic state management between calls. Since of course JavaScript Remoting is stateless. Basically what this method returns is what gets sent back in the ‘state’ parameter on subsequent calls. In the demo provided, this contains the Id of the ZipFile record used to store the incoming zip files as attachments.

The other key attribute on the component is ‘oncomplete’. This can be any fragment of JavaScript you choose (the ‘state’ variable is in scope automatically). In the demo provided it calls out to an Action Function to invoke a controller method to move things along UI flow wise, in this case redirect to the Zip File record created during the process.

Zip Component

You may have noticed on the above screenshots I have placed a ‘Zip’ custom button on the Zip File objects layout. This effectively invokes the Zip Demo page. The use cases here is to take all the attachments on the record, zip them up, and produce a Document record and finally redirect to that for download.

Screen Shot 2012-12-09 at 10.16.11

Screen Shot 2012-12-09 at 10.17.02

The c:zipfile component once again wraps the JSZip library and leverages JavaScript Remoting to request the data in turn for each zip file entry. The page communicates the zip file entries via the c:zipentry component. These can be output explicitly at page rendering time (complete with inline base64 data if you wish) or empty leaving the component to request them via JS Remoting.

<c:zipfile name="someZipfile" state="{!ZipFile__c.Id}" 
  oncomplete="receiveZip(data);"
  getzipfileentry=
     "{!$RemoteAction.ZipDemoController.getZipFileEntry}">
   <apex:repeat value="{!paths}" var="path">
     <c:zipentry path="{!path}" base64="true"/>
   </apex:repeat>
</c:zipfile>

This component generates a JavaScript method on the page based on the name of the component, e.g. someZipfileGenerate. This must be called  at somepoint by the page to start the zip process.

The action method in the controller needs to look like this.

@RemoteAction
 public static String getZipFileEntry(String path, String state)

Once again the ‘state’ parameter is used. Except in this case it is only providing what was given to the c:zipfile component initially, it cannot be changed. Instead the method returns the Base64 encoded data of the requested zip file entry.

Finally the ‘oncomplete’ attributes JavaScript is executed and the resulting data is passed back (via apex:ActionFunction) to the controller for storage (not the binding in this case is transient, always good to avoid Viewstate issues when receiving large data) and redirects the user to the resulting Document page.

Summary

Neither components are currently giving realtime updates on the zip entries they are processing, so as per the messaging in the demo pages the user has to wait patiently for the next step to occur. Status / progress messages is something that can easily be implemented within the components at a future date.

These components utilise some additional components, I have not covered. Namely the c:zip and c:unzip components. If you have been following my expliots in the Apex Metdata API you may have noticed early versions of these in use in those examples, check out the Deploy and Retrieve demos in that repo.

I hope this short series on Zip file handling has been useful to some and once again want to give a big credit to the JSZip library. If you want to study the actual demo implementations more take a look at the links below. Thanks and enjoy!

Links


19 Comments

Generic Native SObject Data Loader

I’ve been wanting for a while now to explore using the Apex JSON serialize and deserialize support with SObject’s. I wanted to see if it was possible to build a generic native import/export solution natively, that didn’t just handle a single record was able to follow relationships to other records. Such as master-detail and lookup relationships. Effectively to give a means to extract and import a set of related records as one ‘bundle’ of record sets. With the flexibility of something like this, one could consider a number of uses…

  • Import/Export. Provide Import and Export solutions directly a native application without having to use Data Loader.
  • Generic Super Clone. Provide a super Clone button that can be applied to any object (without further coding), that not only clones the master record but any related child detail records.
  • Post Install Setup. Populate default data / configuration as part of a Package post installation script.
  • Org 2 Org Data Transfer. Pass data between orgs, perhaps via an attachment in an email received via an Apex Email Handler that receives an email from Apex code in anthor org?

For me, some interesting use cases and touch apone a number of topics I see surfacing now and again on Salesforce StackExchange and in my own job. However before having a crack at those I felt I first needed to crack open the JSON support in Apex in a generic way or at least with some minimal config.

Background and Credit

A fellow developer Agustina García had recently done some excellent research into Apex JSON support. She found that while JSON.serialise would in fact work with SObject’s containing related child records (queried using subqueries), by outputting the child records within the JSON (as nested JSON objects). Sadly however, the JSON.deserialize method ignores them. The solution was to utilise some wrapper Apex classes to contain the SObject records and provide the necessary relationship structure. This gave me the basis to develop something more generic that would work with any object.

Introducing SObjectDataLoader

Like the JSON class, the SObjectDataLoader class has two main methods, ‘serialize‘ and ‘deserialize‘. To serialize records all you need give it is a list of ID’s. It will automatically discover the fields to include and query the records itself. Seeking out (based on some default rules) related child and lookup relationship records to ‘follow’ as part of the recursive serialisation process. Outputting a serialised RecordsBundle. Due to how the serialisation process executes the RecordSetBundle’s are output in dependency order.

Consider the following objects and relationships with a mix of master-detail and lookup relationships as shown using the Schema Builder tool.

Here is a basic example (utilising auto configure mode) that will export records in the above structure, when given the Id of the root object, Object A.

String serialisedData =
  SObjectDataLoader.serialize(
      new Set<Id>; { 'a00d0000007kUms' });

The abbreviated resulting JSON looks like this (see full example here)…

{
    "RecordSetBundles": [
        {
            "Records": [
                {
                    "Id": "a00d0000007kUmsAAE",
                    "Name": "Object A Record"
                }
            ],
            "ObjectType": "ObjectA__c"
        },
        {
            "Records": [
                {
                    "Id": "a03d000000EHi6tAAD",
                    "Name": "Object D Record",
                }
            ],
            "ObjectType": "ObjectD__c"
        },
        {
            "Records": [
                {
                    "Id": "a01d0000006JdysAAC",
                    "Name": "Object B Record",
                    "ObjectA__c": "a00d0000007kUmsAAE",
                    "ObjectD__c": "a03d000000EHi6tAAD"
                }
            ],
            "ObjectType": "ObjectB__c"
        },
        {
            "Records": [
                {
                    "Id": "a04d00000035cFAAAY",
                    "Name": "Object E Record"
                }
            ],
            "ObjectType": "ObjectE__c"
        },
        {
            "Records": [
                {
                    "Id": "a02d000000723fvAAA",
                    "Name": "Object C Record",
                    "ObjectB__c": "a01d0000006JdysAAC",
                    "ObjectE__c": "a04d00000035cFAAAY"
                }
            ],
            "ObjectType": "ObjectC__c"
        }
    ]
}

The ‘deserialize‘ method takes a JSON string representing a RecordsBundle (output from the ‘serialize’ method). This will insert the records in the required order and take care of making new relationships (as you can see the original Id’s are retained in the JSON to aid this process) as each record set is processed.

Set<Id> recordAIds =
    SObjectDataLoader.deserialize(serialisedData);

The auto configuration route has its limitations and assumptions, so a manual configuration mode is available. This can be used if you know more about the objects. Though you can merge both configuration modes. The test methods in the class utlise some more advanced usage. For example…

String serializedData =
    SObjectDataLoader.serialize(createOpportunities(),
       new SObjectDataLoader.SerializeConfig().
        // Serialize any related OpportunityLineItem's
        followChild(OpportunityLineItem.OpportunityId).
          // Serialize any related PricebookEntry's
          follow(OpportunityLineItem.PricebookEntryId).
            // Serialize any related Products's
            follow(PricebookEntry.Product2Id).
            // Skip UnitPrice in favour of TotalPrice
            omit(OpportunityLineItem.UnitPrice));

The deserialize method in this case takes an implementation of SObjectDataLoader.IDeserializerCallback. This allows you to intercept the deserialization process and populate any references before the objects are inserted into the database. Useful if the exported data is incomplete.

Set<ID>; resultIds =
   SObjectDataLoader.deserialize(
      serializedData, new ApplyStandardPricebook());

Some interesting aspects of the implementation…

  • It discovers the SObjectType via the new method Id.getSObjectType.
  • It uses the SObject.clone method to remove the old Id (after making a note of it for later reference) from the deserialized SObject allowing it to be inserted.
  • The native JSON.serialize and JSON.deserialize are used only once to serialise and deserialize the internal RecordsBundle object.
  • The SerializeConfig object uses the Fluent API model.
  • JSONLint rocks!
  • Unit tests with over 95% code coverage are included in the same class, so the single class is ready to drop in your projects!

Summary

I hope this will be of use to people and would love to hear about more use cases and/or offers to help extend (see TODO’s below if you fancy helping) or fix any bugs in it to grow it further. Maybe one day Salesforce might even consider implementing into the Apex runtime! In the meantime, enjoy! GitHub repo link.

TODO’s

This is an initial release that I think will suite some use cases within the current tolerances of the implementation. As always, life and work commitments prevent me from spending as much time on this as I would really like. So here are a few TODO’s, until next time…

  • Support Recursive Lookup References, e.g. Custom Object A has a lookup to Custom Object A.
  • Scalability, Improve scalability via Batch Apex
  • More Extensibility and Callbacks, I add a deserialization Apex callback interface to resolve dynamically required references that have not been included in the serialised output. This can be extended further to customise additional fields or change existing ones.
  • Optimise queries and ‘fields’ usage, (for repeat references to the same object further down the follow chain).