Andy in the Cloud

From BBC Basic to Force.com and beyond…

Handling Office Files and Zip Files in Apex – Part 2

49 Comments

In part 1 of this blog I talked about using the Salesforce Metadata API, Static Resources the PageReference.getContent method to implement a native unzip. This blog completes the series by introducing two Visualforce Components (and subcomponents) that provide full zip and unzip capabilities. By wrapping the excellent JSZip library behind the scenes. I gave myself three main goals…

  • Easy of use for none JavaScript Developers
  • Abstraction of the storage / handling of file content on the server
  • Avoiding hitting the Viewstate Governor!

While this audience and future developers I hope will be the judge of the first, the rest was a matter of selecting JavaScript Remoting. As the primary means to exchange the files in the zip or to be zipped between the client and server.

Unzip Component

The examples provided in the Github repo utilise a Custom Object called Zip File and Attachments on that record to manage the files within. As mentioned above, this does not have to be the case, you could easily implement something that dynamically handles and/or generates everything if you wanted!

As Open Office files are zip files themselves, the following shows the result of using the Unzip Demo to unzip a xlsx file. The sheet1.xml Attachment is the actual Sheet in the file, in its uncompressed XML form. Thus unlocking access to the data within! From this point you can parse it and update it ready for zipping.

Screen Shot 2012-12-08 at 10.20.47

As a further example, the following screenshot shows the results of unzipping the GitHub repo zip download

Screen Shot 2012-12-08 at 10.10.12

The c:unzipefile component renders a HTML file upload control to capture the file. Once the user selects a file, the processing starts. It then uses the HTML5 file IO support (great blog here) to read the data and pass it to the JSZip library. This is all encapsulated in the component of course! The following shows the component in action on the Unzip Demo page.

<c:unzipfile name="somezipfile" oncomplete="unzipped(state);"
 onreceive=
 "{!$RemoteAction.UnzipDemoController.receiveZipFileEntry}" />

Screen Shot 2012-12-08 at 10.09.02

As mentioned above it uses JavaScript Remoting, however as I wanted to make the component extensible in how the file entries are handled. I have allowed the page to pass in the RemoteAction to call back on. Which should look like this..

@RemoteAction
 public static String receiveZipFileEntry(
    String filename, String path, String data, String state)

In addition to the obvious parameters, the ‘state’ parameter allows for some basic state management between calls. Since of course JavaScript Remoting is stateless. Basically what this method returns is what gets sent back in the ‘state’ parameter on subsequent calls. In the demo provided, this contains the Id of the ZipFile record used to store the incoming zip files as attachments.

The other key attribute on the component is ‘oncomplete’. This can be any fragment of JavaScript you choose (the ‘state’ variable is in scope automatically). In the demo provided it calls out to an Action Function to invoke a controller method to move things along UI flow wise, in this case redirect to the Zip File record created during the process.

Zip Component

You may have noticed on the above screenshots I have placed a ‘Zip’ custom button on the Zip File objects layout. This effectively invokes the Zip Demo page. The use cases here is to take all the attachments on the record, zip them up, and produce a Document record and finally redirect to that for download.

Screen Shot 2012-12-09 at 10.16.11

Screen Shot 2012-12-09 at 10.17.02

The c:zipfile component once again wraps the JSZip library and leverages JavaScript Remoting to request the data in turn for each zip file entry. The page communicates the zip file entries via the c:zipentry component. These can be output explicitly at page rendering time (complete with inline base64 data if you wish) or empty leaving the component to request them via JS Remoting.

<c:zipfile name="someZipfile" state="{!ZipFile__c.Id}" 
  oncomplete="receiveZip(data);"
  getzipfileentry=
     "{!$RemoteAction.ZipDemoController.getZipFileEntry}">
   <apex:repeat value="{!paths}" var="path">
     <c:zipentry path="{!path}" base64="true"/>
   </apex:repeat>
</c:zipfile>

This component generates a JavaScript method on the page based on the name of the component, e.g. someZipfileGenerate. This must be called  at somepoint by the page to start the zip process.

The action method in the controller needs to look like this.

@RemoteAction
 public static String getZipFileEntry(String path, String state)

Once again the ‘state’ parameter is used. Except in this case it is only providing what was given to the c:zipfile component initially, it cannot be changed. Instead the method returns the Base64 encoded data of the requested zip file entry.

Finally the ‘oncomplete’ attributes JavaScript is executed and the resulting data is passed back (via apex:ActionFunction) to the controller for storage (not the binding in this case is transient, always good to avoid Viewstate issues when receiving large data) and redirects the user to the resulting Document page.

Summary

Neither components are currently giving realtime updates on the zip entries they are processing, so as per the messaging in the demo pages the user has to wait patiently for the next step to occur. Status / progress messages is something that can easily be implemented within the components at a future date.

These components utilise some additional components, I have not covered. Namely the c:zip and c:unzip components. If you have been following my expliots in the Apex Metdata API you may have noticed early versions of these in use in those examples, check out the Deploy and Retrieve demos in that repo.

I hope this short series on Zip file handling has been useful to some and once again want to give a big credit to the JSZip library. If you want to study the actual demo implementations more take a look at the links below. Thanks and enjoy!

Links

49 thoughts on “Handling Office Files and Zip Files in Apex – Part 2

  1. Pingback: Handling Office Files and Zip Files in Apex – Part 1 | andrewfawcett

  2. Hi Alex,
    I am getting below issue while upload the file via Unzip via Static Resource Tab:
    Issue is:
    System.CalloutException: IO Exception: Unauthorized endpoint, please check Setup->Security->Remote site settings. endpoint = https://c.na15.visual.force.com/services/Soap/m/25.0
    Error is in expression ‘{!upload}’ in component in page importstaticresource
    Class.MetadataService.MetadataPort.listMetadata: line 1498, column 1
    Class.ImportStaticResourceController.upload: line 46, column 1
    Could you plz help me for the above issue

    Thanks,
    Vinay Chada

  3. Thanks for your update..
    I have added this url: https://c.na15.visual.force.com/services/Soap/m/25.0 in Remote Site but still i am getting same issue.
    Plz help me on this..

    Thanks,
    Vinay Chada

  4. my error is
    System.VisualforceException: 404 status code return from request to http://na15.salesforce.com/resource/upload005i0000000gle0AAA/word/document.xml?inline=1
    Error is in expression ‘{!checkAsyncRequest}’ in component in page importstaticresource

    Class.ImportStaticResourceController.checkAsyncRequest: line 97, column 1

    • The sample code is design to work with docx files.

      Try changing this line in ImportStaticResourceController.cls

      PageReference contentTypes = new PageReference(‘/resource/’ + RESOURCE_NAME + ‘/word/document.xml’);

      To reference the file from the xlsx file you want to access (remember you can rename the file extension to .zip to see inside it)

      For example /xl/worksheets/sheet1.xml would be…

      PageReference contentTypes = new PageReference(‘/resource/’ + RESOURCE_NAME + ‘/xl/worksheets/sheet1.xml’);

      Hope this helps!

      • Also this post refers to the alternative way to access and create Open Office files, the above response applies to the approach described in my Part 1 post.

  5. Hi Alex,

    Thanks for your update..
    It’s working fine after using PageReference contentTypes = new PageReference(‘/resource/’ + RESOURCE_NAME + ‘/xl/worksheets/sheet1.xml’);

    Thanks lot for your help…

    After creating the .Xml file(using Unzip via Static ResourceOr Unzip Demo) how to insert the xml file data into custom object, plz suggest me for this..

    My target is insert the xlsx file data into Custom objects..

    Thanks,
    Vinay Chada

    • via Part 1 Approach

      If you are obtaining just the sheet data file (e.g. a specific part of the XLSX file) then you can continue to follow part 1’s approach.

      In ImportStaticResourceController find the following line, then either store it or use the XML features of Apex to parse it (see Apex Developers Guide)

      UnzippedFile = contentTypes.getContent().toString();

      via Part 2 Approach (more flexible)

      If you want to store all the files in the XLSX (for later processing), you need to perhaps be looking at the sample code from part 2.

      This already expands out all the files within the XLSX file as Attachments, though this can be changed, see this method

      UnzipDemoController.receiveZipFileEntry

  6. Hi,
    After generating the XL/sharedstring.xml file we are not ale to get the duplicate values, we need hole data(with duplicate data) in XML format.
    Plz help me on this, its bit urgent..
    Thanks for advice…

    • I am not fully aware of the specification of the Open Office format that Microsoft uses, you may have to research this more to understand where the data you need is stored. I would recommend using the approach shown in Part 2 if you need information from multiple parts of the XLSX file. The sample code already outputs all parts as attachments, once this expansion is done you can write Apex code to read the attachments as per your needs. Or as per my earlier reply override the UnzipDemoController.receiveZipFileEntry method to store in custom objects as you desire. Hope this helps!

  7. Hi Andy.
    Your article was great help.

    is there anyway we can take the count of all the child nodes belonging to a parent folder
    I am currently looping through it and finding the same as bellow

    var file = evt.target.files[0];
    var reader = new FileReader();
    var str2;
    var cnt=0;
    reader.onloadend = function( evt ) {
    var zip = new JSZip(evt.target.result.split(“,”)[1], {base64:true});

    var zipFileNames = [];

    for(var zipfileName in zip.files)
    cnt++;
    ……………………….

    is there a simpler way like zip.files.count() or something ?

    Thanks in advance,

    Sandeep

  8. Hi Andrew,
    Thanks for this solution. I have implemented the same.
    I am getting an error “Unrecognized base64 character: <" while uploading the file.
    It is creating the object record but not attaching the files.

    the error is coming on this line in class "attachment.Body = EncodingUtil.base64Decode(data);"

    I am not versed with base64 decoding or encoding. Could you throw some light on the same.
    Greatly appreciate your time.

  9. How do I implement the ZIP/UNZIP APEX files on Salesforce?

  10. I am trying to import large Excel workbooks and am getting the error “Input too long [1,125].” The numbers change depending on the size of the file. Is there a way to overcome this?

  11. How do I load an attached file? The browse button only looks in a directory, you don’t have the option to look at an attachment and the object hasn’t been created until the unzip process is complete.

    • You can query for the Attachment object, there are plenty of examples online

      • Is this the line to change?

        or should I change it in the UnzipDemoController.cls?

        There’s not a lot of info on the CUnzip component

        So I created a new object and attached a large xlsx file to it.

        I can’t grasp how I can tell the CUnzip component to read the attached file. Obviously I have to provide the URL to the file, I cannot find out how to do this.

      • You have to soql the attachment object and expose the contents of the body field to the JavaScript

  12. I’m using the unzipdemo to upload files

  13. Sorry, I meant is this the line to change: “”

  14. c:unzipfile name=”somezipfile” oncomplete=”unzipped(state)

  15. Hi Andrew, Thanks for the article. Does it work for IE8?

  16. This works great for a simple excel file however ones with images or more data fail. Is this correct? I am working on something that would require no user interaction and work completely in Salesforce. I don’t see a way to edit an excel file to remove columns to allow the excel file to complete. Any suggestions?

    • Unfortunately you need a ui process to unpack the file, there is a browser limit on file size to consider and so images in the file will for sure stress this. Your other option might be best to move the processing to Heroku app and call out to a Rest API in that app from Apex.

  17. Not able find the button in ur code but i see the choose file button in the screen and i cant find the onreceive parameter in the component .Can u help me here

  18. Not able to find the code for button in unzip function but i can see the button in the screen shot and missing onreceive parameter in the component

  19. can any explain below line
    TODO: Need to scope these global var’s by component name
    var zipData = ”;
    do we need modify above line???

    • No it’s just a variable declaration. The comment means that if you have multiple zip VF components the variable will be shared which would not be desirable. Though its unlikely you would have more than one on a page tbh

      • Hi Ande,

        when try unzipping using ur code it just loads the file and redirect to home page ,file is not storing in notes and attachment related list.
        I have implemented the same code don’t no why remote function is not called
        Do I need to do any modification in your code ???
        Thanks
        Kiran

  20. can any explain below line
    TODO: Need to scope these global var’s by component name
    var zipData = ”;
    do we need modify above line???

  21. Hello Andrew,

    I have a scenario where I import an Excel (.xlsx) file and need to send that file to another system and then the other system responds with some set of results. I need to store the results in the same excel which was uploaded in a different sheet.
    So is there a possibility to add an extra sheet to an already unzipped file?
    Or is it possible to merge to different Excel files and zip them as only 1 excel file?

    • Sure that is possible, study the xlsx spec on this, I believe it’s a new file for each sheet

      • Hello Andrew,

        When I try to read the sheet1.xml(unzipped) I am unable to read the blob/string of body so I am unable to store in objects or attach another sheet2, if you have specific example or code sample which I can look to understand how the code creates so many attachments of one single excel and how to read each attachment then it can be helpful for me.

      • If you look at part 2 of my blog article on zip and unzip you can see examples that do both. You can implement handlers to receive the files, maybe store in attachments, then create your new file and stream the attachments plus your new file back into a new zip.

  22. Hi I tried with the code provided, but in the parsed XML I cannot see the data of my xlsx file. It shows some random numbers in the data. Am I missing something?

    • Can you share some more details? Maybe a Gist with code snippets and what you are seeing?

      • XLSX row and column values are stored in the sharedString.xml (different file). This is done so that repeating values can be stored and referenced from one place. The main document refers to the index of the value in sharedString.xml. Hope this solves the issue.

      • Thanks for sharing

Leave a comment