How to Convert HTML to PDF in Java using PrizmDoc API

Accusoft’s PrizmDoc is a powerful toolkit for displaying, converting, annotating, and processing documents. This example demonstrates how to use the PrizmDoc Cloud Server to upload an HTML document and convert it to PDF format, using the Java programming language.

The HTML To PDF Conversion Process

Converting a file with PrizmDoc Server takes four distinct phases:

Upload: We start by uploading the file to the PrizmDoc server (it’s only stored temporarily while the conversion takes place). PrizmDoc returns a JSON response with the internal file ID and an affinity token, which we’ll need for the next step.
Convert: Next, we ask PrizmDoc to convert the uploaded file by using the file ID returned from the previous step. In this demo, we’ll just be converting an HTML file to PDF, but PrizmDoc can also handle 50+ other file formats. PrizmDoc returns a processId, which we’ll need for the next step.
Status: We need to wait until the conversion process is complete before trying to download the file. The API reports the current status of the conversion, so we can download the file once it is ready.
Download: Once the conversion is complete, we can download the converted file.

Our Java program reflects each of these steps in a class called PrizmDocConverter, which talks to the PrizmDoc Cloud Server in each of the phases.

Prerequisites

For our code to work, we’ll need a few things:

A PrizmDoc API key. This code example uses the PrizmDoc Cloud Server, so sign up for a free evaluation account. The API key is provided after login.
Java Development Kit. This code was written for Java SE Development Kit 8, available through Oracle.
JSON reader. PrizmDoc Server uses JSON ( https://www.json.org/ ) to transfer data between the client and server. For the demo, we’ll be using the JSON Simple package. It can be downloaded and compiled from its Git project ( https://github.com/fangyidong/json-simple ) or, if you’re using Maven, added as a Maven dependency ( https://mvnrepository.com/artifact/com.googlecode.json-simple/json-simple ).

Take the API key, and place it into a file titled “config.json” at the root of your project:

{ "apiKey": "{{ valid key here }}" }

Replace “{{ valid key here }}” with your API key, so if your API key is “ur7482378fjdau8fr784fjklasdf84”, it will look like this:

{ "apiKey": "ur7482378fjdau8fr784fjklasdf84" }

Now we can code!

HTML To PDF Java Sample Code

The code files used in this example are available for download. You can take these files and run with them, or keep reading for a detailed walk-through of the code.

PrizmDocConverter Class

The PrizmDocConverter class has four methods, one for each of the four phases of a PrizmDoc file conversion. Before we can call any of them, we need to create an instance of the class.

Step 0: Loading the PrizmDocConverter Class

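The first thing to do is to create our PrizmDocConverter object, which needs the API key. Using the JSON Simple package, we can read the apiKey field from our config.json file:

import java.io.FileReader;
import java.io.IOException;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public static String getAPIKey() {
    String key = null;
    // try-with-resources closes config.json even if parsing fails
    try (FileReader reader = new FileReader("config.json")) {
        JSONParser parser = new JSONParser();
        JSONObject readConfig = (JSONObject) parser.parse(reader);
        // now get the API key
        key = (String) readConfig.get("apiKey");
    } catch (IOException | ParseException e) {
        // key stays null if the file is missing, unreadable, or not valid JSON
        e.printStackTrace();
    }
    return key;
}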

We can now use our getAPIKey method to fire up the PrizmDocConverter class with the following call:

PrizmDocConverter newDocUpload = new PrizmDocConverter(getAPIKey());

With the class started, we can start the next phase: Upload.

Step 1: File Upload

The PrizmDocConverter class has a method called WorkFileUpload, which takes the name of the file we’re uploading and sends it to the PrizmDoc Work Files API:

public String WorkFileUpload(String uploadFile)

The Work Files API takes in an uploaded file, and temporarily stores it for processing, so it is not available to other users through the Portal interface. The URL for the PrizmDoc Cloud Server for uploading Work Files is:

https://api.accusoft.com/PCCIS/V1/WorkFile

The Work File API uses the POST command to upload the file, and requires the following headers:

Acs-api-key: Our API key. (If you are using OAuth instead, follow the Work File API instructions: http://help.accusoft.com/SAAS/pcc-for-acs/work-files.html#get-data ).
Content-Type: This should be set to application/octet-stream since we’re uploading bytes.

Start by setting up the URL and the HTTP headers. This example will use the HttpsURLConnection object to make the secure connections:

try {
    URL workFileURL = new URL("https://api.accusoft.com/PCCIS/V1/WorkFile");
    HttpsURLConnection workFileConnection = (HttpsURLConnection) workFileURL.openConnection();
    workFileConnection.setRequestMethod("POST");
    workFileConnection.setRequestProperty("acs-api-key", getAPIKey());
    workFileConnection.setRequestProperty("Content-Type", "application/octet-stream");
    // the upload code from the next snippet continues here, inside the same try block
} catch (IOException e) {
    e.printStackTrace();
}

To upload the file, we have Java open the file as a FileInputStream, then write its bytes to the output stream of our HttpsURLConnection:

// Send post request
workFileConnection.setDoOutput(true);
DataOutputStream wr = new DataOutputStream(workFileConnection.getOutputStream());
// open our file, read it into our stream and upload it
// (uploadFile is the file name passed to WorkFileUpload)
FileInputStream inputStream = new FileInputStream(uploadFile);
int nextByte;
while ((nextByte = inputStream.read()) != -1) {
    wr.write(nextByte);
}
wr.flush();
inputStream.close();
wr.close();
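The upload loop above moves one byte per call, which gets slow for larger files. As a minimal sketch (the class and method names here are ours, not part of the sample project), the same copy can be done in chunks through a buffer:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class StreamCopySketch {

    /** Copies everything from in to out in 8 KB chunks and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // read() returns -1 once the end of the stream is reached
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

In the upload method, this could stand in for the while loop, for example StreamCopySketch.copy(inputStream, wr).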

When the file is uploaded, the Work Files API returns a JSON file with the details of the uploaded file:

{
    "fileId": { work file id },
    "fileExtension": { extension of uploaded file },
    "affinityToken": "{ value used to identify which machine in the cluster this work file resides on. }"
}

To process this file after it’s uploaded, we need the fileId and the affinityToken for the next phase: Convert.
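The walk-through does not show how the response body is read off the connection before it is parsed. One way to do it, sketched here with a hypothetical helper and assuming the server sends UTF-8 text, is to collect the bytes of the input stream into a String:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class ResponseBodySketch {

    /** Reads the whole stream and decodes it as UTF-8 (the encoding is an assumption). */
    static String readBody(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            collected.write(buffer, 0, n);
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The resulting string, for example from readBody(workFileConnection.getInputStream()), can then be handed to the JSON Simple parser to pull out fileId and affinityToken.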

Step 2: Convert

With the file on the PrizmDoc server, we can now issue a convert request through the Content Conversion API with our ContentConvert method:

public String ContentConvert(String affinityToken, String fileId, String format)

We use the affinityToken and fileId we received in the previous Upload step. The format must be one of the output formats supported by the Content Conversion API ( http://help.accusoft.com/PrizmDoc/v13.1/HTML/webframe.html#supported-file-formats.html ). In this case, we’ll use “pdf” as our conversion target.

The request body is a JSON document with the fileId and the format we want to convert it to:

{
    "input": {
        "sources": [
            { "fileId": "{ fileId from Work API}" }
        ],
        "dest": { "format": "pdf" }
    }
}

Note that it’s possible to request that multiple files be converted, so if we were to upload a series of files, we could request conversions of all of them at once. With the fileId and affinityToken extracted, we can start our request with the following URL for the Content Conversion API:

https://api.accusoft.com/v2/contentConverters

The following headers are required:

Content-Type: The type of file, in this case “application/json”
Accusoft-Affinity-Token: The Affinity Token we received from the Work File Upload
Acs-api-key: The API key
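The header lists for the upload and conversion requests overlap, so the connection setup can be shared. The helper below is only a sketch: the class name, method name, and parameter order are our own, and only the header names and values come from the text above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class RequestSketch {

    /**
     * Opens a connection and sets the headers described in this article.
     * affinityToken and contentType may be null when a request does not need them.
     * Nothing is sent until the caller reads from or writes to the connection.
     */
    static HttpURLConnection open(String url, String method, String apiKey,
                                  String affinityToken, String contentType) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod(method);
        connection.setRequestProperty("acs-api-key", apiKey);
        if (affinityToken != null) {
            connection.setRequestProperty("Accusoft-Affinity-Token", affinityToken);
        }
        if (contentType != null) {
            connection.setRequestProperty("Content-Type", contentType);
        }
        return connection;
    }
}
```

For the upload step this would be called with "POST", no affinity token, and "application/octet-stream"; for the conversion request, with "POST", the affinity token from the upload, and "application/json".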

To submit our request, we’ll construct our JSON document from the fileId and the target format like this:

// create the JSON object
JSONObject fileIds = new JSONObject();
fileIds.put("fileId", fileId);
// the formats object
JSONObject formats = new JSONObject();
formats.put("format", format);
// if we had more than one file we could put them into the array of fileIds
JSONArray fileIdArray = new JSONArray();
fileIdArray.add(fileIds);
// the source object
JSONObject sources = new JSONObject();
sources.put("sources", fileIdArray);
sources.put("dest", formats);
// now the input object
JSONObject input = new JSONObject();
input.put("input", sources);
String uploadRequestJSON = input.toJSONString();
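To make the shape of the request concrete for more than one file, here is a dependency-free sketch that builds the same body for a list of file IDs. It is an illustration only: the class is hypothetical, the escaping covers just quotes and backslashes (so IDs with control characters are assumed not to occur), and how the server treats several sources in one request is not covered here.

```java
import java.util.List;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class ConversionRequestSketch {

    /** Wraps a value in double quotes, escaping backslashes and quotes only. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds {"input":{"sources":[{"fileId":...},...],"dest":{"format":...}}}. */
    static String build(List<String> fileIds, String format) {
        StringBuilder sources = new StringBuilder();
        for (String fileId : fileIds) {
            if (sources.length() > 0) {
                sources.append(',');
            }
            sources.append("{\"fileId\":").append(quote(fileId)).append('}');
        }
        return "{\"input\":{\"sources\":[" + sources
                + "],\"dest\":{\"format\":" + quote(format) + "}}}";
    }
}
```

In the real program, the JSON Simple objects shown above are the safer choice, since the library handles all escaping.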

Now we can upload our JSON string for the conversion request:

// upload the JSON request and get the result
// Send post request
workFileConnection.setDoOutput(true);
DataOutputStream wr = new DataOutputStream(workFileConnection.getOutputStream());
wr.writeBytes(uploadRequestJSON);
wr.flush();
wr.close();

Reading the JSON response works the same way as in our WorkFileUpload method, and the response gives us what we need to download our converted file when it’s ready. Here’s the format of the JSON response to a Content Conversion request:

{
    "input": {
        "sources": [
            { "fileId": "{the submitted file ID}", "pages": "" }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": { "forceOneFilePerPage": false }
        }
    },
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "{the returned process ID}",
    "state": "processing",
    "percentComplete": 0
}

The most important fields are the processId and the state, which we’ll take up in the next stage, Status.

Step 3: Status

Until the conversion request is complete, we can’t download the PDF file. To check the status, we use the same Content Conversion API call, with one difference: we use the processId as part of the URL:

https://api.accusoft.com/v2/contentConverters/{processId}

This is very important: this is a GET request, but the processId is part of the URL path, not a query parameter. Your URI must not look like this:

https://api.accusoft.com/v2/contentConverters?processId={yourProcessId}

It must look like this:

https://api.accusoft.com/v2/contentConverters/{processId}

For example, a processId of ahidasf7894 would give us a URI of:

https://api.accusoft.com/v2/contentConverters/ahidasf7894
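A small helper can make the path-versus-query distinction hard to get wrong. This is a sketch with names of our own choosing. It appends the processId as a path segment and rejects values that contain URL syntax:

```java
// Hypothetical helper, not part of the PrizmDoc sample code.
final class StatusUrlSketch {

    static final String BASE = "https://api.accusoft.com/v2/contentConverters";

    /** Returns BASE + "/" + processId, refusing empty IDs and IDs containing '/' or '?'. */
    static String forProcess(String processId) {
        if (processId == null || processId.isEmpty()) {
            throw new IllegalArgumentException("processId is required");
        }
        if (processId.indexOf('/') >= 0 || processId.indexOf('?') >= 0) {
            throw new IllegalArgumentException("processId must be a plain path segment: " + processId);
        }
        return BASE + "/" + processId;
    }
}
```

Calling StatusUrlSketch.forProcess("ahidasf7894") yields the URI shown above.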

With that, let’s look at our ContentConvertStatus method, which queries the PrizmDoc server via the Content Conversion API. This API returns another JSON response for us to parse:

public String ContentConvertStatus(String affinityToken, String processId)

This request is GET, with the following headers:

Accusoft-Affinity-Token
acs-api-key

Once again we get a JSON file back. There are three possible states:

“processing” – The conversion is still in progress.
“complete” – The conversion is done, and we can download the converted file.
“error” – Something has gone wrong with the conversion.

While the conversion is still running, the “state” field reads “processing”; when it finishes, the field reads “complete”. Our code polls in a loop that checks every 10 seconds, and once “state” is “complete”, the JSON response contains a new fileId, the one we want to download. In the event of an error, our program prints the JSON response and exits:

while (!convertState.equals("complete")) {
    if (convertState.equals("error")) {
        System.out.println("There has been an error in conversion: ");
        System.out.println(response);
        System.exit(1); // non-zero exit code signals the failure to the caller
    }
    try {
        TimeUnit.SECONDS.sleep(10);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    response = newDocUpload.ContentConvertStatus(uploadAffinityToken, convertProcessId);
    convertState = getConvertState(response);
    System.out.println("ConvertState: \n");
    System.out.println(convertState);
}
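The loop above runs until the server reports something other than “processing”, which could be a long time if a conversion stalls. A possible variation, sketched here with hypothetical names, caps the number of status checks. The status call is passed in as a Supplier so the loop itself has no network code in it:

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class PollingSketch {

    /**
     * Asks stateSource for the state up to maxAttempts times, sleeping intervalMillis
     * between checks. Returns the first state that is not "processing" (for example
     * "complete" or "error"); throws IllegalStateException if the limit is reached.
     */
    static String waitForFinalState(Supplier<String> stateSource, long intervalMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String state = stateSource.get();
            if (!"processing".equals(state)) {
                return state;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // restore the interrupt flag so callers can see the interruption
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for conversion", e);
                }
            }
        }
        throw new IllegalStateException("Conversion still processing after " + maxAttempts + " checks");
    }
}
```

With the methods from this article, the supplier would be something like () -> getConvertState(newDocUpload.ContentConvertStatus(uploadAffinityToken, convertProcessId)), with an interval of 10000 milliseconds to match the 10-second wait.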

Here’s an example of the JSON file we’re looking for when the status is complete:

{
    "input": {
        "sources": [
            { "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "" }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": { "forceOneFilePerPage": false }
        }
    },
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "ElkNzWtrUJp4rXI5YnLUgw",
    "state": "complete",
    "percentComplete": 100,
    "output": {
        "results": [
            {
                "fileId": "{ the converted fileId }",
                "sources": [{ "fileId": "{ submitted fileId }", "pages": "1-3" }]
            }
        ]
    }
}

We want to extract the fileId from the “output” node. Now we can get to the last phase: downloading our file.
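The article’s getConvertState helper and the extraction of the output fileId are not shown. In the sample project they would naturally go through JSON Simple. Purely to illustrate which two values matter, here is a dependency-free sketch that pulls them out by pattern matching. It is deliberately naive: it assumes the converted fileId is the first “fileId” that appears after “output”, as in the sample response above, and it is not a general JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class StatusFieldsSketch {

    private static final Pattern STATE = Pattern.compile("\"state\"\\s*:\\s*\"([^\"]*)\"");
    private static final Pattern FILE_ID = Pattern.compile("\"fileId\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the value of the first "state" field, or null if there is none. */
    static String state(String json) {
        Matcher m = STATE.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the first "fileId" value found after the "output" key, or null. */
    static String outputFileId(String json) {
        int outputStart = json.indexOf("\"output\"");
        if (outputStart < 0) {
            return null; // no output node yet, e.g. while still processing
        }
        Matcher m = FILE_ID.matcher(json);
        return m.find(outputStart) ? m.group(1) : null;
    }
}
```

Because JSON objects do not guarantee key order, a real implementation should walk output, then results, then fileId with a proper parser instead.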

Step 4: Download

We’ll use the Work Files API again, only now we’ll use our new fileId to make a GET request to the following URL:

https://api.accusoft.com/PCCIS/V1/WorkFile/{fileId}

The download is handled by the following method:

public String WorkFileDownload(String affinityToken, String fileId, String output)

The affinityToken comes from the initial WorkFileUpload call, and the fileId is the one we just received from ContentConvertStatus. The last parameter, output, is the name of the file to write. In this case it will be a PDF file, so make sure the file name has the right extension. Downloading a file is nearly the inverse of our original WorkFileUpload request:

// try-with-resources closes both the channel and the file, even on failure
try (ReadableByteChannel rbc = Channels.newChannel(workFileConnection.getInputStream());
     FileOutputStream convertedFile = new FileOutputStream(output)) {
    convertedFile.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
}
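The same download step can also be written with java.nio.file.Files, which is part of the JDK. This is an alternative sketch, not the sample project’s code, and the class and method names are ours:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the PrizmDoc sample code.
final class DownloadSketch {

    /** Writes the whole stream to target, replacing any existing file; returns bytes written. */
    static long save(InputStream body, Path target) throws IOException {
        // try-with-resources closes the response stream whether or not the copy succeeds
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

In the download method, this would be called as DownloadSketch.save(workFileConnection.getInputStream(), Paths.get(output)).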

And with that – the conversion process is complete.

The Conversion Example

In our sample program, the conversion is called from the command line with two arguments: the file to be uploaded, and the name of the file to write the converted output to. In this case, our PrizmDocConverter class is packaged in a .jar called “PrizmDocConverter.jar”, and we make sure the JSON Simple .jar is on the classpath so the program can read the JSON responses:

Replace json-simple-1.1.1.jar with your version. With this call, simple.html will be converted to simple.html.pdf. Want to try it out for yourself? Get your free API key, then download our sample project and test out the Accusoft PrizmDoc conversion system.
