Convert HTML to PDF in C# with PrizmDoc API

Project Goal:
Create a command-line program in C# that uploads an HTML file to the PrizmDoc cloud server, issues a conversion command, and downloads the converted PDF file upon completion.

Requirements:

C# development environment (Visual Studio used in this example)
PrizmDoc Cloud account (sign up for a free account)

Need to programmatically convert HTML into PDF files? Going from one format to the next can be troublesome, and maintaining the correct layout and styles in the final document is often a complex challenge.

PrizmDoc API handles file type conversions without any installation or servers to set up. Just use the API to upload a file and have it converted for you. For this demonstration, we’ll be creating a C# program whose only external dependency is the Newtonsoft.Json package (which can be acquired via NuGet).

Prerequisites

Before we get started, there are a few steps to take. First, register for your own Accusoft PrizmDoc Cloud services account. Your account will come with a free trial.

Once your account is set up, copy the API key. This will be used to authenticate your requests to the PrizmDoc server.

This demo requires the Newtonsoft.Json package. If you’re using Visual Studio, just add this package to your project via the NuGet Package Manager.

The HTML To PDF Conversion Process

Converting a file using PrizmDoc servers is a four-step process:

1. Upload: Upload the file to the server. We’re not making this a permanent part of the file portal; it only sits there temporarily while PrizmDoc does the work. When the file is uploaded, PrizmDoc returns a JSON response with an internal file ID and an affinity token. We need both for the next step, which is conversion. See the PrizmDoc documentation on the Work Files API for more details ( http://help.accusoft.com/SAAS/pcc-for-acs/work-files.html#get-data ).

2. Convert: Issue the conversion command on the uploaded file. Using the file ID returned from the Upload step, our software tells PrizmDoc to convert that file to one of the supported file formats listed on the PrizmDoc documentation page. In this demo, we’ll just be converting an HTML file to PDF, but this program can be modified for a number of other file formats. Once a conversion process has started, the Content Conversion API provides a processId that can be used to check the status of the conversion. The Content Conversion API documentation provides more details on how the process works ( http://help.accusoft.com/SAAS/pcc-for-acs/webframe.html#content-conversion-service.html ).

3. Status: We don’t want to try to download a file that doesn’t exist yet, so we check on the conversion until it is complete. Given a processId, the Content Conversion API reports the current status of the conversion. Once the conversion is complete, it also provides a new fileId for the converted file.

4. Download: Using the same Work File API with our new fileId and affinity token, we can now download the file.
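To make the data flow between the four steps concrete, here is a minimal sketch of how the IDs are handed from one step to the next. The class name ConversionPipeline and the four delegates are hypothetical and are not part of the sample project; each delegate stands in for one of the API calls described above.

```csharp
using System;
using System.Threading.Tasks;

public static class ConversionPipeline
{
    // Chains the four steps. Each delegate takes an ID first and the affinity token second.
    public static async Task RunAsync(
        Func<Task<(string FileId, string AffinityToken)>> upload,
        Func<string, string, Task<string>> startConversion,     // (fileId, token) -> processId
        Func<string, string, Task<string>> waitForOutputFileId, // (processId, token) -> new fileId
        Func<string, string, Task> download)                    // (new fileId, token)
    {
        // Step 1: the upload yields the fileId and the affinity token.
        var (fileId, affinityToken) = await upload();

        // Step 2: the conversion request yields a processId.
        string processId = await startConversion(fileId, affinityToken);

        // Step 3: polling the status eventually yields the converted file's fileId.
        string outputFileId = await waitForOutputFileId(processId, affinityToken);

        // Step 4: the converted file is downloaded with the same affinity token.
        await download(outputFileId, affinityToken);
    }
}
```

Note that the affinity token from the upload is reused in every later step, while the IDs change: fileId, then processId, then the new fileId.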

And with that, let’s get coding!

HTML to PDF C# Code Sample

Here’s the C# code you’ll need to convert HTML to PDF using PrizmDoc, available as a downloadable zip file of the sample project. You can jump right into the code and run with it, or keep scrolling for a detailed walkthrough of the code and how it works.

HTML to PDF C# Code Analysis

Step 0: Setting Up Our Class

In the sample project, there are a few housekeeping items to work through. The first is setting up the use of our API key. If you’ve downloaded our sample project, then edit the file config.json and insert your PrizmDoc API key like this:

    { "apiKey": "jgu4uu58gnvkadfufnhkdahuifasdiasdfhireu9rnj" }

In our sample project, the actual work is being driven by our PrizmDocConvert class. The first thing we do when generating the class is set the API key:

    private string _apiKey;

    // Start our class and make sure that the API key exists.
    public PrizmDocConvert(string apiKey)
    {
        if (String.IsNullOrEmpty(apiKey))
        {
            throw new ArgumentException("Missing parameter apiKey", "apiKey");
        }
        _apiKey = apiKey;
    }

In our sample program, the API Key is read from our config.json file:

    static void LoadConfig()
    {
        using (var reader = new StreamReader("config.json"))
        {
            // Parse the whole file and read the "apiKey" property.
            dynamic config = JObject.Parse(reader.ReadToEnd());
            ApiKey = config.apiKey;
        }
    }

Step 1: Upload

The first part of our process is to Upload our file to the PrizmDoc server. Our member function really just needs one thing – the name of the file to upload, and it will return a JSON.Net JObject with the results:

public async Task<JObject> Upload(string fileToUpload)

The Work File API uses the POST command to upload the file, and requires the following headers:

Acs-api-key: Our API key. (If using OAuth, follow the Work File API instructions instead: http://help.accusoft.com/SAAS/pcc-for-acs/work-files.html#get-data ).
Content-Type: This should be set to application/octet-stream since we’re uploading bytes.

The method uses the WebClient object to perform our uploading. So we set up the Work File API URI, and set our headers:

    var endpoint = new Uri("https://api.accusoft.com/PCCIS/V1/WorkFile");
    using (var client = new WebClient())
    {
        client.Headers.Add("acs-api-key", _apiKey);
        client.Headers.Add("Content-Type", "application/octet-stream");

Then, read our file into a byte array, and upload it to the PrizmDoc server using a POST command. The Work File API will return a JSON response, which we can then parse for the next steps. In this case, we’ll upload an HTML file:

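        // Assumption: "input" is a FileInfo wrapping the fileToUpload parameter.
        var input = new FileInfo(fileToUpload);
        using (var reader = new BinaryReader(input.OpenRead()))
        {
            // Read the whole file into memory and POST it as raw bytes.
            var data = reader.ReadBytes((int)reader.BaseStream.Length);
            var results = await client.UploadDataTaskAsync(endpoint, "POST", data);
            // The response body is JSON text; parse it for the caller.
            string getResult = Encoding.ASCII.GetString(results);
            return JObject.Parse(getResult);
        }
    }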

Step 2: Convert

Our uploaded file is now sitting on the PrizmDoc server, but we want more than that: we want to convert it. In the last step, the Work File API returned a JSON response to us. Let’s peek inside:

    {
        "fileId": "Xe6zv3dH0kVSzLuaNhd32A",
        "fileExtension": "html",
        "affinityToken": "ejN9/kXEYOuken4Pb9ic9hqJK45XIad9LQNgCgQ+BkM="
    }

If you’re familiar with JSON, then this should be simple enough. In our test application, we can snag the two fields we care about (the fileId and the affinityToken) with our JSON.Net objects. Our sample code uses another function to call our Upload method, but the final result is a JSON object with the Work File API results:

    JObject uploadResults = RunUpload(input).Result;
    string fileID = (string)uploadResults.SelectToken("fileId");
    string affinityToken = (string)uploadResults.SelectToken("affinityToken");

Now it’s time for the Content Conversion API to come into play. This API call requires a JSON file be posted with the details of what file is to be converted.

In this case, we’ll use our new method Convert to utilize that API with three parameters: the affinityToken, the fileID, and the format to convert to (which defaults to pdf):

    public async Task<JObject> Convert(string affinityToken, string fileID, string format = "pdf")

We already covered how to add the headers to the WebClient object, so we won’t belabor the point. These are the required headers:

Content-Type: The type of the request body, in this case "application/json"
Accusoft-Affinity-Token: The Affinity Token we received from the Work File Upload
Acs-api-key: The API key

And our new URI will be "https://api.accusoft.com/v2/contentConverters".

Our Convert request is in the form of a JSON file. We could submit requests for multiple files to be converted, but this demo will just focus on the one. Here’s the format of the JSON request:

    {
        "input": {
            "sources": [
                { "fileId": "{ fileId from Work API}" }
            ],
            "dest": {
                "format": "pdf"
            }
        }
    }

Our method generates the JSON file this way:

    JObject tester = new JObject(
        new JProperty("input",
            new JObject(
                new JProperty("sources",
                    new JArray(
                        new JObject(
                            new JProperty("fileId", fileID)
                        )
                    )
                ),
                new JProperty("dest",
                    new JObject(
                        new JProperty("format", format)
                    )
                )
            )
        )
    );

There’s a lot of whitespace here, but it helps make clear what nodes of the JSON file belong to which parts.
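As an alternative sketch, the same request body can be built from an anonymous object with Json.NET's JObject.FromObject, which some readers may find easier to scan than nested JProperty calls. The class and method names below (ConversionRequest.Build) are hypothetical and not part of the sample project.

```csharp
using Newtonsoft.Json.Linq;

public static class ConversionRequest
{
    // Builds the Content Conversion request body for a single source file.
    // The anonymous property names become the JSON field names as written.
    public static JObject Build(string fileId, string format = "pdf")
    {
        return JObject.FromObject(new
        {
            input = new
            {
                sources = new[] { new { fileId } },
                dest = new { format }
            }
        });
    }
}
```

Calling ConversionRequest.Build(fileID).ToString() should give a string with the same structure as the request shown above, ready to POST.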

All that’s left is to POST the JSON string:

    string results = await client.UploadStringTaskAsync(endpoint, "POST", tester.ToString());
    return JObject.Parse(results);

The conversion process is started! But until it’s complete, we can’t download our completed file. So we need to request the Status until the conversion is complete.

For that, we need to know the processId. The Content Conversion API returns a JSON response, and the most important field in it is the processId:

    {
        "input": {
            "sources": [
                { "fileId": "{the submitted file ID}", "pages": "" }
            ],
            "dest": {
                "format": "pdf",
                "pdfOptions": { "forceOneFilePerPage": false }
            }
        },
        "expirationDateTime": "2015-12-17T20:38:39.796Z",
        "processId": "{the returned process ID}",
        "state": "processing",
        "percentComplete": 0
    }

Step 3: Status

The Content Conversion API allows us to track the status of a conversion using the ProcessId returned from our Convert request. The URL is the same, with one major difference:

https://api.accusoft.com/v2/contentConverters/{processId}

This is very important: this is a GET command, but don’t try to use the WebClient QueryString.Add member to add the processId. Your URI must not look like this:

https://api.accusoft.com/v2/contentConverters?processId={yourProcessId}

It must look like this:

https://api.accusoft.com/v2/contentConverters/{processId}

For example, a processId of ahidasf7894 would give us a URI of:

https://api.accusoft.com/v2/contentConverters/ahidasf7894
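A small helper can make it hard to build the wrong form of this URI. The sketch below appends the processId as a path segment and escapes it first; the class and method names (ContentConverterUris.ForProcess) are hypothetical, and the escaping is a defensive assumption rather than something the API documentation is known to require.

```csharp
using System;

public static class ContentConverterUris
{
    private const string BaseUrl = "https://api.accusoft.com/v2/contentConverters";

    // Returns the status URI for a conversion: {BaseUrl}/{processId}.
    public static Uri ForProcess(string processId)
    {
        if (string.IsNullOrWhiteSpace(processId))
        {
            throw new ArgumentException("Missing parameter processId", nameof(processId));
        }
        // Escape the ID so characters such as '/' or '?' cannot change the path or add a query.
        return new Uri(BaseUrl + "/" + Uri.EscapeDataString(processId));
    }
}
```

For a plain alphanumeric ID such as ahidasf7894, the escaping leaves the value unchanged and the result matches the URI shown above.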

With that, let’s look at our ConvertStatus method, which will query the PrizmDoc server via the Content Conversion API. This API will return another JSON file for us to parse:

    public async Task<JObject> ConvertStatus(string processId, string affinityToken)
    {
        // Endpoint form: /v2/contentConverters/{processId}
        string endpoint = "https://api.accusoft.com/v2/contentConverters/" + processId;

Our headers in this case are similar to before:

Accusoft-Affinity-Token
acs-api-key

That’s it – since this is just a GET, and the Content Conversion API is getting the processId from the URI we request from, we just need to submit the request and return the JSON file:

        string results = await client.DownloadStringTaskAsync(endpoint);
        return JObject.Parse(results);

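While the conversion is still running, the returned JSON looks like the response to our Convert request shown earlier, with the "state" field set to "processing".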

What we’re looking for is for the “state” field to return “complete”. The sample code uses a 30 second wait timer after each Convert Status request. Once it sees the conversion is complete, it snags the new fileId that will be used to download the file:

    while (!(convertStatus.Equals("complete")))
    {
        // Wait 30 seconds between status requests.
        System.Threading.Thread.Sleep(30000);
        convertStatusresults = RunConvertStatus(processId, affinityToken).Result;
        convertStatus = (string)convertStatusresults.SelectToken("state");
    }
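The loop above waits indefinitely if the state never becomes "complete". As a hedged sketch of one way to bound the wait, the helper below polls through a caller-supplied delegate, gives up after a maximum number of attempts, and uses Task.Delay instead of blocking the thread. The names ConversionPoller and WaitForCompletionAsync are hypothetical, and the "error" state is an assumption: the responses shown in this article only include "processing" and "complete".

```csharp
using System;
using System.Threading.Tasks;

public static class ConversionPoller
{
    // Calls getState until it returns "complete".
    // Throws if the state is "error" (assumed failure value) or if maxAttempts is exhausted.
    public static async Task WaitForCompletionAsync(
        Func<Task<string>> getState, TimeSpan delay, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string state = await getState();
            if (state == "complete")
            {
                return;
            }
            if (state == "error")
            {
                throw new InvalidOperationException("Conversion reported an error state.");
            }
            // Wait before the next request, except after the final attempt.
            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }
        throw new TimeoutException("Conversion did not complete in time.");
    }
}
```

In the sample program, the delegate would wrap a call to ConvertStatus and return the "state" token, with a delay of TimeSpan.FromSeconds(30) to match the existing wait timer.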

And here’s the JSON file that returns the result we need:

    {
        "input": {
            "sources": [
                { "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "" }
            ],
            "dest": {
                "format": "pdf",
                "pdfOptions": { "forceOneFilePerPage": false }
            }
        },
        "expirationDateTime": "2015-12-17T20:38:39.796Z",
        "processId": "ElkNzWtrUJp4rXI5YnLUgw",
        "state": "complete",
        "percentComplete": 100,
        "output": {
            "results": [
                {
                    "fileId": "KOrSwaqsguevJ97BdmUbXi",
                    "sources": [
                        { "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "1-3" }
                    ]
                }
            ]
        }
    }

Perfect. All that’s left is to extract the fileId:

    string newFileID = (string)convertStatusresults.SelectToken("output.results[0].fileId");

Note that a conversion request can generate multiple files, for example when converting a multipage document into a series of image files. For more details, see the Content Conversion API documentation.
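When a conversion does produce several files, reading only results[0] would miss the rest. The following sketch collects every output fileId using Json.NET's SelectTokens with a wildcard path. The class and method names (ConversionResults.GetOutputFileIds) are hypothetical, and the sketch assumes each entry in output.results has the same shape as the single entry shown above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class ConversionResults
{
    // Returns the fileId of every entry in output.results.
    // The list is empty when the response has no output section yet.
    public static List<string> GetOutputFileIds(JObject status)
    {
        return status.SelectTokens("output.results[*].fileId")
                     .Select(token => (string)token)
                     .ToList();
    }
}
```

Each returned ID could then be passed to the download step in turn.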

All that’s left now is to download our converted file with the Work File API.

Step 4: Download

We’re back where we started – the Work File API. This time, instead of uploading a file, we’re going to download the processed file. The URI is the same as before, with one change – the fileId of the file we’re downloading:

https://api.accusoft.com/PCCIS/V1/WorkFile/{fileId}

Our DownloadWorkfile method is declared as follows. The only new parameter is outfile, the name of the file we save the converted document to:

public async Task DownloadWorkfile(string affinityToken, string workFile, string outfile)

We use the same headers as in our Convert Status request:

Accusoft-Affinity-Token
acs-api-key

Since this is a GET command, we just submit the request and pipe the results out to our file:

    // Now, pipe those results back out to a file.
    FileInfo output = new FileInfo(outfile);
    using (var writeStream = output.Create())
    {
        var results = await client.DownloadDataTaskAsync(endpoint);
        await writeStream.WriteAsync(results, 0, results.Length);
    }
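WebClient is marked obsolete in .NET 6 and later, so readers on newer runtimes may prefer HttpClient. The sketch below shows one possible equivalent of the download step, not the sample project's code: the names WorkFileRequests, BuildDownload, and DownloadAsync are hypothetical. It streams the response to disk rather than buffering the whole file in memory.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class WorkFileRequests
{
    // Builds (but does not send) the GET request for a converted work file.
    public static HttpRequestMessage BuildDownload(string apiKey, string affinityToken, string fileId)
    {
        var uri = new Uri("https://api.accusoft.com/PCCIS/V1/WorkFile/" + Uri.EscapeDataString(fileId));
        var request = new HttpRequestMessage(HttpMethod.Get, uri);
        // Same two headers as the WebClient version of this step.
        request.Headers.TryAddWithoutValidation("acs-api-key", apiKey);
        request.Headers.TryAddWithoutValidation("Accusoft-Affinity-Token", affinityToken);
        return request;
    }

    // Sends the request and copies the response body to outfile.
    public static async Task DownloadAsync(HttpClient http, HttpRequestMessage request, string outfile)
    {
        using (var response = await http.SendAsync(request))
        {
            // Fail loudly on a non-2xx status instead of saving an error body as a PDF.
            response.EnsureSuccessStatusCode();
            using (var file = File.Create(outfile))
            {
                await response.Content.CopyToAsync(file);
            }
        }
    }
}
```

A single HttpClient instance can be shared across the upload, convert, status, and download calls.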

And with that, our program is complete! Here are the results. In this case, the method calls are set to be verbose to track progression, but feel free to edit the program to your liking:

PrizmDocConvertHTMLtoPDF.exe sample.html sample.html.pdf
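The command line above passes the input file and the output file as two arguments. As a rough sketch of how a Main method might validate them, the helper below accepts one or two arguments. The CommandLine class is hypothetical, and defaulting the output name to the input name plus ".pdf" is an assumption modeled on the sample.html.pdf name used here, not documented behavior of the sample program.

```csharp
using System;

public static class CommandLine
{
    // Returns the input and output paths; output defaults to "<input>.pdf" when omitted.
    public static (string Input, string Output) Parse(string[] args)
    {
        if (args == null || args.Length < 1 || args.Length > 2)
        {
            throw new ArgumentException(
                "Usage: PrizmDocConvertHTMLtoPDF.exe <input.html> [output.pdf]");
        }
        string input = args[0];
        string output = args.Length == 2 ? args[1] : input + ".pdf";
        return (input, output);
    }
}
```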

The converted document will retain images, lists, links, and other formatting and elements.

This program sample is just converting a simple HTML file – but if we were converting something like a Microsoft Word document, the results would be the same – a perfectly created PDF file. In fact, PrizmDoc uses Microsoft Word to process the Word Document conversion to have the highest possible fidelity.
