Technical FAQs

Question

How do I remove XMP Data from my image using ImageGear .NET?

Answer

When removing XMP data in ImageGear, the simplest approach is to replace the XMP metadata tree with a new, empty root, like so:

// Enable the simplified metadata API.
ImGearSimplifiedMetadata.Initialize();
// Replace the existing XMP tree with an empty root, discarding all XMP data.
doc.Metadata.XMP = new ImGearXMPMetadataRoot();
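For context, a minimal end-to-end sketch might look like the following. It assumes doc is an ImGearPage that you load and save with ImGearFileFormats.LoadPage / SavePage; those calls and the saving format shown here are assumptions, so adapt them to your own pipeline and verify them against the ImageGear .NET documentation.

// Minimal sketch, not verified against a specific ImageGear .NET version.
using (FileStream inStream = new FileStream("input.jpg", FileMode.Open, FileAccess.Read))
using (FileStream outStream = new FileStream("output-no-xmp.jpg", FileMode.Create, FileAccess.Write))
{
    // Load the first page of the image (assumed entry point).
    ImGearPage doc = ImGearFileFormats.LoadPage(inStream, 0);

    // Enable the simplified metadata API, then replace the XMP tree with an empty root.
    ImGearSimplifiedMetadata.Initialize();
    doc.Metadata.XMP = new ImGearXMPMetadataRoot();

    // Save the page back out; the XMP packet is no longer present (saving format is an assumption).
    ImGearFileFormats.SavePage(doc, outStream, ImGearSavingFormats.JPG);
}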

Alternatively, you can traverse the metadata tree and remove each XMP node from it:

// Example code; not thoroughly tested.
// Recursively removes all XMP metadata nodes from the given metadata tree.
// Requires: using System.Collections.Generic;
private static void RemoveXmp(ImGearMetadataTree tree)
{
    List<ImGearMetadataNode> toRemove = new List<ImGearMetadataNode>();

    foreach (ImGearMetadataNode node in tree.Children)
    {
        // Recurse into child trees first.
        if (node is ImGearMetadataTree)
            RemoveXmp((ImGearMetadataTree)node);

        // Collect XMP nodes; removing them while iterating would invalidate the enumerator.
        if (node.Format != ImGearMetadataFormats.XMP)
            continue;

        toRemove.Add(node);
    }

    foreach (ImGearMetadataNode node in toRemove)
        tree.Children.Remove(node);
}
Question

What quality should my images be for processing form data and recognition using FormSuite?

Answer

In all cases, you want to have your images as clear and as clean as possible. For any particular procedure, please consider the following:

OCR and ICR: Capture images at a resolution of at least 300 DPI. Ideally, work in black and white so that the objects of interest on the image are better defined and easier to recognize. Remove as much noise from the image as possible. Just as if a human were reading it, you want the text on the image to be as legible as possible. For ICR, ensure that the characters are hand printed (no cursive text, etc.).

Barcode recognition: As with OCR and ICR, capture images at 300 DPI or higher; working with black-and-white content can also provide excellent results. Ensure that the bars in the barcodes are clearly defined on the image and are not malformed (for example, barcodes should have the proper start and stop sequences). Clear as much noise from the image as possible.

Forms matching and registration: As with the two items above, capture your documents at 300 DPI or higher. Ensure that the resolution is consistent between your form templates and incoming batch images. Form templates should only contain data that is common to every image being processed (e.g., form fields and the text that appears on the blank form itself). The template should not contain filled-in field information, as this will affect the forms matching process.
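If you want to sanity-check incoming images against these guidelines before running recognition, a quick pre-flight check is straightforward to write. The sketch below uses System.Drawing purely as an illustration; the 300 DPI threshold and the bitonal (1-bit) check mirror the recommendations above and are not FormSuite requirements or APIs.

using System;
using System.Drawing;
using System.Drawing.Imaging;

static class ImageQualityCheck
{
    // Flags images that fall below the recommended 300 DPI or are not black and white (1-bit).
    public static void Check(string path)
    {
        using (Bitmap image = new Bitmap(path))
        {
            if (image.HorizontalResolution < 300 || image.VerticalResolution < 300)
                Console.WriteLine($"{path}: below 300 DPI ({image.HorizontalResolution}x{image.VerticalResolution}); consider rescanning.");

            if (image.PixelFormat != PixelFormat.Format1bppIndexed)
                Console.WriteLine($"{path}: not 1-bit black and white; consider binarizing before recognition.");
        }
    }
}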


Form Workflow Automation

Forms have long been used to provide organizations with important information about their customers. For a financial services or insurance company, that information might be used to determine eligibility for a loan or set a policy rate. Legal teams and healthcare providers, on the other hand, often use them to quickly gather information that could be relevant to a client’s case or a patient’s care. By building form workflow automation into their applications, developers can provide these organizations with the tools they need to improve efficiency and provide better service to their customers.

A Better Way to Capture Data with Form Workflow Automation

At its core, a forms workflow is designed to capture data from completed forms and route that information to the appropriate destination. That endpoint will vary based on the application. In some cases the data could be used to auto-populate database entries; other systems may feed it into machine learning algorithms to identify trends or provide predictive insights. Before any of that can happen, however, an automated workflow with forms recognition capabilities needs to be in place to identify each form type and extract information from it using optical recognition.

The primary benefits of workflow automation are speed and accuracy. By building a forms workflow within their applications, developers can help their customers process submitted forms much more efficiently than they could by hand. Even if manual data entry wasn’t so prone to human error, it would still be a waste of valuable resources to have skilled employees performing such a repetitive, routine task. Automating this sort of work is often the first step in maximizing performance in other areas of an organization because it frees up resources that can be directed toward higher-value tasks.

Say Goodbye To Paper (Mostly)

Organizations have talked about going “paperless” for decades, but they frequently find it much more difficult to do so in practice. That’s largely because physical forms continue to be used across many industries. Converting these paper forms into digital format as quickly as possible is critically important. Without some way of incorporating them into an automated workflow, inefficiencies and manual errors will continue to creep back into business processes. 

A forms workflow needs to be able to handle scanned form images in addition to purely digital documents. Robust forms identification tools are essential here because they can match any submitted form to a library of predefined templates. Without identification capabilities, applications would need to be given specific information about every form. At best, submitted forms would need to be manually presorted before they could be scanned and uploaded for processing, rather than being converted into digital format all at once and identified automatically.

Recognition and Extraction

Once forms are scanned, uploaded, and identified, the data capture process can begin. While digital forms can easily send the information contained in their fields to the proper destination, a scanned form is just a static document image. Even if the form was filled out digitally and never existed as a paper document, the fields may not be responsive, or the entire form may be nothing more than a flattened PDF image. In these cases, the only way to reliably capture data is to implement some type of optical recognition.

Optical Character Recognition

For machine printed text, forms workflows can deploy Optical Character Recognition (OCR) to identify and extract information from an identified form. High-quality OCR engines can read multiple languages, allowing them to capture data from almost any source and send it to the next phase of an automated workflow. When extracting text, OCR tools can be set to carry out full-page extraction, which pulls text from the entire form, or zonal extraction, which focuses the data capture effort on a smaller, predetermined area. The latter approach is much more common with forms processing because it allows the application to set parameters on each zone to enhance performance. If the OCR engine is instructed to look for only numbers in one field and specific regular expressions in another, it will be able to identify and extract text faster and more accurately.
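As a rough illustration of zonal extraction with per-field constraints, the sketch below defines a few field zones and validates whatever text the engine returns against a regular expression. The RecognizeZone method is a hypothetical stand-in for your OCR engine's zonal call (with an engine like SmartZone you would typically configure the recognition region and character set on its reader object), and the zone coordinates and patterns are likewise made up for illustration.

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Text.RegularExpressions;

// A field zone: where to read on the form image and what the result is allowed to look like.
class FieldZone
{
    public string Name;
    public Rectangle Area;   // region of the form image, in pixels
    public Regex Expected;   // constraint applied to the recognized text
}

class ZonalExtraction
{
    // Hypothetical stand-in for an OCR engine call; replace with your engine's zonal API.
    static string RecognizeZone(Bitmap formImage, Rectangle area) =>
        throw new NotImplementedException();

    static void Extract(Bitmap formImage)
    {
        var zones = new List<FieldZone>
        {
            new FieldZone { Name = "ZipCode", Area = new Rectangle(1200, 400, 300, 60), Expected = new Regex(@"^\d{5}(-\d{4})?$") },
            new FieldZone { Name = "Phone",   Area = new Rectangle(1200, 500, 400, 60), Expected = new Regex(@"^\d{3}-\d{3}-\d{4}$") }
        };

        foreach (var zone in zones)
        {
            string text = RecognizeZone(formImage, zone.Area).Trim();

            // Constraining each zone to an expected pattern catches misreads early.
            if (zone.Expected.IsMatch(text))
                Console.WriteLine($"{zone.Name}: {text}");
            else
                Console.WriteLine($"{zone.Name}: '{text}' failed validation; route for manual review.");
        }
    }
}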

Intelligent Character Recognition

Of course, many physical forms submitted for processing will not be filled out with standardized digital fonts, but rather by hand using a pen or pencil. For these handwritten forms, Intelligent Character Recognition (ICR) will need to be deployed to read and extract field contents. Although identifying handwritten text is a much more challenging undertaking, the combination of a powerful ICR engine and good form design can greatly improve accuracy and processing times to keep information moving through automated workflows.

Optical Mark Recognition

Forms frequently use checkboxes or fillable bubbles to indicate important information. When scanned images are run through a forms workflow for processing, applications need to be able to quickly identify the presence of a mark and apply the conditional information associated with it. Today’s forms workflow tools utilize Optical Mark Recognition (OMR) to detect the presence or absence of marks automatically. They can also check the entire form to determine what information might be missing, such as essential fields or signatures.
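Conceptually, mark detection comes down to measuring how much of a known checkbox region is dark. The sketch below is a generic illustration of that idea only; the threshold and the assumption of a deskewed, binarized image with known checkbox coordinates are simplifications, and a production OMR engine such as the one in FormSuite handles alignment, dropout, and thresholds for you.

using System.Drawing;

static class MarkDetection
{
    // Returns true if the fraction of dark pixels in the checkbox region exceeds the threshold.
    // Illustrative only: assumes a deskewed, binarized image and a known checkbox location.
    public static bool IsMarked(Bitmap image, Rectangle checkbox, double threshold = 0.15)
    {
        int darkPixels = 0;
        for (int y = checkbox.Top; y < checkbox.Bottom; y++)
        {
            for (int x = checkbox.Left; x < checkbox.Right; x++)
            {
                // Treat sufficiently dark pixels as part of a mark.
                if (image.GetPixel(x, y).GetBrightness() < 0.5f)
                    darkPixels++;
            }
        }

        double darkFraction = (double)darkPixels / (checkbox.Width * checkbox.Height);
        return darkFraction > threshold;
    }
}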

Unlock Your Form Workflow Automation Potential with the FormSuite Collection

Building an automated workflow for forms processing requires a variety of software tools and specialized imaging expertise. It’s a challenging task that becomes even more difficult when developers are facing tight deadlines for other application features. With the right forms workflow SDKs, software teams can rapidly integrate the features needed to identify a variety of forms and capture vital data using full-page or zonal text recognition.

Accusoft's FormSuite Collection bundles our powerful forms toolkits into a single, easily deployed package. Whether you're using FormFix to identify and align forms, cleaning up scanned images for better recognition results with ScanFix Xpress, or deploying fast, accurate OCR and ICR with SmartZone, FormSuite provides all the SDK resources your team needs to unlock your application's workflow automation potential. Learn more about what's included with the FormSuite Collection by downloading our detailed fact sheet.

On July 12, 2022, Accusoft announced the latest update to PrizmDoc, its industry-leading document processing integration. The PrizmDoc 13.21 update improves existing features and adds key functionality related to format support, redaction capabilities, content conversion, and more, allowing developers to offer enhanced functionality within their applications. 

One of the main improvements in this release is to PrizmDoc’s Content Conversion Service (CCS). PrizmDoc now provides the ability to convert PDF documents to MS Word (DOCX) documents, making shared collaboration easier than ever before.

Other features and updates in this release include: 

  • High-Efficiency Image File Format (HEIF, HEIC) support for viewing, redaction, and conversion to JPG/JPEG, PDF, PNG, SVG and TIFF. 
  • PrizmDoc Viewer Markup Burner API now provides the ability to burn in redaction reason text for transparent (draft mode) redactions and provides the ability to remove PDF AcroForm fields. 
  • Improved performance of the PAS GET MarkupLayers API when using AWS S3 storage, which significantly reduces network traffic between PAS and S3.

PrizmDoc provides customizable document processing to help developers deliver in-browser document creation, editing, and collaboration functionality that enhances their software applications.

For more information about PrizmDoc or to download a free trial, please visit our website.

About Accusoft: 

Founded in 1991, Accusoft is a software development company specializing in document processing, conversion, and automation solutions. From out-of-the-box and configurable applications to APIs built for developers, Accusoft software enables users to solve their most complex workflow challenges and gain insights from content in any format, on any device. Backed by 40 patents, the company’s flagship products, including OnTask, PrizmDoc™ Viewer, and ImageGear, are designed to improve productivity, provide actionable data, and deliver results that matter. The Accusoft team is dedicated to continuous innovation through customer-centric product development, new version release, and a passion for understanding industry trends that drive consumer demand. Visit us at www.accusoft.com.

Question

Can PrizmDoc handle password-protected files, such as PDFs or Excel files?
How would a user specify a password for a particular document?

Answer

It is possible to specify the password for a password-protected document when creating a viewing session in PrizmDoc. When sending a request to create a viewing session, you’ll use the password field in the request body to specify the password. For example…

POST http://localhost:3000/ViewingSession
Content-Type: application/json
{
    "source": {
        "type": "url",
        "url": "https://www.usability.gov/sites/default/files/creating-wireframes.pdf"
    },
    "password": "hunter2"
}

(Replace "hunter2" with the actual password)

Please note that even if a file requires a password and none is provided (or an incorrect one is provided), the viewing session will still be created successfully. The easiest way to determine whether the password is missing or incorrect is to request a page. You can do this by making a GET request to the GetPage route using the viewingSessionId created earlier, like so…

GET pas_base_url/Page/q/0?DocumentID=u{viewingSessionId}

…be sure to replace pas_base_url with the root of your Prizm Application Services (PAS) instance (usually this is http://localhost:3000) and replace {viewingSessionId} with the actual value for viewingSessionId created in the previous step.

The above call will return 200 OK if the page load is successful. If a password is required/incorrect, you should see a return status code 480. There will be additional response headers called accusoft-status-number and accusoft-status-message, which should be 4001 and "Document requires a password", respectively.
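Putting the two requests together, a small client-side check might look like the sketch below. It uses only the routes, status codes, and headers described above; the base URL, document URL, and password are the same placeholders used earlier, and the JSON handling is just one way to read the viewingSessionId from the response.

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class ViewingSessionCheck
{
    static readonly HttpClient http = new HttpClient { BaseAddress = new Uri("http://localhost:3000/") };

    static async Task Main()
    {
        // Create a viewing session, supplying the document password (replace "hunter2" with the real one).
        var body = new StringContent(
            "{\"source\":{\"type\":\"url\",\"url\":\"https://www.usability.gov/sites/default/files/creating-wireframes.pdf\"},\"password\":\"hunter2\"}",
            Encoding.UTF8, "application/json");
        HttpResponseMessage createResponse = await http.PostAsync("ViewingSession", body);
        string viewingSessionId = JsonDocument.Parse(await createResponse.Content.ReadAsStringAsync())
            .RootElement.GetProperty("viewingSessionId").GetString();

        // Request the first page: 200 means the password (if any) worked; 480 means it is missing or wrong.
        HttpResponseMessage pageResponse = await http.GetAsync($"Page/q/0?DocumentID=u{viewingSessionId}");
        if ((int)pageResponse.StatusCode == 480 &&
            pageResponse.Headers.TryGetValues("accusoft-status-number", out var statusNumbers))
        {
            Console.WriteLine($"Password problem: accusoft-status-number = {string.Join(",", statusNumbers)}");
            // Re-create the viewing session with the correct password and try again.
        }
        else
        {
            Console.WriteLine($"Page request returned {(int)pageResponse.StatusCode}.");
        }
    }
}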

You can see the above in greater detail in the product documentation.

You can use this information to re-create a viewing session with the correct password.

Currently, there is a feature request planned for a potential future release of PrizmDoc to prompt the user for a password if one is required.

OCR vs ICR

The days of manually transcribing scanned documents into an editable, digital document are thankfully long behind most organizations. Error-prone manual processes have largely given way to automated document and forms processing technology that can turn scanned documents into a more manageable form with a much higher degree of accuracy. 

Much of this transition was made possible by the proliferation of optical character recognition (OCR) and intelligent character recognition (ICR). While they perform very similar tasks, there are some key differences between them that developers need to keep in mind as they build their document and form processing applications.

How Does Character Recognition Technology Work?

Character recognition technology allows computer software to read and recognize text contained in an image and then convert it into a document that can be searched or edited. Since the process involves something that humans can do quite easily (namely, reading text), it’s easy to assume that this would be a rather trivial task for a computer to accomplish.

In reality, getting a computer program to correctly identify text and convert it into editable format is an incredibly complex challenge complicated by a wide range of variables. The problem is that when a computer examines an image, it doesn’t see people, backgrounds, or text as distinct images, but rather as a pattern of pixels. Character recognition technology helps computers distinguish text by telling them what patterns to look for.

Unfortunately, even this isn't as straightforward as it sounds. That's because there are so many different text fonts that depict the same characters in different ways. For example, a computer must be able to recognize that the letter "a" rendered in dozens of different fonts is still the same character.

When humans read text, they have a mental concept of what the letter “a” looks like, but that concept is incredibly flexible and can easily accommodate a broad range of variations. Computers, however, require precision. Programmers must provide them with clear parameters that help them to navigate unexpected variations and identify characters accurately.

Pattern Recognition

The earliest versions of character recognition, developed in the 1960s, relied on pattern recognition techniques, which scanned images and searched for pixel patterns that matched a library of font characters stored in memory. Once those patterns were located, the software could translate the characters into searchable, editable text in a document format. Unfortunately, the patterns had to be an exact pixel match, which severely limited how broadly the technology could be applied.

One of the first specialized fonts developed to facilitate pattern recognition was OCR-A. A simple monospace font (meaning that each character has the same width), OCR-A was used on bank checks to help banks quickly scan them electronically. Although pattern recognition libraries expanded over the years to incorporate common print fonts like Times New Roman and Arial, this still presented serious limitations, especially as the variety of fonts continued to grow. With one popular font finding website indexing more than 775,000 available fonts in 2021, pattern recognition needed to be supplemented by another approach to character recognition.

Feature Detection

Also known as feature extraction, feature detection focuses on the component elements of printed characters rather than looking at the character as a whole. Where pattern recognition tries to match characters to known libraries, this approach looks for very specific features that distinguish one character from another. A character that features two angular lines that come to a point and are crossed by a horizontal line in the middle, for instance, is almost always an "A," regardless of the font used. Feature detection focuses on these qualities, which allows it to identify a character even if the program has never encountered a particular font before. Even so, this approach must take the many different ways of rendering a character like "A" into consideration when setting its parameters.

Most character recognition software tools utilize feature detection because it offers far more flexibility than pattern recognition. This is especially valuable for reading document images with faded ink or some degradation that could prevent an exact pattern match. Feature detection provides enough flexibility for a program to be able to identify characters under less than ideal circumstances, which is important for any application that has to deal with scanned images.

OCR vs ICR: What’s the Difference?

Optical character recognition (OCR) is typically understood to apply to any recognition technology that reads machine printed text. A classic OCR use case would involve reading the image of a printed document, such as a book page, newspaper clipping, or a legal contract, and then translating the characters into a separate file that could be searched and edited with a document viewer or word processor. It’s also incredibly useful for automating forms processing. By zonally applying the OCR engine to form fields, information can be quickly extracted and entered elsewhere, such as a spreadsheet or database.

When it comes to form fields, however, information is frequently entered by hand rather than typed. Reading hand-printed text adds another layer of complexity to character recognition. The range of more than 700,000 printed font types is insignificant compared to the near infinite variations in hand-printed characters. Not only must the recognition software account for stylistic variations, but also the type of writing implement used, the quality of the paper, mistakes, steadiness of hand, and smudges or running ink.

Intelligent character recognition (ICR) utilizes constantly updating algorithms to gather more data about variations in hand-printed characters to identify them more accurately. Developed in the early 1990s to help automate forms processing, ICR makes it possible to translate manually entered information into text that can be easily read, searched, and edited. It is most effective when used to read characters that are clearly separated into individual areas or zones, such as fixed fields used on many structured forms.

Both OCR and ICR can be set up to read multiple languages, although limiting the range of expected characters to fewer languages will result in more optimal recognition results. Critically, ICR does not read cursive handwriting because it must still be able to evaluate each individual character. With cursive handwriting, it’s not always clear where one character ends and another begins, and the individual variations from one sample to another are even greater than with hand-printed text. Intelligent word recognition (IWR) is a newer technology that focuses on reading an entire word in context rather than identifying individual characters.

To learn more about OCR and ICR technology and how they can transform your application when it comes to managing documents and automating forms processing, download our whitepaper on the topic today.

PDF Viewers

Few file formats are as widely recognized and used as PDF. In fact, PDFs have become so commonplace that it’s hard to imagine a time when they didn’t exist. Most users don’t even give them much thought, knowing that all they need to do is click on the file and trust that their PDF file viewer will be able to open and render it accurately. But things weren’t always quite so simple before PDF viewers.

Origins of PDF

It’s easy to take document viewing and printing for granted today, but to understand the development of the PDF format, it’s important to look back at the document challenges facing organizations in the early 1990s. Businesses, government agencies, and universities were already using local area networks to share digital documents, but there was no guarantee that a document would display the same way on every machine. In addition to multiple competing word processor formats (such as Microsoft Word and Corel WordPerfect), there was no reliable way of viewing files containing images or other layout elements across different software and operating systems.

Around that time, Adobe co-founder John Warnock became focused on the idea of creating a standardized document format that would work across all operating systems and effectively function like digital paper. The primary goal was to ensure that the document contents would look the same no matter where they were viewed. That meant solving complex challenges like replacing unsupported fonts without distorting the document’s layout and distilling graphic parameters to flatten the file so it would load within seconds instead of minutes.

Adobe released the first version of the Portable Document Format (PDF) in 1993, but it would take some time for the format to catch on. “The world didn’t get it,” Warnock recalled in a 2010 interview. “They didn’t understand how important sending documents around electronically was going to be.”

The early years were rocky, largely because PDF was slightly ahead of its time. Early PDFs had limited functionality and were too large to be sent quickly over early internet connections. That began to change in 1996, however, when the Internal Revenue Service (IRS) used PDFs to provide downloadable tax return forms and instructions online. The IRS also started using PDF files to digitize its internal document processes, largely phasing out its reliance upon paper documents for auditing. This adoption convinced many hesitant organizations that if the format was good enough for the IRS, then it was good enough for them as well.

The Growth of PDF File Viewers

In the years following the introduction of PDF as an open format, a unique “freemium” model emerged that helped to promote its use across a variety of industries. While developers sold software that could be used to create, convert, edit, and secure PDFs, they also offered more streamlined PDF file viewers for free. This ensured that anyone could easily open and view PDF files no matter what kind of computer or operating system they were using. 

Although early readers were offered as separate software applications, they quickly became available as libraries that could be integrated into an existing application. By integrating a PDF file viewer directly into an application, developers could provide secure PDF support without having to rely upon any external software.

Today, there are multiple PDF file viewers available, which often makes it difficult to identify the one that provides the right combination of rendering performance and security for a particular industry’s needs.

The Rendering Challenge

Rendering a PDF file accurately is a deceptively complex task because not every file is constructed in the same way. In fact, prior to the PDF standard being taken over by the International Organization for Standardization (ISO) in 2007, Adobe's documentation surrounding the format was rather infamously vague, resulting in the creation of poorly optimized PDFs that third-party readers had difficulty viewing properly. Some PDF file viewers address this challenge by adding new code to accommodate known issues, but this has the unpleasant side effect of giving the reader a larger footprint and potentially impacting performance.

This challenge has become even more complex in recent years given the popularity of mobile devices. Effective PDF file viewers must be able to deliver a responsive viewing experience that can adjust their user interface (UI) to different sizes and types of screens.

The Security Challenge

Security has always been an important consideration for PDF file viewers, but it has become a much more prominent concern since the first virus capable of embedding itself in PDF files was uncovered in 2001. Unfortunately, security vulnerabilities continue to be a problem with third party PDF readers, as evidenced by the multiple vulnerabilities discovered in Adobe’s PDF products in 2020. While developers have more PDF file viewers to choose from than ever before, finding a solution that doesn’t introduce security risks has become a high priority when building a new application.

One of the best ways for developers to resolve these security challenges is to build PDF capabilities directly into their already secure applications. Viewing or creating a PDF file in an external program, such as third-party software or even a web browser, introduces a potential functionality and control gap. It's difficult to control what can be done with a PDF once it travels outside the confines of a secure application environment, where it can be downloaded, viewed, and potentially altered. With PDFs set to continue as the de facto standard for digital documents, it makes more sense than ever for developers to give their applications the ability to manage those files natively, without having to interface with external software dependencies.

Find the PDF File Viewer That’s Best for You

Developers have many choices when it comes to integrating PDF viewing capabilities, which is why Accusoft has developed a broad range of PDF integrations to address every potential use case. Our Accusoft PDF Viewer delivers a high-speed, lightweight JavaScript library that offers out-of-the-box mobile support and requires only a few lines of code to install. Available as a free-to-use integration, it’s the fastest way to add dynamic PDF viewing capabilities to your application without any configuration headaches. 

If your application needs more than just support for PDF viewing, PrizmDoc Viewer provides production-scale annotation, redaction, and conversion for multiple file types. As an HTML5 viewer, PrizmDoc Viewer easily integrates into applications to create a secure environment for documents and images. Try it today using an online demo or download a free trial to see how PrizmDoc Viewer can transform the way your application handles and views documents. 

Question

I want the Thumbnail tab in PrizmDoc Viewer to be open by default. How can this be done?

Answer

A simple solution is to programmatically click the thumbnail toggle button when the Viewer first opens, provided you're fine with the user still being able to close the thumbnail pane:

$("[data-pcc-toggle=\"dialog-thumbnails\"]").click();

If you'd rather have the tab always open, viewer.js contains a function called toggleDialogs(opts) that checks whether the thumbnail pane is being toggled (via opts.toggleID) and, if so, adds openClass to the thumbDialog. You could modify this logic so that the thumbnail pane is permanently open.

How Accusoft’s PrizmDoc Improves Upon PDF.js

The ability to view PDF files has become an essential feature for web-based applications. While dedicated desktop readers are still common, the average user justifiably expects to be able to view documents without switching between applications. Thanks to browser-based PDF libraries like PDF.js, developers can both integrate the viewing features they need and build the next generation of PDF viewing integrations.

What Is PDF.js?

An open-source JavaScript PDF library, PDF.js was originally developed by the Mozilla Foundation in 2011 to serve as the built-in PDF viewer for the Firefox web browser. At the time, web browsers depended upon separate reader applications or browser plug-ins to view PDFs.

Unfortunately, this created several security risks. External plug-ins can contain malicious code or gather data that could endanger privacy. Downloading PDFs for local viewing is also potentially hazardous because it means the file must be removed from a secure application environment.

PDF.js uses Asynchronous JavaScript and XML (Ajax) to render PDFs as an HTML5 <canvas> element directly within a web application. Since it uses JavaScript for rendering, PDF.js is compatible with all modern browsers and doesn’t require any additional plug-ins.

In addition to being integrated into Firefox, the software was also made available as open-source code. This made it possible for independent developers to expand upon the core capabilities of PDF.js in the years since its release.

Should You Build or Buy a PDF.js Viewer?

The open-source availability of the PDF.js library makes it an attractive solution for software teams looking to add native viewing functionality to their applications. As with many open-source frameworks, however, developers may quickly run up against a few complications when building out a viewing solution from scratch.

Out-of-the-box, PDF.js consists of three basic layers:

  • Core Layer: The heart of the JavaScript PDF library, this layer parses and interprets binary instructions from the file itself.
  • Display Layer: This interface handles the actual rendering of the PDF into a <canvas> element.
  • Viewer Layer: The primary viewing interface that allows users to view and interact with the document.

While the core and display layers can handle most documents, PDF.js doesn’t support the full PDF specification and sometimes struggles with rendering lengthy, complex, or image-heavy files. Overall performance is often on the slow side, and the way text is rendered makes text search somewhat unreliable.

More importantly, PDF.js lacks out-of-the-box mobile support. The included viewer doesn’t provide essential mobile UI features like pinch-to-zoom. It also doesn’t respond dynamically to mobile screens to ensure that menus and tools remain usable on all devices.

Any developer looking to add PDF viewing and editing capabilities to their web applications using PDF.js will need to solve these core issues. While features like responsive, mobile-friendly viewing may have been less important when PDF.js was first released in 2011, they are considered essential by most users today. Unfortunately, building out these capabilities takes time and resources, which is something few development teams have in abundance.

Integrating a ready-made viewer that combines the solid foundation of PDF.js with the innovative features users expect allows developers to quickly meet their project needs without pulling attention away from key aspects of their application.

Integrate PDF Solutions with Accusoft

While PDF.js has long served as an adequate open-source PDF viewing solution for web applications, today’s average user simply requires more functionality than PDF.js can provide on its own. For developers who lack the time, resources, or expertise necessary to build those additional features, Accusoft can help.

For over 30 years, Accusoft has helped organizations add essential features like viewing, file conversion, document assembly, and image compression to their applications through an innovative line of SDKs and APIs. 

Our document lifecycle technologies are backed by multiple patents and have been incorporated successfully into a wide range of applications. Accusoft’s dedicated engineers provide ongoing support and work closely with customers to implement their specific use cases, ensuring that their software platform is delivering the best possible experience.

To learn more about PDF viewing and editing solutions from Accusoft, talk to one of our technology experts today.