How FormFix Achieves Better Forms Recognition Accuracy
Without accurate forms recognition, developers will have a hard time building effective forms processing applications. The primary advantage of forms processing workflows is the ability to automate the data capture process, but the benefits of automation quickly evaporate if the system continually misidentifies forms or can’t align document images cleanly for accurate data capture. Accusoft’s FormFix SDK ensures that forms processing applications get off to the best start possible by quickly matching and aligning form images to predefined templates.
When to Use FormFix
FormFix works with structured forms that feature a standardized layout with fields located in fixed positions. The SDK has a number of use cases as part of a broader forms processing and data capture workflow. Its primary function is to identify form images and route them to the proper destination. In some instances, this will mean handing the recognized form off to the SmartZone integration, which performs optical character recognition (OCR) and intelligent character recognition (ICR) to extract machine-printed and hand-printed text from form fields. If data capture doesn’t need to be done immediately, the form can instead be routed to a storage location for later reference.
But FormFix can do more than simply identify forms. It also features powerful optical mark recognition (OMR) capabilities, which allow it to detect marks in the fillable bubbles and checkboxes commonly used on a wide range of forms. Without OMR, a forms processing application is forced to rely on manual data entry for any form that contains these marks, which typically indicate information like marital status, health history, ethnic background, or other demographic data. Deploying OMR to process these forms automatically helps to minimize the risk of human error and speeds up processing times. In addition to being able to read single or multiple marks, FormFix can also use OMR to detect the presence of a signature on a document.
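To make the idea of mark detection concrete, here is a minimal Python sketch of one common OMR approach: measure how much of a bubble's area is dark and compare it to a threshold. It is not FormFix's algorithm or API. The function names, the 0/1 pixel grid, and the 0.4 threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical types for illustration; not part of the FormFix API.
Image = List[List[int]]          # binarized page: 1 = dark pixel, 0 = light pixel
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def fill_ratio(image: Image, box: Box) -> float:
    """Return the fraction of dark pixels inside the box."""
    top, left, bottom, right = box
    total = (bottom - top) * (right - left)
    if total <= 0:
        return 0.0
    dark = sum(image[r][c] for r in range(top, bottom) for c in range(left, right))
    return dark / total


def read_marks(image: Image, bubbles: Dict[str, Box],
               threshold: float = 0.4) -> Dict[str, bool]:
    """Report a bubble as marked when its fill ratio reaches the threshold."""
    return {name: fill_ratio(image, box) >= threshold
            for name, box in bubbles.items()}


# Tiny example page: the "single" bubble is filled in, while the "married"
# bubble contains only one stray dark pixel.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
bubbles = {"single": (1, 1, 3, 3), "married": (1, 4, 3, 6)}

marks = read_marks(page, bubbles)
print(marks)  # {'single': True, 'married': False}
```

The same fill-ratio test could, in principle, flag whether a signature box contains ink, although a production system would typically need more robust handling of noise and partial marks.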
Creating Master Form Templates for Identification
Before FormFix can start identifying form images, it first needs FormSets to work with. A FormSet consists of several FormDefinitions, each of which represents a single form page. Every FormDefinition object contains compressed image data of a form template and indicates the fields from which data can be extracted. Individual fields can also carry specific instructions to be performed at the time of processing, such as despeckling or other forms of image enhancement and clean-up.
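The relationship between these objects can be pictured with a small Python data model. This is only a sketch of the hierarchy described above, not the actual FormDirector object model: the class names, attributes, and the use of zlib for the compressed template image are assumptions chosen for illustration.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FieldSpec:
    """Hypothetical stand-in for one extractable field on a form page."""
    name: str
    box: Tuple[int, int, int, int]                       # (top, left, bottom, right)
    operations: List[str] = field(default_factory=list)  # e.g. ["despeckle"]


@dataclass
class FormDefinitionSketch:
    """Hypothetical stand-in for one form page and its template image."""
    name: str
    template_image: bytes                                # compressed image data
    fields: List[FieldSpec] = field(default_factory=list)


@dataclass
class FormSetSketch:
    """Hypothetical stand-in for a collection of form pages."""
    name: str
    forms: List[FormDefinitionSketch] = field(default_factory=list)


# Placeholder pixel data standing in for a scanned blank form page.
blank_page = b"\x00" * 64

form_set = FormSetSketch(
    name="tax_forms",
    forms=[
        FormDefinitionSketch(
            name="irs_1040_page1",
            template_image=zlib.compress(blank_page),
            fields=[
                FieldSpec("first_name", (10, 10, 20, 120), ["despeckle"]),
                FieldSpec("filing_status", (30, 10, 40, 60)),
            ],
        )
    ],
)

for form in form_set.forms:
    print(form.name, [f.name for f in form.fields])
```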
FormSets are created by the FormDirector API, which allows developers to designate what information needs to be extracted and where on a form it should be extracted from. Setting up a FormSet template for an IRS 1040 form, for instance, would involve designating which fields on an unfilled form will be matched and aligned so the information contained in them can be captured accurately. Developers can also create their FormSets using the FormAssist application, which is a graphical interface for FormDirector that allows them to easily upload blank form images and specify how each field should be handled during processing.
How FormFix Identifies Form Images
After a form image is uploaded and cleaned up (usually using the ScanFix Xpress SDK), it can be identified and aligned for data capture. FormFix uses its forms recognition processor to examine the input image and compare it to the available FormSets on file. It does this by looking at the FormDefinitions within the FormSets and matching their embedded template images to the current input image. Once a potential match is identified, FormFix selects the appropriate template and reports a confidence value for each identification candidate.
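The general shape of template matching with confidence values can be sketched in a few lines of Python. This is a deliberately naive illustration and not how FormFix scores candidates: it assumes binarized images of identical size, uses simple pixel agreement as the confidence measure, and the function names and the 0.8 cutoff are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]  # binarized image: 1 = dark pixel, 0 = light pixel


def similarity(a: Image, b: Image) -> float:
    """Fraction of pixels that agree; 0.0 when the shapes differ or are empty."""
    if not a or len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(1 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb) if pa == pb)
    return same / total


def identify(image: Image, templates: Dict[str, Image],
             min_confidence: float = 0.8
             ) -> Tuple[Optional[str], List[Tuple[str, float]]]:
    """Return the best template name (or None) plus all ranked candidates."""
    candidates = sorted(
        ((name, similarity(image, tmpl)) for name, tmpl in templates.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    if candidates and candidates[0][1] >= min_confidence:
        return candidates[0][0], candidates
    return None, candidates


templates = {
    "form_a": [[1, 1, 1], [0, 0, 0], [1, 1, 1]],
    "form_b": [[1, 0, 1], [1, 0, 1], [1, 0, 1]],
}
# A filled-in copy of form_a: one extra dark pixel in the middle.
scan = [[1, 1, 1], [0, 1, 0], [1, 1, 1]]

best, ranked = identify(scan, templates)
print(best)
for name, confidence in ranked:
    print(f"{name}: {confidence:.2f}")
```

Returning every candidate with its score, rather than only the winner, lets the calling application decide how to handle low-confidence or ambiguous matches, for example by routing them to manual review.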
The FormFix alignment algorithm then takes over, placing the input image on the form template and making a series of adjustments to ensure that the field areas line up as precisely as possible. It can also perform form image drop-out, which removes the pre-printed graphical elements found in the template, such as form field boxes and instructional text, and leaves only the filled-in information behind. This helps improve recognition accuracy, whether the application is using SmartZone OCR/ICR or deploying FormFix’s OMR capabilities.
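Once the input image is aligned to the template, drop-out is conceptually a pixel-wise subtraction. The Python sketch below illustrates that idea only. It assumes perfectly aligned, binarized images of equal size, and the drop_out function is a hypothetical name rather than a FormFix call. A real implementation has to tolerate small misalignments and preserve handwriting that crosses pre-printed lines, which this sketch does not attempt.

```python
from typing import List

Image = List[List[int]]  # binarized image: 1 = dark pixel, 0 = light pixel


def drop_out(aligned: Image, template: Image) -> Image:
    """Keep only dark pixels that are absent from the blank template."""
    return [
        [1 if pixel == 1 and blank == 0 else 0
         for pixel, blank in zip(row, blank_row)]
        for row, blank_row in zip(aligned, template)
    ]


# The template is an empty field box; the filled form has one mark inside it.
template = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
filled = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]

remaining = drop_out(filled, template)
for row in remaining:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 0, 0]
# [0, 0, 0, 0]
```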
Improving FormFix Forms Recognition
Although FormFix is capable of quickly matching form images with the master forms the application has on file, there are a few steps developers can take to streamline the forms recognition process and improve workflow performance. For example, FormFix can be set to compare images only at 90, 180, and 270 degrees, or to limit the amount of effort it spends on forms identification.
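To show why restricting the orientations to test saves work, the Python sketch below rotates an input image in 90-degree steps and scores each orientation against a template, checking only the angles it is told to check. It is an illustration under simplifying assumptions (binarized grids, pixel agreement as the score) and the function names and the allowed parameter are hypothetical, not FormFix settings.

```python
from typing import List, Optional, Sequence, Tuple

Image = List[List[int]]  # binarized image: 1 = dark pixel, 0 = light pixel


def rotate_cw(image: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def similarity(a: Image, b: Image) -> float:
    """Fraction of pixels that agree; 0.0 when the shapes differ or are empty."""
    if not a or len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(1 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb) if pa == pb)
    return same / total


def best_orientation(image: Image, template: Image,
                     allowed: Sequence[int] = (0, 90, 180, 270)
                     ) -> Tuple[Optional[int], float]:
    """Score only the allowed clockwise rotations and return the best one."""
    best_angle, best_score = None, -1.0
    for angle in allowed:
        rotated = image
        for _ in range((angle // 90) % 4):
            rotated = rotate_cw(rotated)
        score = similarity(rotated, template)
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle, best_score


template = [[1, 1], [0, 0], [0, 0]]
sideways_scan = rotate_cw(template)  # simulate a page fed into the scanner sideways

print(best_orientation(sideways_scan, template))                    # (270, 1.0)
print(best_orientation(sideways_scan, template, allowed=(0, 180)))  # (0, 0.0)
```

Every orientation removed from the allowed list is one fewer comparison per template, which adds up quickly when many FormDefinitions are on file.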
When setting up templates, developers can define which image operations need to be completed for each input image. These parameters can be set at different levels of the hierarchy, so some operations may be applied to all forms while others are applied only to specific FormDefinitions or form fields. This eliminates unnecessary image processing operations that may slow down workflows while still ensuring that consistent adjustments are made where they’re needed.
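One way to picture this layering is a function that merges the operations defined for the whole FormSet, for a specific form, and for a specific field into a single ordered list. The Python sketch below is a hypothetical illustration of that idea. The resolve_operations name, the operation names, and the broad-to-narrow ordering rule are assumptions, not documented FormFix behavior.

```python
from typing import List, Sequence


def resolve_operations(set_ops: Sequence[str],
                       form_ops: Sequence[str],
                       field_ops: Sequence[str]) -> List[str]:
    """Merge operations from broadest to narrowest scope, dropping repeats."""
    resolved: List[str] = []
    for op in [*set_ops, *form_ops, *field_ops]:
        if op not in resolved:
            resolved.append(op)
    return resolved


# Operations that apply everywhere, to one form, and to one field (all made up).
set_level = ["deskew"]
form_level = ["remove_lines"]
field_level = ["despeckle", "deskew"]  # "deskew" is already covered at set level

print(resolve_operations(set_level, form_level, field_level))
# ['deskew', 'remove_lines', 'despeckle']

# A field with no extra instructions inherits only the broader operations.
print(resolve_operations(set_level, form_level, []))
# ['deskew', 'remove_lines']
```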
If possible, barcodes can be affixed to different form types to quickly indicate which template needs to be referenced for the form alignment process. This allows FormFix to bypass the identification process and proceed directly to aligning the form images for drop-out and recognition.
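The barcode shortcut amounts to a lookup that runs before the more expensive identification step. The following Python sketch shows that routing logic in isolation. The choose_template function, the barcode values, and the template names are all hypothetical, and reading the barcode itself is assumed to happen elsewhere.

```python
from typing import Callable, Dict, List, Optional, Tuple


def choose_template(barcode: Optional[str],
                    barcode_to_template: Dict[str, str],
                    identify: Callable[[], str]) -> Tuple[str, str]:
    """Use the barcode lookup when possible; otherwise fall back to identification."""
    if barcode is not None and barcode in barcode_to_template:
        return barcode_to_template[barcode], "barcode"
    return identify(), "identification"


calls: List[str] = []


def slow_identify() -> str:
    """Stand-in for full image-based identification; records each invocation."""
    calls.append("identify")
    return "w9_page1"


lookup = {"F1040-P1": "irs_1040_page1"}

with_barcode = choose_template("F1040-P1", lookup, slow_identify)
without_barcode = choose_template(None, lookup, slow_identify)

print(with_barcode)     # ('irs_1040_page1', 'barcode')
print(without_barcode)  # ('w9_page1', 'identification')
print(len(calls))       # 1: identification ran only for the page without a barcode
```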
Accelerate Your Forms Processing Workflow with FormFix
Accusoft’s FormFix SDK helps your forms processing application to quickly identify form images, prepare documents for zonal and full page OCR/ICR, and extract information with OMR functionality. Fully customizable to meet the needs of your forms workflow, FormFix also includes a variety of image cleanup tools that can remove imperfections and noise to improve recognition accuracy.
To learn more about the capabilities of the FormFix SDK and see how it fits into a broader forms processing solution, download our FormFix Fact Sheet today or contact one of our integration experts for more information.