Improving Intelligent Character Recognition (ICR) Accuracy with Better Form Design
Rick Scanlan, Accusoft Director of Sales Engineering
Throughout history, people have looked for ways to gather information. The invention of the printing press allowed large-scale production of documents and printing of the earliest forms. Until the 1980s, information collected from forms was tabulated by hand or manually entered into a computer. Hand print recognition technology, more commonly known as Intelligent Character Recognition (ICR), has progressed significantly since that time, but the accuracy and productivity of forms processing remain highly dependent on form design.
There are many factors to consider when designing a form to collect hand printed responses. First and foremost, the form needs to be easily understood by your target audience. The form also needs to constrain the response area and clearly identify where the user should write their responses.
Remember that the person filling out the form is usually out of your control. No matter how well you design your form, there will always be responses that can’t be read automatically. You can encourage the form fillers to write neatly, and keep their responses within the spaces allotted, but there will always be people who don’t read instructions (or ignore them) and assume that the form will be read by a human, not by a computer. They may do things like writing a character by mistake, then drawing a big “X” over it to “delete” it. Some people have poor handwriting, or always write in cursive.
Accuracy vs Confidence
Although they’re often used interchangeably, “accuracy” and “confidence” have different meanings with regard to ICR software. Accuracy represents the percentage of actual text that is read correctly. Since character recognition applications don’t actually know when they misread a character, they cannot self-report accuracy. It can only be calculated after the recognition process by comparing the “ground truth” (the actual text) with the application’s reported recognition results.
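The accuracy calculation described above can be sketched in a few lines. This is a minimal illustration, not the scoring method of any particular ICR product: it assumes accuracy is measured per character using edit distance, so that inserted or dropped characters are counted as errors along with substitutions. The function names are hypothetical.

```python
def edit_distance(truth: str, result: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    previous = list(range(len(result) + 1))
    for i, t in enumerate(truth, start=1):
        current = [i]
        for j, r in enumerate(result, start=1):
            cost = 0 if t == r else 1
            current.append(min(previous[j] + 1,          # character missing from result
                               current[j - 1] + 1,       # extra character in result
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def character_accuracy(truth: str, result: str) -> float:
    """Percentage of ground-truth characters that were read correctly."""
    if not truth:
        return 100.0 if not result else 0.0
    errors = edit_distance(truth, result)
    return max(0.0, 100.0 * (len(truth) - errors) / len(truth))
```

For example, scoring the result "SCAMLAN" against the ground truth "SCANLAN" counts one error in seven characters, or roughly 85.7% accuracy.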
“Confidence,” on the other hand, represents how certain the application is that it has identified a character correctly. Each character result generally has a confidence value ranging from 0 to 100, which can be calculated based upon a variety of recognition characteristics. Confidence values can also be returned for each line of characters or each field in a form in addition to each character individually.
ICR applications typically apply a minimum confidence threshold to these values. If the ICR engine’s confidence does not exceed this threshold when reading a given character, it may reject the character and replace its text output with a placeholder until it can be reviewed manually. This is typically done when the engine is unable to determine what a pattern of pixels represents. Some engines (such as Accusoft’s SmartZone) can instead be configured to report the character result with the highest confidence or return a list of possible characters, each with individual confidence values. A final determination can then be made through human review or other data validation operations.
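The rejection behavior described above might look like the following sketch. It assumes each character arrives with a confidence from 0 to 100 and that a minimum confidence value has been chosen. The `CharResult` type, the `apply_threshold` function, the value of 80, and the "?" placeholder are all hypothetical and do not represent any engine's actual API.

```python
from dataclasses import dataclass


@dataclass
class CharResult:
    text: str
    confidence: int  # 0-100, as reported by the (hypothetical) engine


def apply_threshold(results, threshold=80, placeholder="?"):
    """Replace low-confidence characters with a placeholder.

    Returns the output text and the positions that need manual review.
    """
    output = []
    review_positions = []
    for index, result in enumerate(results):
        if result.confidence > threshold:  # must exceed the threshold to be accepted
            output.append(result.text)
        else:
            output.append(placeholder)
            review_positions.append(index)
    return "".join(output), review_positions
```

The list of review positions is what a downstream manual review step would use to show only the suspect characters to an operator.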
The industry accuracy average for ICR applications is about 70%. That means that three out of every ten characters are read incorrectly or aren’t recognized with a high enough confidence to be considered accurate. One should never expect 100% accuracy in any forms processing project, but a successful ICR application should exceed 70% accuracy. A rate of 85% or higher is considered good (although that’s still 15 bad characters out of every 100). With a little planning and some basic form design elements in place, however, you can usually exceed the 70% threshold.
In fact, since the way people fill out your forms has such a big impact on recognition results, taking small steps to improve compliance is immensely beneficial. Without changing any other aspects of your form, simply changing user instructions can provide a significant improvement in recognition rates.
5 Simple Ways to Improve ICR Software Recognition Rates
- Tell the user that the form will be processed by a computer.
- Stress the importance of writing plainly, carefully, and clearly.
- Ask them to use block letters and avoid cursive handwriting.
- Put the instructions in bold at the top of the form, or just above the first field.
- Show character examples such as how an “A” or a “2” should be formed.
Field Design Considerations for ICR Software
Properly laying out the areas for printed responses can have the most significant impact on accurate hand print content recognition. A common mistake in field design is to provide a freeform area for a response. This design is often a simple blank line where people should write. Without any character constraints, people will write in cursive, run their characters together, write on top of the line, or write multiple lines in a single line response area. All of these factors will have a serious impact on intelligent text recognition accuracy.
A form needs to have a defined response area for each character, encouraging character separation. Some approaches for character separation work better than others, and are described below.
Comb lines are horizontal lines with small vertical separators called tick marks. This is traditionally the most common type of hand print form design, often used in manual data entry applications. However, it’s not as well suited for automated ICR processing as other approaches. While the tick marks may encourage people to separate their characters, people rarely write the characters within each space. The tick marks on many forms are too close together, making it almost impossible for the average person to stay between the lines. The height of the tick marks also plays an important role in encouraging character separation.
If you use comb lines, provide plenty of room between each of the vertical tick marks. Make the tick marks tall enough to encourage people to write between them. A vertical height at least half the height of the expected character is usually sufficient.
Example of a Poor Comb Line
Example of a Good Comb Line
Character boxes are usually the best method to encourage character separation. A good character box design will allow users to write their characters completely within each box. Unfortunately, many forms contain boxes that are too small and too close together. People often can’t write small enough to keep an entire character within a box. Pencils usually create much wider strokes than pens, making it even harder to constrain the character. The following are some general guidelines for designing character boxes.
Each box should be square in shape. Rectangular boxes with the height taller than the width can make the user feel like they need to squeeze their characters into the space. This often results in characters written in a compressed vertical form, reducing accuracy. A square shape encourages wider, more normally formed characters.
Single character response locations, such as for Male (“M”) or Female (“F”), should be provided in a single box separated from other responses.
Multiple character response locations, such as a Name field, may contain separated boxes if space permits.
They could also be joined together when space is a consideration. If joined, a thick separator between the response locations, at least one-fourth the width of a response area, should be used to discourage characters entering other boxes.
Individual fields should be separated by enough space to easily identify where one field stops and the next starts. Spacing of at least 1.5 box widths is recommended to prevent users from interpreting the space as a valid character location.
Rows of fields stacked vertically should be separated by at least one half the height of an individual box.
Boxes can be printed with either solid black lines or dropout colors, depending on the scanning and forms processing technology used. Some forms processing technology, such as Accusoft’s FormSuite software development kit (SDK), best performs form identification and alignment when all boxes and form contents are retained. Software-based form dropout is used to remove the boxes from the image after scanning. The original image with boxes intact may then be archived for future reference.
Several forms processing systems require dropout colors to be used when printing forms. For example, a form is printed in red ink, and a red bulb in the scanner eliminates the red content when the image is captured. Unlike these types of forms processing technology, the FormSuite SDK doesn’t require any special printing, paper, or inks. It provides much greater printing flexibility and reduced printing costs. The use of general purpose scanning technology without requiring special bulbs may also reduce capital costs.
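The numeric character box guidelines given earlier in this section (square boxes, a separator of at least one-fourth of the box width for joined boxes, at least 1.5 box widths between fields, and at least half a box height between rows) can be checked automatically during form design. The following is a minimal sketch. The `BoxLayout` type and `check_layout` function are hypothetical, and all measurements are assumed to be in one consistent unit such as millimeters.

```python
import math
from dataclasses import dataclass


@dataclass
class BoxLayout:
    box_width: float
    box_height: float
    field_gap: float               # horizontal space between adjacent fields
    row_gap: float                 # vertical space between stacked rows of fields
    separator_width: float = 0.0   # only meaningful when boxes are joined
    joined: bool = False


def check_layout(layout: BoxLayout) -> list:
    """Return a list of guideline violations (empty if the layout passes)."""
    problems = []
    if not math.isclose(layout.box_width, layout.box_height):
        problems.append("boxes should be square")
    if layout.joined and layout.separator_width < layout.box_width / 4:
        problems.append("joined-box separator should be at least 1/4 of the box width")
    if layout.field_gap < 1.5 * layout.box_width:
        problems.append("fields should be at least 1.5 box widths apart")
    if layout.row_gap < 0.5 * layout.box_height:
        problems.append("rows should be at least half a box height apart")
    return problems
```

A check like this only covers the measurable guidelines. It does not replace testing the form with real users.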
Paper Thickness and Bleed-Through
The quality of paper can impact recognition accuracy in dual-sided forms. Form paper should be thick enough to prevent the back side content from bleeding through when scanned. Fields on form fronts and backs may also be offset to ensure that any bleed-through content from one side will not interfere with field recognition on the other.
Example of Bleed Through
Processing a Hand Print Form
Intelligent character recognition accuracy can often be increased through image enhancement and other pre-processing activities. Many scanners today include image enhancement technologies that will create a good representation of the original image. This enhanced image may work well for viewing or archival purposes, but it may not be the best to use for content recognition. Lines or boxes in the image may interfere with field recognition. Dot shaded fields may prevent easy recognition of filled content. Filled forms might be received via fax at a low resolution, where built-in image enhancement is not available. The use of post-scan image enhancement processes can significantly improve forms processing and intelligent text recognition.
A temporary copy of the image can be created solely for use with the ICR application. Enhancements are performed that directly impact recognition. If poor recognition results are received, additional enhancements may be performed, looping through a series of “enhance – attempt to recognize – enhance – attempt to recognize” processes until the field is read with high confidence or a decision is made to route the image for manual data entry. Once recognition is complete, the temporary image is deleted and the original image is archived.
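The "enhance, attempt to recognize" loop described above can be sketched as follows. This is an illustration under stated assumptions rather than a real SDK workflow: `recognize` is a hypothetical callable that returns a `(text, confidence)` pair, each item in `enhancements` is a hypothetical callable that returns a new, enhanced image, and the minimum confidence of 85 is an arbitrary example value.

```python
def recognize_with_retries(image, recognize, enhancements, min_confidence=85):
    """Try recognition, applying one more enhancement after each poor result.

    Returns the recognized text, or None if the field should be routed
    to manual data entry.
    """
    working = image  # temporary working image; the original is never modified
    text, confidence = recognize(working)
    for enhance in enhancements:
        if confidence >= min_confidence:
            break
        working = enhance(working)             # enhancements accumulate on the copy
        text, confidence = recognize(working)  # attempt recognition again
    if confidence >= min_confidence:
        return text
    return None  # all enhancements exhausted: send to manual data entry
```

Because each enhancement returns a new image, the original stays intact for archiving and the working copy can simply be discarded when the function returns.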
Certain enhancement processes are specifically designed to improve character recognition, especially when you don’t have control over the form design. For example, forms that contain shaded fields in response areas can be very difficult to recognize. Dot shading removal with character smoothing can significantly improve recognition of those fields.
Software-based form dropout, the removal of background form content, can allow recognition of content that has been written over master form elements. For example, users completing forms with comb lines often write on top of the lines, which makes recognition very difficult. An automated comb line removal process will remove the comb lines and reconstruct the intersecting characters, allowing for accurate recognition.
Before Comb Removal
After Comb Removal, with Character Repair
Improve ICR Application Performance with Focused Recognition and Data Validation
Some fields are designed to allow only certain characters to be entered. For example, a date field may allow only digits, or only digits, dashes, and slashes. A “Male/Female” field may only allow the characters M and F. Make sure your form contains instructions or examples for each field so the user knows which characters are allowed. ICR technology such as Accusoft’s SmartZone ICR/OCR component allows definition of allowable characters, increasing accuracy by focusing the recognition engine towards specific characters.
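To illustrate why restricting the character set helps, the sketch below filters a list of candidate characters against a field's allowed characters. It is a simplified post-processing stand-in, not how SmartZone or any other engine implements character set restriction internally. The `best_allowed` function and the candidate format of `(character, confidence)` pairs are assumptions.

```python
def best_allowed(candidates, allowed):
    """Pick the highest-confidence candidate that the field permits.

    candidates: list of (character, confidence) pairs for one position.
    allowed: string or set of characters valid for this field.
    Returns the winning pair, or None if no candidate is permitted.
    """
    permitted = [(char, conf) for char, conf in candidates if char in allowed]
    if not permitted:
        return None
    return max(permitted, key=lambda pair: pair[1])
```

In a date field that allows only digits, dashes, and slashes, a mark read as the letter "O" at confidence 62 or the digit "0" at confidence 58 resolves to "0", because the letter is not a valid answer for that field.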
Remember that the industry average for hand print recognition is only 70%. Data validation and correction are critical to a successful hand-printed forms recognition system. Use recognition confidence values to locate suspect characters. Use two or more ICR engines in a voting process, comparing the results from each engine to determine the highest confidence results. Recognized data should be compared against database tables, dictionaries, lookup tables, or other data validation tools.
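The voting and lookup steps above can be combined as in the sketch below. It assumes every engine returns the same number of characters for a field, each as a `(character, confidence)` pair, and it uses one possible voting rule: the character with the most votes wins, with ties broken by the highest confidence. The function names and the lookup table are hypothetical.

```python
def vote(engine_results):
    """Combine per-character results from several engines into one string.

    engine_results: one list of (character, confidence) pairs per engine,
    all of the same length.
    """
    voted = []
    for position in zip(*engine_results):
        tallies = {}  # character -> (vote count, best confidence seen)
        for char, confidence in position:
            count, best = tallies.get(char, (0, 0))
            tallies[char] = (count + 1, max(best, confidence))
        # Tuples compare by vote count first, then by best confidence.
        winner = max(tallies, key=lambda char: tallies[char])
        voted.append(winner)
    return "".join(voted)


def is_valid(value, lookup):
    """Validate a recognized value against a lookup table."""
    return value.upper() in lookup
```

A voted value that fails the lookup check is a natural candidate for manual review.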
A “key from image” process is typically required to validate low confidence data. You should develop a process to display suspect characters or fields to a human for manual data entry. Human interaction is the most expensive part of any data capture process, so any efforts you can take, such as strong form design or additional image enhancement processes, will easily pay for themselves when compared to the cost of manual data entry.
Test Your Form
It’s critical to develop a prototype of your form and then test it on a sample of actual users. Present the form to people who have not seen any previous versions and ask them to complete it. Statistical sampling and analysis are helpful when testing forms that will be used on a large scale. Forms for smaller audiences do not require scientific analysis. Just be sure that you employ representative users in the test.
You should also test your recognition processes with enough sample data to get a good sense of the results. Identify weaknesses in the form or recognition, make changes, then retest to confirm improved results.
A Note About OMR
Optical Mark Recognition (OMR), sometimes known as “mark sense,” is the analysis of form locations to determine if a mark is present. Examples of OMR zones include check boxes on a form to designate male or female, multiple choice responses on a high-stakes educational exam, or diagnosis results on a medical form. The OMR response areas may be a single box, multiple response zones such as a “check all that apply” field, or a true/false designation.
Many forms contain some type of OMR field. Designing an OMR field is simpler than for character recognition, but still requires careful consideration. Whether you use an oval bubble, square box, or open brackets, be sure the area is large enough for the user to easily mark within the designated space.
Common OMR field design errors include making the box too small for people to easily mark within the zone, or printing the boxes too close together, resulting in more than one box containing the mark. Some users will circle an OMR response area instead of filling in the box. Similar to character responses, providing clear instructions and example marks can significantly improve recognition results. Even great instructions will not prevent some people from marking a zone in error and then drawing a big “X” over it in an attempt to “delete” the mark. Business rules must be developed to handle multiple mark situations, and manual key-from-image operations are usually required to determine user intent.
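A business rule for a single-choice OMR field, such as a male/female check box pair, might be sketched as follows. The sketch assumes an upstream step has already measured the fraction of dark pixels in each zone. The `resolve_single_choice` function, the status labels, and the 0.4 cutoff are hypothetical and would need tuning against real scanned forms.

```python
def resolve_single_choice(fill_ratios, marked_at=0.4):
    """Apply a simple rule to a field where exactly one mark is expected.

    fill_ratios: dict mapping each zone label to its fraction of dark
    pixels (0.0 to 1.0).
    Returns (label, status), where status is "ok", "blank", or "review".
    """
    marked = [label for label, ratio in fill_ratios.items() if ratio >= marked_at]
    if len(marked) == 1:
        return marked[0], "ok"
    if not marked:
        return None, "blank"
    # More than one zone is marked: user intent is unclear, so route the
    # field to a key-from-image operator instead of guessing.
    return None, "review"
```

A "check all that apply" field would need a different rule, since multiple marks are valid there.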
Many factors influence the accuracy and success of a hand print forms processing system. The extra time and consideration spent in forms design will pay strong dividends in recognition accuracy and reduced costs for manual data entry. Carefully consider your target audience to design a form that will be easily understood and completed, and can be easily recognized and processed by ICR software.
Rick Scanlan joined Accusoft with the acquisition of TMSSequoia in December 2004. With 27 years of experience in the document imaging market, Rick has served in a variety of technical, business development, and corporate management roles. He has developed extensive expertise in a wide variety of imaging technologies including document viewing, information capture, image enhancement, and forms processing. Rick currently manages Accusoft’s sales engineering/pre-sales team and helps define Accusoft’s product strategy and future development. A native of Oklahoma, he earned Bachelor of Science degrees from Oklahoma State University in Business Management, Economics, and Management Science and Computer Systems.