Technical FAQs

Question

When using the PrizmDoc samples, the sample documents included are taking close to 1 minute to load in the viewer. The same also happens when uploading files into the sample.

The server processes show minimal impact on CPU and memory. However, hard drive utilization sporadically spikes to 100%.

Answer

We have found that Windows Defender with Real-Time scanning enabled can significantly impact performance. Once Real-Time scanning was disabled, the issue was immediately resolved.

To disable Real-Time scanning in Windows Defender, you can do the following:

  1. Right-click on the Windows Logo in the lower left-hand corner and select Control Panel.
  2. Select Windows Defender and then select Settings.
  3. Under the Real-Time protection section, slide the switch to Off.

Question

How can I improve the performance and memory usage of scanning/recognition in Barcode Xpress?

Answer

Barcode Xpress supports a number of optimization settings that can improve recognition performance, sometimes by up to 40%, and reduce memory usage. The best way to optimize Barcode Xpress is to fine-tune the properties of the Reader class to match your application’s requirements.

BarcodeTypes

  • The best way to increase performance is to limit which barcodes Barcode Xpress should search for. By default, BarcodeTypes is set to UnknownBarcode which targets all 1D barcodes.

MaximumBarcodes

  • This property will instruct Barcode Xpress to halt searching after finding a specified number of barcodes. The default value is 100.

Area & Orientation

  • If you know the location or orientation of your barcodes in your image, specifying an orientation (such as Horizontal) and area can prevent Barcode Xpress from searching for vertical or diagonal barcodes, or in places where barcodes would not exist.

ScanDistance

  • Raising this value increases performance by skipping more rows of the image, which is a looser recognition technique. However, it may cause Barcode Xpress to miss some barcodes.

Finally, the Barcode Xpress Professional edition does not impose a 40 page-per-minute limit on processing.
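The ScanDistance trade-off described above can be illustrated with a toy sketch. This is not Barcode Xpress code and does not reflect its actual algorithm; the class RowSkipDemo and its method are hypothetical names used only to show why skipping rows is faster but can miss thin features.

```csharp
using System;

// Hypothetical illustration only; not part of the Barcode Xpress API.
public static class RowSkipDemo
{
    // image[row, col] is true for a dark pixel.
    // Examines every scanDistance-th row and returns the index of the first
    // examined row that contains a dark pixel, or -1 if none is found.
    public static int FindFirstInkedRow(bool[,] image, int scanDistance)
    {
        if (scanDistance < 1)
            throw new ArgumentOutOfRangeException(nameof(scanDistance));

        int rows = image.GetLength(0);
        int cols = image.GetLength(1);

        for (int r = 0; r < rows; r += scanDistance)   // skipped rows are never read
        {
            for (int c = 0; c < cols; c++)
            {
                if (image[r, c])
                    return r;
            }
        }
        return -1;
    }
}
```

With ink on row 3 only, a scan distance of 1 finds it, while a scan distance of 2 examines rows 0, 2, 4, and so on and misses it entirely. A larger step reads fewer rows, which is where the speed-up comes from, at the cost of missing features shorter than the step.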

Question

We are interested to know how well the cloud offering scales. For example, can we process thousands of transactions per second? Is there a limit to the number of transactions that we can use concurrently?

Answer

The limitation will vary depending on the size of the documents and the type of work being performed on them.

PrizmDoc Cloud uses AWS servers to scale our services as demand increases. We currently rate limit total requests, which should not exceed 100 requests within an eight-second window. The window is rechecked every four seconds to determine whether the rate limit is still being exceeded.
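A client that wants to stay under this limit could track its own request timestamps before calling the service. The following is a minimal client-side sketch in C#, assuming a simple sliding window. The class name SlidingWindowThrottle is hypothetical and not part of any PrizmDoc SDK, and the sketch does not model how the service itself enforces the limit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical client-side helper; not part of any PrizmDoc SDK.
// Not thread-safe: callers on multiple threads would need to add locking.
public sealed class SlidingWindowThrottle
{
    private readonly int _maxRequests;
    private readonly TimeSpan _window;
    private readonly Queue<DateTime> _sent = new Queue<DateTime>();

    public SlidingWindowThrottle(int maxRequests, TimeSpan window)
    {
        _maxRequests = maxRequests;
        _window = window;
    }

    // Returns true and records the request if sending at 'now' keeps the
    // count within the window. Returns false if the caller should wait.
    public bool TryAcquire(DateTime now)
    {
        // Forget requests that have aged out of the window.
        while (_sent.Count > 0 && now - _sent.Peek() >= _window)
        {
            _sent.Dequeue();
        }

        if (_sent.Count >= _maxRequests)
        {
            return false;
        }

        _sent.Enqueue(now);
        return true;
    }
}
```

A caller would construct it with new SlidingWindowThrottle(100, TimeSpan.FromSeconds(8)), call TryAcquire(DateTime.UtcNow) before each request, and delay and retry whenever it returns false.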

Question

How accurate is PrizmDoc Viewer’s auto-redaction process? Is there any estimate of the percentage of matches that it will fail to redact?

Answer

If you’re performing auto-redaction as described in our documentation, then all matches for the input regular expression will be redacted in the final document. However, it’s difficult to express that confidence in the form of a percentage. PrizmDoc is subjected to a suite of automatic testing to ensure that its services are behaving as intended, including redactions. We never release a new version unless it passes 100% of those tests.

That being said, the primary way that inaccuracy could enter the system is if you’re attempting to redact scanned documents that have been OCRed. In this case, it’s a matter of your OCR software’s accuracy, rather than the accuracy of PrizmDoc’s redaction process. If the searchable text of the document doesn’t accurately reflect the visible text on the page (for example, if a smudged “discovery” is incorrectly recognized as “disccvery”), then the auto-redaction will be unable to recognize it due to the incorrect input it has been given. This isn’t a problem if the documents were not OCRed.
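The OCR failure mode described above can be reproduced with plain .NET regular expressions. This sketch does not call PrizmDoc and only demonstrates the matching behavior; the class OcrRedactionDemo is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical illustration; it does not use any PrizmDoc API.
public static class OcrRedactionDemo
{
    // Counts how many times 'pattern' matches in the searchable text.
    // A pattern can only match what the text layer actually contains.
    public static int CountMatches(string searchableText, string pattern)
    {
        return Regex.Matches(searchableText, pattern).Count;
    }
}
```

For the pattern @"\bdiscovery\b", the text "Filed during discovery." yields one match, while the misrecognized "Filed during disccvery." yields none, so there would be nothing for a redaction step to act on.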

Question

We are trying to create new viewing packages. In the [prizmdoc_process] table, the process shows as 100% complete, but the error code field indicates an Internal Error.

The document does not display in the viewing session and returns a 480 error with the following error code:

{errorCode: "ViewingPackageNotUsable"}

What might be the issue?

Answer

When creating viewing packages, the PrizmDoc Application Services (PAS) uses the PrizmDoc Server to do the conversion work. In order for the viewing package to be created successfully, the PrizmDoc Server needs to be licensed and healthy.

If you see the error “ViewingPackageNotUsable”, it can mean that the PrizmDoc Server is either not healthy or, specifically, not licensed.

To verify the PrizmDoc Server’s status and whether it is licensed, open the following URL in a web browser on the PrizmDoc Server:

http://localhost:18681/admin

Question

I am trying to deploy my ImageGear Pro ActiveX project and am receiving an error stating

The module igPDF18a.ocx failed to load

when registering the igPDF18a.ocx component. Why is this occurring, and how can I register the component correctly?

Answer

To register your igPDF18a.ocx component, you will need to run the following command:

regsvr32 igPDF18a.ocx

If you receive an error stating that the component failed to load, then that likely means that regsvr32 is not finding the necessary dependencies for the PDF component.

The first thing you will want to check is that you have the Microsoft Visual C++ 10.0 CRT (x86) installed on the machine. You can download this from Microsoft’s site here:

https://www.microsoft.com/en-us/download/details.aspx?id=5555

The next thing you will want to check for is the DL100*.dll files. These files should be included in the deployment package generated by the deployment packaging wizard if you included the PDF component when generating the dependencies. These files must be in the same folder as the igPDF18a.ocx component in order to register it.

With those dependencies, you should be able to register the PDF component with regsvr32 without issue.
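The file-location check described above can be automated before calling regsvr32. The following is a minimal sketch that only verifies the OCX and at least one DL100*.dll file sit in the same folder. The class PdfDependencyCheck is a hypothetical helper, and the sketch does not check for the Visual C++ runtime.

```csharp
using System;
using System.IO;

// Hypothetical pre-flight helper; not part of ImageGear.
public static class PdfDependencyCheck
{
    // Returns true if 'folder' contains igPDF18a.ocx and at least one
    // file matching DL100*.dll. It does not verify the Visual C++ CRT.
    public static bool HasPdfDependencies(string folder)
    {
        if (!Directory.Exists(folder))
        {
            return false;
        }

        bool hasOcx = File.Exists(Path.Combine(folder, "igPDF18a.ocx"));
        bool hasDlFiles = Directory.GetFiles(folder, "DL100*.dll").Length > 0;
        return hasOcx && hasDlFiles;
    }
}
```

If HasPdfDependencies returns false for the deployment folder, copying the missing files there before running regsvr32 is the first thing to try.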

Having the right file conversion tools in place can make or break an application. Developers frequently face the challenge of managing multiple file types within a consolidated workflow. Without effective conversion tools, users are forced to rely on external applications that compromise both efficiency and security.

Out of all the file formats developers must account for, PDFs remain among the most important. The ability to convert a wide variety of document and image file types into PDF format can provide an application with unmatched versatility. In fact, PDF conversion support is one of the keys to unlocking better workflow performance, security, and collaboration.

5 Reasons to Convert Files to PDF


1. PDF Format is Consistent

Sharing documents and images across different devices and operating systems can sometimes create problems if the recipient lacks the up-to-date software necessary to view the file properly. This is a particular challenge with documents created using Microsoft Word since the formatting could look quite different across different versions of the program. Since PDF files are designed to look the same no matter how they’re being viewed, the format is ideal for sharing. Both documents and images can display equally well as PDFs, so converting files into this format is a quick and easy way to make them accessible for viewing.

2. PDF Files Are Easily Compressed

Sharing large image files can be a challenge for many organizations. High-resolution JPEG or TIFF files are often too large to share over email or web-based applications. Converting them to compressed PDFs is a quick way to reduce file size for easier sharing while still retaining a copy of the original file. Since the compressed version is in PDF format, there is less chance of version confusion when someone needs to access the original source image.

3. PDFs Are Widely Supported

Although PDFs once required specialized viewing software, thanks to JavaScript-based libraries like PDF.js, they can now be viewed by a conventional web browser. For all intents and purposes, this has made PDF a universal file format that can be viewed on any device. Converting a file into PDF ensures that it will be accessible to anyone who is granted access to it, regardless of the device or operating system they’re using.

4. PDFs Offer Security Protections

For many organizations, protecting privacy and confidential information is incredibly important. Converting document and image files into PDF format allows them to take advantage of the standard’s security features. Passwords can be set to authorize viewing and editing access to a file, which not only helps to ensure privacy but also limits who can make changes to a file so version control is easier to maintain. Files can also be converted into PDF/A format for secure archival purposes.

5. PDFs Support Annotation Markups

Most PDF viewing solutions support some form of annotation markups, which allows multiple contributors to make notes and comments on a file. Converting documents or images into a PDF facilitates this collaboration while safely preserving the original version of the file for future reference. Since PDF viewers provide a variety of annotation tools, they offer a great deal of flexibility when it comes to marking up images and documents without having to depend upon specialized software. Image and document files with additional annotation layers can also be converted into flattened PDFs for easier viewing.

Converting Files to PDF Using ImageGear

Accusoft’s ImageGear provides an extensive array of file conversion tools that allow developers to easily save multiple document and image file types into PDF format. With these conversion capabilities built into the back end of their applications, developers can help customers streamline their file management.

Converting Microsoft Documents to PDF

ImageGear supports the conversion of multiple Microsoft Office documents, including Word (DOCX/DOC), Excel (XLSX/XLS), and PowerPoint (PPTX/PPT). The conversion engine supports all text elements, raster images, and graphic shapes for Microsoft Office Open XML and Microsoft Office 97-2003 formats. It can convert the entire document into a PDF as well as any designated page or page ranges. The following examples show how this can be done using C#.

Converting Microsoft Word to PDF

To convert a Microsoft Word document in its entirety, the first step involves loading the ImageGear filters to create the input and output instances: 

ImGearFileFormats.Filters.Add(ImGearOffice.CreateWordFormat());
ImGearFileFormats.Filters.Add(ImGearPDF.CreatePDFFormat());

For the next step, the PDF library needs to be initialized:

ImGearPDF.Initialize();

The ImGearFileFormats.LoadDocument method is then used to read all pages of the file:

ImGearDocument igDocument;
using (FileStream fileStream = new FileStream(inputFileName, FileMode.Open,
       FileAccess.Read, FileShare.Read))
{
   igDocument = ImGearFileFormats.LoadDocument(fileStream);
}

Finally, the ImGearFileFormats.SaveDocument method is used to save the output PDF:

using (FileStream fileStream = new FileStream(outputFileName, FileMode.Create,
       FileAccess.ReadWrite))
{
   ImGearFileFormats.SaveDocument(igDocument, fileStream, 0,
       ImGearSavingModes.OVERWRITE, ImGearSavingFormats.PDF, null);
}

Converting Microsoft Excel and PowerPoint to PDF

The process for converting Excel and PowerPoint files follows the same basic format as converting Word files. First, initialize the input, then modify the sample code from above for the appropriate formats.

To initialize Excel:

ImGearFileFormats.Filters.Add(ImGearOffice.CreateExcelFormat());

To modify the sample’s open file dialog for XLSX/XLS extensions:

ofd.Filter = @"DOCX files (*.docx)|*.docx|XLSX files (*.xlsx)|*.xlsx|XLS files (*.xls)|*.xls";

To initialize PowerPoint:

ImGearFileFormats.Filters.Add(ImGearOffice.CreatePowerPointFormat());

To modify the sample’s open file dialog for PPTX/PPT extensions:

ofd.Filter = @"DOCX files (*.docx)|*.docx|PPTX files (*.pptx)|*.pptx|PPT files (*.ppt)|*.ppt";

Converting an Image File to PDF

ImageGear PDF supports the conversion of multiple image types into PDF format just as easily as it converts documents, but the process looks a bit different in code. After initializing PDF support for ImageGear.NET, the following C# example can be used to load an image file and then save it as a PDF page. The conversion process can be used for any file format that ImageGear supports.

using System;
using System.IO;

using ImageGear.Core;
using ImageGear.Formats;
using ImageGear.Formats.PDF;
using ImageGear.Evaluation;

public void SaveImageAsPDF(string inputFilePathName, string outputFilePathName)
{
    try
    {
        const int FIRST_PAGE = 0;

        // Initialize evaluation license.
        ImGearEvaluationManager.Initialize();
        ImGearEvaluationManager.Mode = ImGearEvaluationMode.Watermark;

        // Initialize common formats.
        ImageGear.Formats.ImGearCommonFormats.Initialize();

        // Add support for PDF files and initialize the PDF engine.
        ImGearFileFormats.Filters.Insert(0, ImGearPDF.CreatePDFFormat());
        ImGearPDF.Initialize();

        // Load the first page from the input file.
        ImGearPage page = null;
        using (Stream stream = new FileStream(inputFilePathName, FileMode.Open, FileAccess.Read))
            page = ImGearFileFormats.LoadPage(stream, FIRST_PAGE);

        // Save the page as a PDF document to the output file.
        using (Stream stream = new FileStream(outputFilePathName, FileMode.Create, FileAccess.Write))
            ImGearFileFormats.SavePage(page, stream, FIRST_PAGE, ImGearSavingModes.OVERWRITE, ImGearSavingFormats.PDF);
    }
    catch (Exception exp)
    {
        // Write the error to the Console window.
        Console.WriteLine(exp.Message);
    }
    finally
    {
        // Terminate the PDF engine whether or not the conversion succeeded.
        ImGearPDF.Terminate();
    }
}

 

Add Conversion Flexibility to Your Application with ImageGear

Accusoft’s ImageGear provides applications with comprehensive conversion, annotation, and viewing support for PDF files. As part of the broader ImageGear collection, it also delivers powerful image processing capabilities and support for multiple document and image file types. These features can help turn any application into a robust document management platform capable of streamlining workflows and empowering collaboration.

If you’re ready to see how the SDK will function as part of your development environment, start your free trial and get straight to the code.

The healthcare industry has undergone a profound change in the 21st century. A combination of technological advancements and regulatory pressures has encouraged providers to adopt new software platforms and update their existing IT stack. Gone are the days of physical file archives and cramped server rooms; today’s healthcare organizations are instead embracing innovative Internet of Things (IoT) devices, cloud-based file systems, and colocated server deployments that enhance their service capabilities and efficiency.

Unfortunately, not every provider is implementing new technology at the same pace. As science fiction author William Gibson famously observed, “The future is already here. It’s just not evenly distributed yet.” Today’s healthcare organizations must navigate a complex landscape of software solutions and overcome compatibility challenges in order to provide the better service and care that patients deserve.

The Drive for Interoperability

One of the key components of the 2010 Affordable Care Act was the push to promote interoperability among healthcare providers. The logic was fairly simple: for a healthcare marketplace to work effectively, patient information needs to be able to move freely between providers. That meant the myriad healthcare technology platforms being adopted by different organizations needed to be able to communicate with one another and share a common set of file formats.

The combined pressures of digital transformation and interoperability have led most hospitals and specialized health providers to implement picture archiving and communication systems (PACS). These digital archives and file management platforms allow providers to easily store, retrieve, distribute, and present a variety of medical images, such as CT, MRI, and DR scans. They have largely replaced the expensive and complex manual filing systems used to store physical film and provided a far more secure means of protecting patient data.

Healthcare Image Processing

One of the advantages of shifting to digital scan formats is the ability to compress images while retaining the ability to decompress them back to their original state. Poorly optimized compression tools can deteriorate the integrity of a high-resolution image, potentially obscuring key diagnostic indicators. In order to overcome these challenges, healthcare systems need image processing features capable of supporting rapid data compression, lossless transmission, and image cleanup.
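The defining property of lossless compression is that a round trip returns exactly the original bytes. The sketch below shows that property using .NET’s built-in DeflateStream as a stand-in. This is an assumption made for illustration: medical imaging tools typically use image-specific lossless codecs rather than Deflate, and the class LosslessRoundTrip is a hypothetical name.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical illustration; Deflate stands in for an imaging codec.
public static class LosslessRoundTrip
{
    // Compresses a byte buffer with Deflate.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen keeps 'output' readable after the compressor is flushed and closed.
            using (var deflate = new DeflateStream(output, CompressionLevel.Optimal, leaveOpen: true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Restores the original bytes from a Deflate-compressed buffer.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Calling Decompress(Compress(pixels)) yields a buffer that is byte-for-byte identical to pixels, which is the guarantee a lossy codec cannot make.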

Software developers working on PACS platforms and medical applications can turn to image processing SDKs like PICTools Medical to incorporate extensive compression and decompression capabilities into their solutions. These SDK tools can help overcome a variety of diagnostic imaging challenges, ensuring that complex medical files can be processed without any degradation of quality for easy viewing and management across multiple PACS platforms.

The Role of EHR Systems

Part of the push for interoperability included the adoption of electronic health records (EHR) systems, which digitized patient files to make them easier to share between healthcare providers. One of the challenges that came along with this adoption, however, was the handling of high-resolution medical images. While most healthcare providers have implemented some form of an EHR system, many of them do not have a PACS solution, especially if they don’t do any kind of medical scanning on-site. That means their ability to view certain types of medical images is quite limited. 

In theory, the medical industry has already solved this challenge with the development of the DICOM standard. Short for “digital imaging and communications in medicine,” DICOM was originally developed in a joint venture between the American College of Radiology (ACR) and National Electrical Manufacturers Association (NEMA) to ensure that healthcare providers would be able to view medical images no matter which vendor’s modality originally created them.

Unfortunately, the size and complexity of DICOM files often make them difficult for providers to manage. For instance, most EHR systems can transmit DICOM files (through a DICOM out or DICOM send functionality), but they often cannot view or annotate them. That’s because Windows doesn’t recognize DICOM files as image files. More importantly, large DICOM files often exceed the digital transfer limits of common communication channels like email. That leads to DICOM images being transferred on physical mediums, like discs or flash drives, that include viewer software.

Unlocking the Potential of DICOM 

Healthcare technology developers can help expand EHR functionality and realize the potential of DICOM by building viewing, conversion, and compression capabilities into their applications. Medical imaging SDKs like ImageGear Medical can not only convert DICOM files into a variety of easily viewable formats, but also perform essential cleanup functions to ensure that images maintain the highest integrity possible. High-level APIs can abstract or redact the details of a DICOM file to ensure the anonymity of the patient data as well as to compress it without degrading the image, making it easy to transfer files over secure channels rather than resorting to physical mediums or non-compliant public cloud platforms.

The ability to convert DICOM files into more easily managed formats also helps providers to share more information with patients. Diagnostic scans, for instance, can be quickly opened on IoT devices like a tablet and viewed entirely within the local application without having to use special equipment. Images can even be transferred directly to patients, allowing them to conveniently view them on their own devices. And thanks to lossless compression, medical offices can transmit the source DICOM files to other organizations when referring a patient to an outside provider.

Accusoft Medical Imaging Toolkits

With more than two decades of experience working with the imaging needs of the healthcare industry, Accusoft offers a variety of medical imaging toolkits to help software developers enhance their healthcare applications. Whether you’re developing a standalone imaging solution or adding viewing, compression, and cleanup features to your EHR system, our collection of SDKs and APIs can provide core medical image functionality so you can focus on building a better user experience and get to market faster. Learn more about how our medical imaging toolkits are improving outcomes in the healthcare industry and accelerating digital transformation trends.

FinTech COVID stimulus

When President Joe Biden signed the $1.9 trillion American Rescue Plan Act relief package into law on March 11, 2021, millions of Americans looked forward to receiving a much-needed $1,400 stimulus check from the government. Although many people would receive paper checks directly from the Internal Revenue Service (IRS), anyone who had previously filed their taxes electronically and had refunds delivered to their bank accounts was eligible to receive their stimulus relief via direct deposit. The IRS set the date of March 17 for the delivery of stimulus funds, which would give sufficient time for payments to make their way through the complex Automated Clearing House (ACH) system used to transfer payments electronically.

FinTech Lenders to the Rescue

But on March 12, just one day after the landmark bill was signed into law, many FinTech banking customers received notifications that funds had already been delivered to their accounts. The digital banking startup Current bragged on Twitter that afternoon that it had already distributed $600 million to 250,000 customers. On March 15, the FinTech lender Chime announced that it had paid about $3.5 billion to more than one million customers over the weekend. Chime had made headlines the previous spring when it advanced stimulus funds from the CARES Act to customers before the government actually made the money available.

Unsurprisingly, the announcements caused quite an uproar from customers at traditional banks that did not start releasing funds until the previously announced March 17 date. Despite many of the accusations leveled at these lenders, however, the discrepancy had nothing to do with banks deliberately withholding funds and everything to do with the unique business model of leading FinTech lenders.

In the case of Chime, for instance, the company frequently makes payment funds available to customers as soon as the transfer is initiated, rather than waiting for it to clear through the ACH. “I guess you could argue we’re taking a risk,” said Chime co-founder and CEO Chris Britt. “But we’ve been told by the Federal Reserve that the money is coming so we don’t think it’s that much of a risk.” 

Traditional banks were quick to respond by saying that they could not make funds available before March 17 because that was the date set by the government for the money to actually be transferred. For FinTech companies with higher risk tolerance, the delay provided a unique opportunity to demonstrate the benefits of digital lending applications. During the first wave of stimulus checks in April of 2020, mobile banking app registrations increased by 200% over the previous month as Americans rushed to embrace various forms of digital banking.

The Flexible Features of FinTech Applications

Part of the reason why FinTech lenders are willing to offer more generous services to customers is that they often assess risk differently than traditional banks. Armed with sophisticated algorithms and data capture tools, FinTech applications are able to gather more information about customers and lending sources to create a more accurate risk profile.

Over the last two decades, FinTech developers have worked hard to build the digital platforms that innovative firms are using to offer these services. These software solutions need to be flexible enough to process information quickly and provide essential functionality that helps both FinTech firms and their customers to view and share information quickly and easily.

Forms Processing

Structured forms are an essential tool of the financial services industry, whether it’s a loan application or an IRS tax form. The faster those forms can be processed, the more quickly firms can deliver money into the hands of their customers. That’s why FinTech developers need to make sure they’re incorporating the forms processing tools that make it easy to automate data capture. Given that the latest round of COVID stimulus funds is based upon tax return information, many customers will be scrambling to update their records as quickly as possible. By integrating the tools to process that data with haste, FinTech developers can help firms keep pace with the needs of their clients.

Easy Viewing

While FinTech developers are primarily building applications for lenders, they should always keep in mind that a solution that doesn’t provide a positive customer experience will have trouble catching on in a crowded marketplace. Today’s banking customers expect transparent and intuitive applications that allow them to quickly view their financial records and check the status of applications or loans. By building HTML5 viewing capabilities into their FinTech solutions, developers can help customers track the status and history of their finances, which is certainly a major concern as they monitor the status of their stimulus payments.

Interactive Tools

With all of the nuances surrounding COVID stimulus payments in the latest round of legislation, many customers will be turning to their FinTech lender to understand how much money they can expect to receive based on their eligibility. A well-designed spreadsheet may be able to provide this or similar information much more quickly than building a dedicated tool within an application, but downloading XLSX files can be a hassle for many people, especially for customers who primarily interact with their FinTech bank using a mobile device. By giving firms the ability to securely embed spreadsheets into their applications, developers can help them to quickly share tools and resources with customers, regardless of what kind of device they’re using.

Empowering the FinTech Future with Accusoft

Accusoft’s collection of SDK and API integrations allows FinTech developers to build a broad range of features into their applications to streamline processing and accelerate vital financial services.

Our FormSuite forms SDK collection can automate form identification and OCR data capture to help FinTech applications maintain their speed advantage when it comes to processing applications and loans. For financial platforms that need comprehensive viewing functionality, PrizmDoc Viewer’s HTML5 viewing, annotation, and redaction capabilities can turn any platform into a powerful document viewer that helps users handle most of their financial business purely through their FinTech application. 

And when it comes to embedding interactive spreadsheets to provide quick reference and calculations for various services, PrizmDoc Cells allows developers to bypass the difficult work of building that functionality from the ground up. To learn more about how Accusoft integrations are powering the next generation of FinTech applications, visit our financial services page and download our FinTech integrations fact sheet.

OCR segmentation

Today’s high-speed forms processing workflows depend on accurate character recognition to capture data from document images. Rather than manually reviewing forms and entering data by hand, optical character recognition (OCR) and intelligent character recognition (ICR) allow developers to automate the data capture process while also cutting down on human error. Thanks to OCR segmentation, these tools are able to read a wide range of character types to keep forms workflows moving efficiently.

Recognizing Fonts

Deploying OCR to capture data is a complex undertaking due to the immense diversity of fonts in use. Modern character recognition software focuses on identifying the pixel patterns associated with specific characters rather than matching characters to existing libraries. This gives it the flexibility needed to discern multiple font types, but problems can still arise due to spacing issues that make it difficult to tell where one character ends and another begins.

Fonts generally come in one of two forms that impact how much space each character occupies. “Fixed” or “monospaced” fonts are uniformly spaced so that every character takes up the exact same amount of space on the page. While not quite as popular now in the era of word processing software and digital printing, fixed fonts were once the standard form of typeface due to the technical limitations of printing presses and typewriters. On a traditional typewriter, for example, characters were evenly spaced because each typebar (or striker) was a standardized size.

From an OCR standpoint, fixed fonts are easier to read because they can be neatly segmented. Each segmented character is the same size, no matter what letters, numbers, or symbols are used. In the example below, the amount of space occupied by the characters is determined by the number of characters used, not the shape of the characters themselves. This makes it easy to break the text down into a segmented grid for accurate recognition.

OCR segmentation: Monospace Font Example
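As a minimal sketch of the grid idea described above, the snippet below divides a line of known pixel width into equal cells, one per character. The function name `segment_fixed_pitch` and its parameters are hypothetical and chosen for illustration only; this is not the SmartZone API, and it assumes the line has already been cropped to the text and that the character count is known.

```python
def segment_fixed_pitch(line_width: int, char_count: int) -> list[tuple[int, int]]:
    """Return half-open (left, right) pixel bounds for each cell of a monospaced line."""
    if char_count <= 0:
        raise ValueError("char_count must be positive")
    cell = line_width / char_count  # every character gets the same width
    return [(round(i * cell), round((i + 1) * cell)) for i in range(char_count)]


# A 120-pixel-wide line holding 10 monospaced characters yields 12-pixel cells.
cells = segment_fixed_pitch(line_width=120, char_count=10)
```

Because the cell width depends only on the character count, the same grid works whether the line contains ten `i` characters or ten `w` characters.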

“Proportional” fonts, however, are not uniformly spaced. The amount of space taken up by each character is determined by the shape of the character itself. So while a w takes up the same space as an i in a fixed font, it takes up much more space in a proportional font.

OCR segmentation: Fixed versus proportional font

The inherent characteristics of proportional fonts make them more difficult to segment cleanly. Since each character occupies a variable amount of space, each segmentation box needs to be a different width. In the example below, applying a standardized segmentation grid to the text would fail to cleanly separate individual characters, even though both lines feature the exact same character count.

Proportional Font Example
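One common way to find variable-width boxes is a vertical projection profile: scan the columns of a binarized text line and cut wherever a column contains no ink. The sketch below illustrates that general technique on a toy image made of `#` (ink) and `.` (paper) characters. It is an assumption about how such segmentation can be done, not a description of SmartZone's internal method, and the name `segment_by_projection` is hypothetical.

```python
def segment_by_projection(image: list[str], ink: str = "#") -> list[tuple[int, int]]:
    """Split a binary text-line image at ink-free columns.

    Returns half-open (start, end) column ranges, one per detected glyph.
    """
    width = len(image[0])
    column_has_ink = [any(row[x] == ink for row in image) for x in range(width)]
    spans = []
    start = None
    for x, has_ink in enumerate(column_has_ink):
        if has_ink and start is None:
            start = x                    # a glyph begins at this column
        elif not has_ink and start is not None:
            spans.append((start, x))     # a blank column closes the glyph
            start = None
    if start is not None:
        spans.append((start, width))     # glyph runs to the right edge
    return spans


# A wide "w"-like glyph followed by a narrow "i"-like glyph.
line = [
    "#...#.#",
    "#.#.#..",
    "#.#.#.#",
    ".#.#..#",
]
spans = segment_by_projection(line)
widths = [end - start for start, end in spans]
```

Here the two boxes come out five columns and one column wide, which a fixed grid could not represent.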

Yet another font challenge comes from “kerning,” which reduces the space between certain characters to allow them to overlap. Frequently used in printing, kerning makes for an aesthetically pleasing font, but it can create serious headaches for OCR data capture because many characters don’t separate cleanly. In the example below, small portions of the W and the A overlap, which could create confusion for an OCR engine as it analyzes pixel data. While the overlap is very slight in this example, many fonts feature far more extreme kerning.

Example of Kerning
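To illustrate why kerning defeats column-based cuts, the sketch below builds a toy image of two parallel diagonal strokes (standing in for the right arm of a W and the left leg of an A) that share a column without touching. Every column contains ink, so a blank-column search finds no gap, while grouping pixels into 8-connected components still separates the two strokes. This is a generic technique shown under stated assumptions, not SmartZone's implementation; `connected_components` and `column_bounds` are hypothetical helper names.

```python
def connected_components(image: list[str], ink: str = "#") -> list[list[tuple[int, int]]]:
    """Group ink pixels into 8-connected blobs using an iterative flood fill."""
    height, width = len(image), len(image[0])
    seen = set()
    blobs = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != ink or (r, c) in seen:
                continue
            seen.add((r, c))
            stack, blob = [(r, c)], []
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == ink and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
            blobs.append(blob)
    return blobs


def column_bounds(blob: list[tuple[int, int]]) -> tuple[int, int]:
    """Half-open (left, right) column range covered by a blob."""
    cols = [x for _, x in blob]
    return (min(cols), max(cols) + 1)


# Two kerned strokes: both occupy column 3, but they never touch.
kerned = [
    "...#..#",
    "..#..#.",
    ".#..#..",
    "#..#...",
]
every_column_inked = all(any(row[x] == "#" for row in kerned) for x in range(7))
bounds = sorted(column_bounds(b) for b in connected_components(kerned))
```

The resulting bounding boxes overlap horizontally, which is exactly the situation a rectangular grid or a blank-column cut cannot express.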

In order to get a clean reading of printed text for more accurate recognition results, OCR engines like the one built into Accusoft’s SmartZone SDK utilize segmentation to take an image and split it into several smaller images before applying recognition. This allows the engine to isolate characters from one another to get a clean reading without any stray pixels that could impact recognition results.

Much of this process is handled automatically by the software. SmartZone, for instance, has OCR segmentation settings and properties that are set internally based on the image at hand. In some cases, however, those controls may need to be adjusted manually to ensure the highest level of accuracy. If a specific font routinely returns failed or low-confidence recognition results, it may be necessary to use the OCR segmentation properties to adjust for font characteristics like spaces, overlaps (kerning), and blob size (which determines which groups of pixels are classified as noise).
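The role of a blob-size threshold can be sketched in a few lines: group ink pixels into connected blobs and erase any blob smaller than a minimum pixel count, treating it as noise. The function `remove_small_blobs` and its `min_pixels` parameter are hypothetical names used for illustration; they are not SmartZone properties, and the actual engine may define blob size differently.

```python
def remove_small_blobs(image: list[str], min_pixels: int,
                       ink: str = "#", paper: str = ".") -> list[str]:
    """Erase 8-connected ink blobs that contain fewer than min_pixels pixels."""
    grid = [list(row) for row in image]
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if grid[r][c] != ink or seen[r][c]:
                continue
            seen[r][c] = True
            stack, blob = [(r, c)], []
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < height and 0 <= nx < width
                                and grid[ny][nx] == ink and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            if len(blob) < min_pixels:
                for y, x in blob:
                    grid[y][x] = paper   # blob is too small: treat it as noise
    return ["".join(row) for row in grid]


# One four-pixel stroke plus two isolated specks.
noisy = [
    "##....#",
    "##.....",
    "....#..",
]
cleaned = remove_small_blobs(noisy, min_pixels=2)
```

Raising the threshold removes more specks but risks deleting small legitimate marks such as periods or the dot of an `i`, which is why this kind of setting may need tuning per font.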

Applying ICR Segmentation

All of the challenges associated with cleanly segmenting printed text are magnified when it comes to hand-printed text. Characters are rarely spaced or even shaped consistently, especially when they’re drawn without the guidance of comb lines that provide clear separation for the person completing a form.

Since ICR engines read characters as individual glyphs, they can become confused if overlapping characters are interpreted as a single glyph. In the example below, there is a slight overlap between the A and the c, while the cross elements of the f and t are merged to form the impression of a single character.

ICR Segmentation Properties

SmartZone’s ICR segmentation properties can be used to pull apart overlapping characters and split merged characters for more accurate recognition results. This is also important for maintaining a consistent character count. If the ICR engine isn’t accounting for overlapped and merged characters, it could return fewer character results than are actually present in the image.
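One simple heuristic for splitting merged characters is to cut a too-wide blob at the interior column with the least ink, since a thin bridge such as a shared crossbar usually contributes only a pixel or two per column. The sketch below shows that heuristic on a toy "ft" pair joined by one crossbar. It is an illustrative assumption only, not how SmartZone's ICR properties work, and `find_split_column` is a hypothetical name.

```python
def find_split_column(image: list[str], ink: str = "#") -> int:
    """Return the column in the middle half of the blob that holds the least ink.

    Ties resolve to the leftmost candidate column.
    """
    width = len(image[0])
    counts = [sum(row[x] == ink for row in image) for x in range(width)]
    # Search only the middle half so neither piece ends up as a thin sliver.
    candidates = range(width // 4, width - width // 4)
    return min(candidates, key=lambda x: counts[x])


# An "f" and a "t" whose cross strokes have merged into one bar.
merged = [
    ".##...#..",
    ".#....#..",
    "#########",
    ".#....#..",
    ".#....##.",
]
cut = find_split_column(merged)
left = [row[:cut] for row in merged]    # columns before the cut
right = [row[cut:] for row in merged]   # the cut column and everything after
```

Splitting the merged blob into two pieces restores the expected character count of two instead of one.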

Enhance Your Data Forms Capture with SmartZone

Accusoft’s SmartZone SDK supports both zonal and full-page OCR/ICR for forms processing workflows to quickly and accurately capture information from document images. When incorporated into a forms workflow and integrated with identification and alignment tools like the ones found in FormSuite, users can streamline data capture and processing by extracting text and routing it to the appropriate databases or application tools. SmartZone’s OCR supports 77 distinct languages from around the world, including a variety of Asian and Cyrillic characters. For a hands-on look at how SmartZone can enhance your data capture workflow, download a free trial today.