Technical FAQs

Question

Is it possible to automatically annotate a document, similar to the Auto-Redaction feature, using PrizmDoc?

Answer

Auto-annotation isn’t available out of the box, but it can be done with some work. This would involve creating a searchTask and using the information from it to programmatically create XML markup that can be used in the MarkupBurner.

To do this you would need to create a searchTask for the pattern you would like to annotate. You can then get the results of the searchTask as JSON which will contain all occurrences of that pattern/search. Each search result will include the selected text, the page on which it occurs, the starting index of the result, and the dimensions and coordinates of the bounding rectangles for that search result.

All this information can be used to construct the markup XML to add the annotations with the markup burner.

Once you have constructed the XML you would post to the MarkupBurner with the XML as the body to burn the document.
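
The middle step, turning search results into markup, could be sketched roughly as follows. This is only an illustration: the result field names (`pageIndex`, `text`, `rectangles`) and the XML element and attribute names are placeholders, not the actual PrizmDoc search result or markup schemas, so check the product documentation for the real layouts. The HTTP calls that create the searchTask and post to the MarkupBurner are left out.

```python
import xml.etree.ElementTree as ET


def build_markup_xml(search_results):
    """Build a placeholder markup document with one highlight per rectangle.

    NOTE: the element/attribute names and the input field names are
    hypothetical stand-ins, not the real PrizmDoc schemas.
    """
    root = ET.Element("markup")  # hypothetical root element
    for result in search_results:
        # A single search hit can span several lines, so it may carry
        # more than one bounding rectangle.
        for rect in result["rectangles"]:
            ET.SubElement(
                root,
                "highlight",  # hypothetical annotation element
                {
                    "page": str(result["pageIndex"]),
                    "x": str(rect["x"]),
                    "y": str(rect["y"]),
                    "width": str(rect["width"]),
                    "height": str(rect["height"]),
                    "text": result["text"],
                },
            )
    return ET.tostring(root, encoding="unicode")


# Made-up search results in the assumed shape.
sample_results = [
    {
        "pageIndex": 0,
        "text": "123-45-6789",
        "rectangles": [{"x": 72, "y": 100, "width": 90, "height": 12}],
    },
    {
        "pageIndex": 2,
        "text": "987-65-4321",
        "rectangles": [
            {"x": 400, "y": 300, "width": 40, "height": 12},
            {"x": 72, "y": 314, "width": 50, "height": 12},
        ],
    },
]

markup_xml = build_markup_xml(sample_results)
```

The resulting string would then be the body of the request to the MarkupBurner, once it follows the real markup schema.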

Question

When should I apply image cleanup operations on my document images?

Answer

There are a number of cleanup operations that you can use to make an image more suitable for a particular application. What matters most is what you observe on the image and how it affects your project. For example, if you’re noticing many random specks on your image, and you’re planning to use OCR, then you may want to try a despeckle or blob removal operation first. If the content in your image looks a bit slanted, you could try a deskew or rotate operation. In some cases, using a line removal operation on forms that have grid fields could also be helpful. The amount of image cleaning you need can vary from project to project. There’s no one-shot cleaning operation that will always work for all images. Instead, observe the nature of the noise and interference in your images to determine which general parameters appear to provide the best results.
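
To make the idea of blob removal concrete, here is a minimal sketch in plain Python. It is not the product's cleanup API: it assumes a binary image stored as a list of rows (1 = ink, 0 = background) and erases every 4-connected blob at or below a size limit. The function name and the `max_blob_size` parameter are illustrative.

```python
from collections import deque


def remove_small_blobs(image, max_blob_size):
    """Return a copy of a binary image with small 4-connected blobs erased.

    image: list of rows, where 1 is ink and 0 is background (assumed format).
    max_blob_size: blobs with this many pixels or fewer are removed.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects every pixel of this blob.
            blob = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Specks are small; real content (text strokes) is larger.
            if len(blob) <= max_blob_size:
                for by, bx in blob:
                    cleaned[by][bx] = 0
    return cleaned


noisy = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
cleaned = remove_small_blobs(noisy, max_blob_size=1)
```

The size limit is the parameter to tune against your own images: too low leaves specks behind, too high starts erasing punctuation and thin strokes.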

By default, the Cells Server will use internal storage, which is only suitable for evaluation and development. When deploying to a production environment, best results come from configuring the server to use enterprise-level storage (S3) rather than the filesystem.

Question

What are the best quality images to use when processing form data and recognition?

Answer

In all cases, you’ll want to have your images as clear and as clean as possible. For any particular procedure, please consider the following:

OCR and ICR: Capture images in at least 300 DPI resolution. Ideally, working in black and white will allow the objects of interest on your image to be better defined and recognized. Free the image from all noise as much as possible. As if a human were reading it, you’ll want the text objects on the image to be as legible as possible. For ICR, make sure that the characters are printed (no cursive text, etc.).

Barcode recognition: As with OCR and ICR, capture images in at least 300 DPI; working with black and white content can provide excellent results. You’ll also want to make sure that the bars in the barcodes are clearly defined on the image and are not malformed (for example, the barcodes should have the proper start and stop sequence, etc.). As always, clear as much noise from the image as possible.

Forms matching and registration: As with the two items above, capture your documents in at least 300 DPI. Make sure that your resolution is consistent between your form templates and incoming batch images as well. Form templates should only contain data that is common to every image that is being processed (i.e., form fields, the text that appears on the blank form itself, etc.). The template should not have filled-in field information as this will affect the forms matching process.

COVID-19 insurtech

From large payouts and losses in some segments to rapid growth in others, the insurance industry has experienced seismic shifts due to the COVID-19 global pandemic. To keep some semblance of normalcy during these changes and the aftermath, organizations are turning to InsurTech solutions for help. 

According to Deloitte, InsurTech investments remain strong, with COVID-19 simply shifting priorities to virtual customer engagement and operational efficiency rather than cutting budgets. Data collected by Venture Scanner indicates that the global InsurTech market generated $2.2B in the first half of 2020.


The Challenge of Advancing a Product to Meet Immediate Needs

Tasks once completed manually at insurance companies can bottleneck an entire system in just a few days and prevent insurers from winning much-needed revenue. For this reason, providers are scrambling to make fast efficiency gains while minimizing risks that could lead to unrealized business opportunities due to slow processing. When it’s feast or famine, with customers either signing up or making claims in droves, there’s no time to waste.

As a product developer in the InsurTech space, you’re in a precarious position. After all, how can you add functionality overnight when it takes time to build those new capabilities? While some organizations may have the available workforce to rally and build new features quickly, most don’t.

If you’re like most in the development space, finding and retaining talent is a challenge. What’s more, your team is likely already looking at a project backlog spanning many months, if not years. For this reason, augmenting existing solutions with white-label, third-party plug-ins is an attractive option. Now, let’s turn our attention to the type of functionality insurers need to navigate recent shifts.


4 Essential Capabilities for the Insurance Industry in the Wake of COVID-19

Pew Research found that by June of 2020 roughly 3% of Americans had already moved out of highly populated areas like New York, New York and San Francisco, California due to challenges posed by the COVID-19 global pandemic. This number has likely grown since June and will likely continue to grow as hubs of economic growth continue to shift and settle.

For each insured individual that moves and retains insurance coverage, there’s paperwork. Many will even switch providers, as their previous provider may not be able to offer competitive rates in their new location. The sheer change management involved in migrations of this scale is daunting. Without the ability to process requests faster, insurance companies could find themselves struggling to keep up.

To help your insurance industry clients effectively navigate the road ahead, your applications need to include greater data-capture, data-conversion, and optical character recognition technologies that reduce the need for manual intervention in document processing. 

1. Data Capture Efficiency  

As the number of file formats increases, insurance organizations need the ability to quickly capture and process hundreds of different image formats. Beyond simply capturing them, they often also need to aggregate and convert those multiple formats into a single, secure, and digitally accessible PDF.

Rather than trying to build everything from scratch, partnering with a third-party software developer can shorten the delivery time associated with expanding feature sets for the insurance industry.

Essential Capabilities Should Include:

  • Support for multiple file formats
  • Automated image-correction and optical character recognition technology
  • Clean integration that maintains or improves processing speed 

Once data is captured, it then needs to be managed.

2. Identify Form Fields

Whether potential buyers are requesting new policies or current customers are evaluating existing policies, precise and efficient data-capture technologies can improve the ability of insurers to access important data and analyze policies. Adding these capabilities requires quite a bit of strategy. First, one must consider the core challenges involved in effective data capture: 

  • Poor inputs that aren’t easy to correct and capture 
  • Poorly designed forms that reduce image recognition success  
  • Imaging technology that can’t recognize a robust number of file formats and fonts 

When contemplating the structure of boxes for character collection, our experts found that using a square shape rather than a rectangle results in less data loss. While rectangles may, at first, appear to save space and therefore be a more effective option, research showed that they typically don’t provide the average user enough space to clearly write letters or characters without touching the boundary lines. Thus, square boxes improve data transfer success.

Figure 1: Examples of ineffective rectangular boxes versus effective square boxes for character capture. 

This is just one factor to consider when streamlining form processing within an insurance technology application. To explore more research on this topic, download the Best Practices: Improving ICR Accuracy with Better Form Design whitepaper.  

3. Confidence Value Reporting for Data Recognition

Not all optical character recognition technology is created equal. That’s why it’s important to make sure any solution you either create internally or partner with a third party to integrate provides ongoing confidence value reporting for data recognition. Having this capability in place can alert you to problems before they lead to costly issues — like duplicated efforts, a poor customer experience, or incomplete data hindering contract processing. 
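
As a rough illustration of how reported confidence values can be put to use, the sketch below splits recognized fields into those accepted automatically and those sent for manual review. The field structure (`name`, `value`, `confidence`) and the 0.9 threshold are assumptions made for this example, not the output format of any particular OCR engine.

```python
def split_by_confidence(fields, threshold=0.9):
    """Separate recognized fields into accepted and needs-review lists.

    fields: list of dicts with a 'confidence' value between 0 and 1
            (hypothetical structure for illustration).
    threshold: minimum confidence to accept a value without review.
    """
    accepted = []
    needs_review = []
    for field in fields:
        if field["confidence"] >= threshold:
            accepted.append(field)
        else:
            needs_review.append(field)
    return accepted, needs_review


# Made-up recognition results for a single form.
recognized = [
    {"name": "policy_number", "value": "PN-20417", "confidence": 0.98},
    {"name": "last_name", "value": "Srnith", "confidence": 0.61},
    {"name": "zip_code", "value": "33607", "confidence": 0.93},
]

accepted, needs_review = split_by_confidence(recognized)
```

Tracking how many fields land in the review list over time is one simple way to notice a recognition problem before it turns into duplicated effort downstream.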

4. Use OCR to Identify Different Documents

Optical character recognition (OCR) can help insurance companies cut down on manual effort by identifying different forms automatically, which equips application developers like you to create automation within your company’s product that routes identified forms through predefined workflows. 

Without OCR, significant manual effort is required to process forms required to execute insurance contracts. When evaluating OCR capabilities to add to applications, keep in mind these essentials:

  • Successful Character Recognition Rates – Given the highly regulated nature of insurance along with high fines for shortcomings, it’s often well worth the extra investment to get a solution with 99% accuracy versus 95%.
  • Multi-Document Recognition with High Confidence Values – Given the broad range of file types insurance organizations receive, having a software package in place that cleans up documents before running them through optical character recognition tools improves the likelihood of extracted data being usable. With cleaner data in hand, insurance agents are empowered to make better recommendations to customers, ensuring they’re not over- or underinsured.
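
One simple way to picture form identification feeding a workflow is keyword matching on the OCR text, sketched below. The form types, keywords, and workflow names are invented for illustration; a production system would more likely rely on the forms-matching features of its OCR or forms-processing SDK.

```python
# Hypothetical keywords that distinguish each form type.
FORM_KEYWORDS = {
    "claim": ("claim number", "date of loss"),
    "new_policy": ("application for insurance", "coverage requested"),
}

# Hypothetical workflow queue for each identified form type.
WORKFLOWS = {
    "claim": "claims-intake",
    "new_policy": "underwriting",
    "unknown": "manual-sorting",
}


def identify_form(ocr_text):
    """Return the form type whose keywords appear most often in the text."""
    text = ocr_text.lower()
    best_type, best_hits = "unknown", 0
    for form_type, keywords in FORM_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in text)
        if hits > best_hits:
            best_type, best_hits = form_type, hits
    return best_type


def route(ocr_text):
    """Map OCR text to the workflow queue for its identified form type."""
    return WORKFLOWS[identify_form(ocr_text)]


queue_name = route("CLAIM NUMBER: 5531   Date of Loss: 03/14/2020")
```

Anything that matches no keywords falls through to manual sorting, which keeps unrecognized documents from being routed incorrectly.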

These are just a few items to consider when adding document viewing and forms processing features to your application. While automated workflows may have given organizations heartburn in the past, the reality is that high-volume, fast-changing environments can’t survive without them. Markets are changing so quickly that without automation to help bring order to the chaos, the tidal wave of requests will overtake the underprepared. 

Help your clients not only respond better to COVID-19 but also future-proof their ability to streamline claims by expanding document viewing and form processing capabilities. To learn more about our insurtech capabilities, explore our content solutions for insurance companies.

Question

What quality should my images be for processing form data and recognition using FormSuite?

Answer

In all cases, you want to have your images as clear and as clean as possible. For any particular procedure, please consider the following:

OCR and ICR: Capture images in at least 300 DPI resolution. Ideally, working in black and white allows the objects of interest on your image to be better defined and recognized. Free the image from all noise as much as possible. As if a human were reading it, you want the text objects on the image to be as legible as possible. For ICR, ensure that the characters are printed (no cursive text, etc.).

Barcode recognition: As with OCR and ICR, capture images in at least 300 DPI; working with black and white content can provide excellent results. Ensure that the bars in the barcodes are clearly defined on the image and are not malformed (for example, the barcodes should have the proper start and stop sequence, etc.). Clear as much noise from the image as possible.

Forms matching and registration: As with the two items above, capture your documents in at least 300 DPI. Ensure that your resolution is consistent between your form templates and incoming batch images. Form templates should only contain data that is common to every image that is being processed (i.e., form fields, the text that appears on the blank form itself, etc.). The template should not have filled-in field information as this will affect the forms matching process.
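
As a quick arithmetic check on the 300 DPI guidance, the effective resolution of a scan is its pixel size divided by the physical page size in inches. The helper names below are illustrative, and the page dimensions are something you supply yourself (US Letter, 8.5 x 11 inches, is assumed in the example).

```python
def effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in):
    """Return the lower of the horizontal and vertical dots per inch."""
    horizontal = pixel_width / page_width_in
    vertical = pixel_height / page_height_in
    return min(horizontal, vertical)


def meets_minimum_dpi(pixel_width, pixel_height,
                      page_width_in, page_height_in, minimum_dpi=300):
    """Check a scan against a minimum resolution (300 DPI by default)."""
    dpi = effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in)
    return dpi >= minimum_dpi


# US Letter at 300 DPI is 8.5 * 300 = 2550 by 11 * 300 = 3300 pixels.
letter_ok = meets_minimum_dpi(2550, 3300, 8.5, 11)

# The same page scanned at 1700 x 2200 pixels is only 200 DPI.
letter_low = meets_minimum_dpi(1700, 2200, 8.5, 11)
```

The same calculation can be run on a form template and on incoming batch images to confirm that their resolutions match.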

Question

I have a PDF of a form that I’m sending to PrizmDoc to have it auto-detect, but PrizmDoc does not find any fields in the document. What would cause this?

Answer

Currently, only PDF files with embedded AcroForms will be auto-detected. If the PDF document has an embedded image of a form, PrizmDoc will not find any results from auto-detection.

Question

We just installed the PrizmDoc client and noticed that when we run Prizm Application Services, the service states that it is started, but it is not listening on 3000, and there are no logs written to the Prizm\logs\pas folder. What might the issue be?

Answer

One possible cause of this issue is the Windows system environment variable PATHEXT. By default, it is set to .COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC.

What this variable does is allow commands to be executed without needing to add the extension to them.

For example, PM2, which runs with PAS, has a file extension of .CMD, so when executing PM2 in a command line you just need to type PM2 instead of the full name PM2.CMD.

If .CMD is not in the PATHEXT environment variable, then just typing PM2 would return a command not found error, and you would need to use the full PM2.CMD for it to work.

To fix the issue, do the following:

  1. Open Control Panel > All Control Panel Items > System
  2. Select Advanced System Settings
  3. Select Environment Variables
  4. Under System Variables double-click PATHEXT
  5. Add ;.CMD to the end of the current string
  6. Restart Prizm Application Services
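
Before changing anything, you can confirm whether .CMD is actually missing. The short Python sketch below is one way to do that; the helper name `pathext_includes` is made up for this example, and the script only reads the variable without modifying it.

```python
import os


def pathext_includes(extension, pathext_value):
    """Return True if the extension appears in a PATHEXT-style string.

    PATHEXT is a semicolon-separated list; Windows treats the entries
    case-insensitively, so the comparison is done in upper case.
    """
    entries = [entry.strip().upper()
               for entry in pathext_value.split(";") if entry.strip()]
    return extension.upper() in entries


if __name__ == "__main__":
    # On non-Windows systems PATHEXT is normally unset, so this reports missing.
    current = os.environ.get("PATHEXT", "")
    if pathext_includes(".CMD", current):
        print(".CMD is present in PATHEXT")
    else:
        print(".CMD is missing from PATHEXT")
```

Note that a service picks up system environment variables when it starts, so check from a fresh session after making the change.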

Question

Why is touch input for PrizmDoc Viewer not working in Chrome version 70+?

Answer

PrizmDoc Viewer uses the Chrome Touch API to process touch input. This API was deprecated after Chrome version 70, and must be manually re-enabled in order for touch input to work in PrizmDoc.

For Chrome versions 70-77:

Paste the following link into Chrome to enable the Touch API:

chrome://flags/#touch-events

This issue will also occur in the Chromium version of Microsoft Edge.

Paste the following link into Edge to enable the Touch API:

edge://flags/#touch-events

For Chrome Versions 78+:

As of version 78 of Chrome, the touch-events flag has been removed from chrome://flags/ and edge://flags/.

To enable touch-events in versions of Chrome 78 and later, you must set a command-line flag at the end of the target path in your Chrome shortcut properties as demonstrated below:

"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --touch-events=enabled
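
If Chrome is started from a script rather than a shortcut (for example, on a kiosk), the same flag can be passed on the command line. The sketch below assumes the install path shown above; the function names are illustrative, and the launch helper is defined but not called here.

```python
import subprocess

# Install path taken from the shortcut example; adjust for your machine.
CHROME_PATH = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"


def chrome_touch_command(chrome_path=CHROME_PATH):
    """Build the argument list that starts Chrome with touch events enabled."""
    return [chrome_path, "--touch-events=enabled"]


def launch_chrome_with_touch(chrome_path=CHROME_PATH):
    """Start Chrome with the flag; returns the subprocess.Popen handle."""
    return subprocess.Popen(chrome_touch_command(chrome_path))


command = chrome_touch_command()
```

Passing the arguments as a list avoids quoting problems with the spaces in the install path.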

Question

I have installed PrizmDoc per the documentation on a clean CentOS 7/RedHat 7 system, and the Prizm services start and show as healthy. However, one of two issues is occurring:

  1. I cannot view HTML or picture files but can view PDF files.
  2. I cannot view PDF, Excel, or Word documents but can view HTML and Picture files.

Answer

If you cannot view HTML or picture files but can view PDF files, it is often due to specific required libraries not being installed. The following procedure can be executed on CentOS/RedHat 7 to ensure all required PrizmDoc libraries are installed.

  1. Stop the Prizm service: sudo /usr/share/prizm/scripts/pccis.sh stop

  2. Copy and paste all of the library installers into a terminal and wait for them to finish:

    yum install -y libbz2* libc* libcairo* libcups* libdbus-glib-1* libdl* libexpat* libfontconfig* libfreetype* libgcc_s* libgif* libGL* libjpeg* libm* libnsl* libopenjpeg* libpixman-1* libpng12* libpthread* librt* libstdc++* libthread_db* libungif* libuuid* libX11* libXau* libxcb* libXdmcp* libXext* libXi* libXinerama* libxml2* libXrender* libXtst* libz* linux-vdso*
    
  3. Restart the server.

If you cannot view PDF, Excel, or Word documents but can view HTML and Picture files, this is often due to installing the Generic PrizmDoc installer, which ends in either client_x86_64.tar.gz or server_x86_64.tar.gz. To resolve this issue you will need to re-install using the links that end in client_x86_64.rpm.tar.gz and server_RHEL7.tar.gz.
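
A quick way to tell which package was downloaded is to check the file name suffix. The sketch below uses only the suffixes mentioned above; the file name prefixes in the examples are made up.

```python
# Suffixes of the generic installers, which cause the issue described above.
GENERIC_SUFFIXES = ("client_x86_64.tar.gz", "server_x86_64.tar.gz")

# Suffixes of the installers to use on CentOS/RedHat 7.
RHEL_SUFFIXES = ("client_x86_64.rpm.tar.gz", "server_RHEL7.tar.gz")


def classify_installer(filename):
    """Classify an installer archive as 'rhel', 'generic', or 'unknown'."""
    if filename.endswith(RHEL_SUFFIXES):
        return "rhel"
    if filename.endswith(GENERIC_SUFFIXES):
        return "generic"
    return "unknown"


# Hypothetical file names; only the suffixes come from the answer above.
server_kind = classify_installer("prizmdoc_server_x86_64.tar.gz")
client_kind = classify_installer("prizmdoc_client_x86_64.rpm.tar.gz")
```

A result of "generic" on a CentOS/RedHat 7 machine means the package should be replaced with its RHEL-specific counterpart.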

Question

The logging for the ImageGear C & C++ Deployment Packaging Wizard (DPW) has shown different output for some components since v19.3. Why is this?

In ImageGear C & C++ v19.2 and prior, the DPW had additional logging information for the ARTX component in its deployment.log:

Deploying an application that uses the ARTXGUI library of ImageGear
ARTX Component requires the following merge modules to be installed:

Microsoft_VC90_CRT_x86_x64.msm

Microsoft_VC90_MFC_x86_x64.msm

But since v19.3, the logs are no longer telling me to install these modules. Is this a mistake, or are they no longer necessary?

Answer

This was an intentional change on our end, and the Deployment Packaging Wizard (DPW) is working as intended. We made some updates to the DPW in the latest release; one update is that the CRT requirements for CORE (which is required in every project) now also cover the ARTX component. If the DPW does not say you need additional components to use the ARTX component, then you’ll be fine.

The last twelve months have seen an unprecedented shift in the way organizations and customers are utilizing digital services. According to data gathered by McKinsey in 2020, digital adoption made roughly five years’ worth of progress in a span of eight weeks at the onset of the COVID-19 pandemic. While this massive shift impacted almost every industry, the government sector in particular faced tremendous disruption as its legacy systems struggled to keep pace with demand.

Many of the changes in the way people access government services are likely to remain in place even after the threat of the pandemic recedes, which creates a huge opportunity for software developers specializing in GovTech applications. A closer look at GovTech trends for 2021 provides some insight into those opportunities.

5 Key GovTech Trends to Watch in 2021

1. Remote Functionality 

Government agencies had to fundamentally rethink the workplace in response to the pandemic. Non-essential personnel transitioned to working remotely whenever possible, but this move created a number of challenges in terms of collaboration and security. Employees still need to be able to view, edit, and share files without compromising privacy or creating version confusion. All too often, remote workers resort to ad hoc solutions involving third party programs and conventional email, all of which make it incredibly difficult for an organization to maintain control over its essential files. GovTech developers can address these challenges directly by building software that facilitates remote collaboration entirely within a secure application.

2. Doing More with Less

One of the downstream consequences of social distancing restrictions and stay at home orders has been the erosion of sales tax revenue at the state and local level. While the impacts have not been as catastrophic as originally feared, many states are still facing significant budget shortfalls despite making deep spending cuts. The pressure will be on to find GovTech solutions that are easy to implement, use, and maintain. Efficiency and flexibility will continue to be important considerations as state and municipal governments seek out platforms that can address multiple needs and allow them to eliminate costly redundancies.

3. Shift to Digital

When government offices were forced to shut their doors in the early days of the pandemic, they had to scramble to find ways to deliver services digitally. This was especially difficult for agencies relying on legacy infrastructure and outdated software, but the transition to digital is unlikely to slow down anytime soon now that it’s underway. According to a recent study, 61 percent of government officials surveyed believe that the pandemic has accelerated their digital transformation goals, while 75 percent claim that their agency is pushing to offer even more services digitally. That will mean plenty of opportunity for innovative GovTech developers that can provide the automation and data management tools governments need to bring their services into the 21st century.

4. Fight for Privacy

Government agencies sit upon massive amounts of private data that must be kept secure at all costs. From personally identifiable information like Social Security Numbers to contracts and applications that contain confidential business data and vital trade secrets, governments have a responsibility to protect sensitive data at all times. They need systems and software that not only keep files safely within the secure confines of an application, but also provide the redaction capabilities that allow agencies to comply with information requests. By designing platforms that promote transparency while also protecting privacy, GovTech developers can play an important role in building trust between government and citizens.

5. Citizen-Centric Experience

The combination of evolving public expectations and demographic change was rapidly reshaping the delivery of government services even before the pandemic. In a global survey conducted in late 2019, Accenture found that 50 percent of respondents believed that requests to an agency could be resolved faster with the use of AI assistants or chatbots and that a transition to 24/7 access to government services would be greatly beneficial. Respondents also wanted easier access to their personal information (74 percent), faster response times (73 percent), and greater visibility into the status of their queries and applications (64 percent). Younger citizens accustomed to customer-centric experiences are further shifting expectations of what services the government should be able to offer digitally. It will fall to GovTech developers to design applications that connect citizens to their government and streamline processes that have long relied upon inefficient manual practices and direct physical interactions.

Enhance Your GovTech Application with Accusoft Solutions

Working with the government sector presents a number of challenges to even seasoned developers. From meeting complex compliance and privacy requirements to managing a dizzying range of document types, building and implementing an effective solution takes a great deal of time and development resources.

One of the easiest ways to speed up that process is by incorporating proven functionality into an application with SDKs or APIs. Accusoft’s collection of software integrations helps GovTech developers get to market faster by providing reliable and government-ready content processing features.

  • PrizmDoc Viewer: A powerful HTML5 viewer with annotation and redaction capabilities, PrizmDoc Viewer makes it easy to view, edit, and manage public records, contracts, and even more sensitive documents all within a secure GovTech application.
  • ImageGear: With ImageGear’s extensive image processing, conversion, and compression features behind them, GovTech applications can easily improve document workflows, consolidate information, and meet government archiving standards (thanks to PDF/A support).
  • FormSuite: Processing government forms can quickly overwhelm an application if it doesn’t have the capabilities to handle multiple form types or clean up document images. FormSuite for Structured Forms is a collection of forms processing SDKs that helps GovTech applications quickly sort and extract data from structured forms for superior speed and accuracy.

As GovTech trends continue to accelerate in 2021, developers need partners they can trust to provide secure, reliable functionality to their applications so they can focus their efforts on building software that meets the exacting needs of the government sector. Learn more about how Accusoft can fulfill that role and elevate the potential of GovTech applications.