Technical FAQs

Question

What type of license should I be using?

Answer

Accusoft supports 3 primary license types: Evaluation, Toolkit, and Runtime licenses.

Evaluation License:

An Evaluation license is available upon download and allows you to try our products and determine whether they suit your needs. Most downloads include free demo applications that demonstrate the core functionality of our product and show you how to get up and running integrating Accusoft products with your solution. While the Evaluation license is free to download, it does come with some restrictions. These restrictions may come in the form of watermarked documents, pop-up dialog boxes, or a time limit on the Evaluation license itself.

Toolkit License:

Should you decide that an Accusoft product is right for you, the next step is to purchase one or more of our Toolkit licenses. Accusoft requires each developer using an Accusoft component to have a Toolkit license on their development machine. Once registered, the Toolkit license removes all restrictions of the Evaluation license. Please note that the Toolkit license is strictly for development purposes, and when activated, it is tied to the specific development machine it was registered on. Understandably, the occasion may arise where a Toolkit license needs to be moved from one development machine to another. In such a scenario, please contact sales@accusoft to see if this is feasible. For more information on how to register a Toolkit license on a development machine, please see the ‘Toolkit Licensing’ section of your product’s Help documentation.

Runtime License:

Finally, once you have completed your solution and are ready for deployment, you will need to purchase a Runtime license. Accusoft offers 2 types of Runtime licenses. Most commonly, you will purchase an Automatically Reported Runtime (or Node-Locked) license, which requires a license for every machine your solution will be deployed to. It is very important to note that Runtime licenses are tied to the machine they are activated on and should not be used for development purposes. Doing so will quickly use up your Runtime licenses, which cannot be refunded. For more information about how to activate an Automatically Reported Runtime license, please see the ‘Automatically Reported Runtime License’ section of your product’s Help documentation. In certain circumstances, Accusoft will issue Manually Reported Runtime licenses, which allow you to embed all of the licensing information directly into your application. Eligibility for these licenses is determined by Accusoft sales. For more information on how to activate this type of license, please see the ‘Manually Reported Runtime License’ section of your product’s Help documentation.

The simultaneous development of Pfizer and Moderna’s safe and effective COVID-19 vaccines in less than a year stands as one of the great feats of recent medical science. Now that the vaccines are available, however, the healthcare industry and government authorities must take on the new challenge of distributing doses to the population quickly and effectively. In some respects, this logistical feat will be every bit as daunting as developing the vaccines themselves.

Fortunately, the use of barcoding in healthcare supply chains and patient records will prove incredibly helpful in overcoming some of the key difficulties in vaccine distribution. Medical barcodes are already being used in many essential applications. For organizations that have yet to fully embrace the potential of digital transformation, barcode processing integrations can help them quickly expand their capabilities to meet the growing demands of vaccine delivery.

4 Ways Medical Barcodes Solve Vaccine Delivery Challenges

1. Better Supply Chain Accuracy Means Less Waste

Given the high costs of manufacturing and distributing the vaccines, there is justifiable concern over the potential for waste. Both versions of the vaccine need to be kept at low temperatures for shipping after manufacture (approximately -90 degrees Fahrenheit for Pfizer and about -10 degrees for Moderna). Once they’re moved to a refrigerator for administration, they cannot be refrozen. While the Moderna vaccine can last for up to 30 days refrigerated (provided the vial is not punctured), the Pfizer vaccine must be discarded after a mere six hours. Further complicating matters, each Pfizer thermal shipping container can potentially hold up to 975 multidose vials (4875 individual doses), whereas each box of Moderna vaccine contains 10 vials (100 doses).

Without accurate inventory and shipment tracking, healthcare providers could easily end up with too much supply in one location and not enough elsewhere. In a worst-case scenario, unused doses might even go to waste because they can’t be redirected to another site quickly enough. By incorporating medical barcode scanning throughout the supply chain, healthcare organizations can ensure more efficient distribution during the shipping process. They can also verify ahead of time that delivery sites have the appropriate storage capacity, avoiding the possibility of doses going to waste due to lack of freezer space.

2. Improved Dosage Records

One of the key challenges with distributing the currently approved vaccines is that they require multiple doses. Although the doses are identical from a chemical composition and dosage standpoint, the problem is that they must be administered after a specified interval. According to the FDA, that interval is approximately 21 days for the Pfizer vaccine and 28 days for the Moderna vaccine. As healthcare providers work to deliver the vaccine effectively, they must keep accurate records to show who has received the first dose and how much supply of each vaccine shipment should be designated for second doses.

The ability to read and print barcodes helps providers quickly track where patients are in the vaccination process and ensure that second doses will be available at the appropriate time. This is especially important considering that the vaccines are not interchangeable. Once someone has received the first Pfizer dose, for instance, they should not receive the Moderna vaccine for their second dose (barring exceptional circumstances). By generating a specific barcode after the initial dose and including it with a patient’s health records, providers can quickly and easily match people with the correct vaccine and make sure they have available doses on hand.
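
As a rough illustration of the record-keeping described above, the following Python sketch computes when a second dose falls due and builds a text payload that could be encoded into a barcode. The intervals come from the figures quoted earlier; the function names and the pipe-delimited payload layout are assumptions made for this example, not part of any Accusoft product or medical standard.

```python
from datetime import date, timedelta

# Days between first and second dose, as quoted in the text above.
DOSE_INTERVAL_DAYS = {"pfizer": 21, "moderna": 28}


def second_dose_due(vaccine: str, first_dose: date) -> date:
    """Return the date on which the second dose of the same vaccine falls due."""
    return first_dose + timedelta(days=DOSE_INTERVAL_DAYS[vaccine.lower()])


def record_payload(patient_id: str, vaccine: str, first_dose: date) -> str:
    """Build a text payload for a barcode (hypothetical pipe-delimited layout)."""
    due = second_dose_due(vaccine, first_dose)
    return "|".join([patient_id, vaccine.lower(), first_dose.isoformat(), due.isoformat()])


if __name__ == "__main__":
    print(record_payload("P-001", "Moderna", date(2021, 1, 4)))
```

Because the vaccine name travels with the dates, a scanning application could refuse to match a patient with the wrong second dose.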

3. Keeps Essential Medical Equipment On-Hand

Vaccine distribution involves more than just shipping the doses themselves. Many different accessories are required to administer the vaccine, including protective equipment, vials, rubber stoppers, syringes and needles, and alcohol swabs. Healthcare supply chains were already under significant strain throughout the pandemic, so it should not be taken for granted that providers will have everything they need when the vaccine arrives. Furthermore, as the overall pace of vaccinations increases, it will be important to keep an accurate count of available equipment, especially if a provider does a lot of off-site vaccinations.

Barcoding in healthcare is critical to establishing connections between different elements of the supply chain. By using medical barcode integrations, providers can track and coordinate every piece of equipment needed for vaccine delivery in near-real time. Incorporating the same barcodes into patient records also gives a more up-to-date inventory count as doses are administered, ensuring that hospitals and healthcare facilities don’t run out of essential equipment when they need it most.

4. Expands Distribution Beyond Traditional Supply Chain

Distributing the vaccine in major population centers is difficult enough, but extending delivery into underserved rural areas presents a different set of challenges. These areas often lack the supply chain infrastructure to accommodate the rapid and widespread transfer of medical products. Healthcare providers will need technology tools that allow them to set up remote distribution and treatment centers capable of coordinating with local communities in order to extend their reach into these areas.

While barcoding in healthcare may provide the visibility organizations need into vaccine logistics and patient records, certain regions will also require mobile medical barcode integrations that can put more power and control into the hands of field workers. Rugged, reliable barcode integrations capable of reading broken or damaged barcodes using any mobile device will be essential for overcoming the limitations of rural digital infrastructure.

Unlock the Potential of Barcoding in Healthcare with Barcode Xpress

Accusoft’s Barcode Xpress SDK integration helps healthcare applications read, write, and detect more than thirty different barcode types, even when the barcodes are damaged, broken, or incomplete. With the ability to read multiple barcodes at speeds of up to 1,000 pages per minute, Barcode Xpress can help medical providers take control of their supply chains and manage patient records more efficiently. That same functionality can be extended even further thanks to Barcode Xpress Mobile, which can turn any iOS or Android device into a powerful barcode scanner.

Distributing COVID-19 vaccine doses is one of the great logistical undertakings of the 21st century. By expanding the usage of barcoding in healthcare, providers can create greater transparency into their supply chains to reduce waste and deliver the vaccine more efficiently to the patients who need it most. Find out how Accusoft’s Barcode Xpress can help the medical industry upgrade its infrastructure to meet the challenge of restoring a sense of normalcy to people’s lives and overcoming the pandemic. Try a hands-on demo of our barcode SDK today.

TOTVS Fluig, a leading business management systems provider, improved document viewing by integrating Accusoft’s PrizmDoc solution, addressing challenges with Microsoft Office files. PrizmDoc offered advanced features like text search and annotation. Transitioning to an Accusoft cloud environment ensured scalability and compliance. With PrizmDoc, TOTVS Fluig efficiently handles millions of viewing requests, focusing on core products while relying on Accusoft. Their partnership promises future growth and innovation.


An automated forms processing solution can significantly improve accuracy and efficiency when it comes to managing large quantities of documents containing structured content. Whether an organization needs to digitize existing records or is continuously processing new documents within application workflows, having a versatile optical character recognition (OCR) component working to identify and extract text from multiple languages allows them to capture data more effectively. Solid OCR form capture is critical.

Although a good OCR engine operates quickly and efficiently, the process of recognizing and extracting text is a highly complex undertaking that can be impacted by a variety of factors. Under optimal conditions, for example, the OCR component within Accusoft’s FormSuite can generate results quickly and accurately, with the ability to read several languages from around the world. However, if an application’s forms processing workflow is not set up efficiently or overlooks a few important considerations, recognition performance may suffer in terms of speed and accuracy.

6 Ways to Achieve the Best Results with the Accusoft OCR Component in FormSuite

 

1. Pay Attention to Image Resolution

As a general rule, OCR components should be provided with high resolution images so the recognition engine is able to distinguish details that would otherwise be missed in low resolution images. This helps the engine recognize the differences between “l” and “i” or “O” and “0” (zero), which produces more accurate results.

However, there could be a problem if the image resolution is too high. These images require much more time to process without delivering any benefits since the required letter properties are clearly distinguishable in a lower resolution.

To strike a balance between speed and accuracy, it’s better to scan all images in the 150-400 dots-per-inch (DPI) range. This allows the recognition engine to identify all the letter properties it needs without being bogged down by analyzing more data than necessary.
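
A minimal Python sketch of how a workflow might act on this range before handing images to a recognition engine. The 150 and 400 limits are the ones recommended above; the function names are hypothetical and do not belong to the FormSuite API.

```python
MIN_DPI = 150  # lower bound of the recommended range
MAX_DPI = 400  # upper bound of the recommended range


def needs_rescan(dpi: float) -> bool:
    """True when the image is below the range; resampling cannot restore lost detail."""
    return dpi < MIN_DPI


def downsample_factor(dpi: float) -> float:
    """Scale factor (at most 1.0) that brings an over-sampled image down to MAX_DPI."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return min(1.0, MAX_DPI / dpi)


def target_size(width_px: int, height_px: int, dpi: float) -> tuple:
    """Pixel dimensions after applying the downsample factor."""
    factor = downsample_factor(dpi)
    return round(width_px * factor), round(height_px * factor)


if __name__ == "__main__":
    # A letter-size page scanned at 800 DPI is halved in each dimension.
    print(target_size(6800, 8800, 800))
```

Images already inside the range are left untouched, and images below it are flagged rather than enlarged, since upscaling only adds pixels, not detail.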

2. Don’t Lose Image Properties While Preparing to Recognize

To achieve the best results, it’s important to provide the recognition engine with a few helpful hints. In some cases, resolution properties may be lost while an image is being prepared for recognition, leading to worse than expected results. This happens most frequently when working with System.Drawing.Image or System.Drawing.Bitmap classes directly during operations like clipping, merging, or reducing the bit depth.

In this case, the best solution is to make sure that HorizontalResolution and VerticalResolution properties are set correctly and reflect initial image resolution values. The ScanFix component within FormSuite can perform this task automatically and is designed to be compatible with the OCR component to help achieve better recognition results.

3. Clean Up Underlined Text Before Recognition

Specks, dirt, and other imperfections within the source image can significantly reduce recognition quality. Sometimes, however, even a seemingly good image can be recognized incorrectly when there are underlined words like URLs, emails, or specifically formatted generic text. 

From the software’s point of view, this kind of text isn’t very different from other types of image distortion. ScanFix’s LineRemovalOptions can clean up the text by eliminating lines that could interfere with recognition. The API also features special parameters that ensure characters with low-hanging elements (such as “j” or “y”) are restored after line removal, avoiding another potential recognition problem.

4. Use Long-Living Objects to Avoid Recognition Performance Drop

Creating a new instance requires initializing the OCR engine and loading neural network data suited to the specific recognition parameters. Because of the data complexity, this process is not free: it may cause delays from roughly 200 milliseconds to 2 seconds, depending on the hardware and recognition properties.

Existing Accusoft OCR instances may be reused to recognize other images with different properties. This will speed up the overall process because initialization will be done only once during the first AnalyzeField call and subsequent calls will be much cheaper in terms of computing resources.
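
The effect of reusing one instance can be sketched in a few lines of Python. StubOcrEngine below is a hypothetical stand-in that only counts how often its expensive setup runs; it is not the Accusoft OCR class, and its analyze method is an assumption made for illustration.

```python
class StubOcrEngine:
    """Hypothetical stand-in for an OCR engine with an expensive one-time setup."""

    init_count = 0  # how many times the costly initialization has run

    def __init__(self):
        self._ready = False

    def analyze(self, image_name: str) -> str:
        if not self._ready:
            # Simulates loading recognition data on the first call only.
            StubOcrEngine.init_count += 1
            self._ready = True
        return f"text from {image_name}"


def recognize_all(image_names):
    """Create the engine once and reuse it for every image."""
    engine = StubOcrEngine()
    return [engine.analyze(name) for name in image_names]


if __name__ == "__main__":
    recognize_all(["a.tif", "b.tif", "c.tif"])
    print(StubOcrEngine.init_count)
```

Creating a new engine inside the loop instead would run the initialization once per image.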

5. Assign Instances to Their Own Worker Threads

Objects are thread safe and can be called from different threads. However, assigning each object to its own thread avoids extra locking. One of the simplest ways to do this is to use a C# Parallel.ForEach loop together with a ConcurrentQueue of pre-allocated objects.

This ensures that the number of threads will not exceed the number of available CPUs. Any available instance can then be automatically assigned to recognize images in its own thread, while any extra threads wait until a busy instance becomes free.

Other common patterns are producer-consumer and map-reduce, which are more complex to implement but provide better flexibility when managing input data.
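
The pooled-instance pattern described in this section can be sketched as follows. The text describes it in C# terms (Parallel.ForEach and ConcurrentQueue); this Python version uses queue.Queue and ThreadPoolExecutor as rough equivalents, and StubOcrEngine is a hypothetical placeholder rather than the real OCR class.

```python
import os
import queue
from concurrent.futures import ThreadPoolExecutor


class StubOcrEngine:
    """Hypothetical placeholder for a reusable OCR instance."""

    def recognize(self, image_name: str) -> str:
        return image_name.upper()


def recognize_parallel(image_names, workers=None):
    """Recognize images with one pre-allocated engine per worker thread."""
    workers = workers or os.cpu_count() or 1
    pool = queue.Queue()
    for _ in range(workers):
        pool.put(StubOcrEngine())

    def work(image_name):
        engine = pool.get()  # blocks until an instance is free
        try:
            return engine.recognize(image_name)
        finally:
            pool.put(engine)  # hand the instance back for the next image

    with ThreadPoolExecutor(max_workers=workers) as executor:
        # map() returns results in input order.
        return list(executor.map(work, image_names))


if __name__ == "__main__":
    print(recognize_parallel(["a.png", "b.png", "c.png"], workers=2))
```

Each engine is used by only one thread at a time, so no lock is needed around the recognition call itself.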

6. Dispose Objects to Avoid High Memory Consumption

As a general rule in C#, call Dispose on objects that use unmanaged resources. FormSuite’s OCR component uses an external recognition engine, so it is highly recommended to call Dispose when the instance is no longer required. This avoids a situation where memory is unavailable to other parts of the application, especially when a large amount of data is waiting for post-processing or available memory is low because other processes are running in parallel.
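
The same deterministic-cleanup idea exists in most languages. As a sketch only, here is how it looks in Python, where a context manager plays the role of a C# using block. StubOcrEngine and its close method are hypothetical stand-ins for an object that holds external resources.

```python
class StubOcrEngine:
    """Hypothetical engine holding an external resource that must be released."""

    def __init__(self):
        self.closed = False

    def recognize(self, image_name: str) -> str:
        if self.closed:
            raise RuntimeError("engine has already been released")
        return f"text from {image_name}"

    def close(self) -> None:
        # Stands in for releasing unmanaged memory, comparable to Dispose in C#.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


if __name__ == "__main__":
    with StubOcrEngine() as engine:
        print(engine.recognize("form.tif"))
    print(engine.closed)
```

The resource is released when the block exits, even if recognition raises an exception.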

Get Accurate OCR Data Capture Results with FormSuite

When properly configured and incorporated into a forms processing workflow, the FormSuite OCR component can accelerate automated data capture and reduce manual errors. Its zonal field recognition capabilities allow it to home in on predefined field types to improve processing speed and accuracy. Developers can also adjust confidence values for recognition results to determine how frequently manual review is necessary.
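
To illustrate how a confidence threshold controls the manual review rate, here is a short Python sketch. The FieldResult type, the 0.0 to 1.0 confidence scale, and the 0.85 default are assumptions for this example and are not taken from the FormSuite API.

```python
from typing import NamedTuple


class FieldResult(NamedTuple):
    name: str
    text: str
    confidence: float  # assumed to be on a 0.0 to 1.0 scale


def route_results(results, threshold=0.85):
    """Split results into auto-accepted fields and fields needing manual review."""
    accepted = [r for r in results if r.confidence >= threshold]
    needs_review = [r for r in results if r.confidence < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    results = [
        FieldResult("last_name", "Nguyen", 0.97),
        FieldResult("zip_code", "3360l", 0.62),
    ]
    accepted, needs_review = route_results(results)
    print([r.name for r in needs_review])
```

Raising the threshold sends more fields to review and lowers the risk of accepting a misread value; lowering it does the opposite.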

To get a hands-on look at how FormSuite incorporates OCR seamlessly into its collection of forms processing tools, schedule a free trial today.

On July 23, 2021, Accusoft announced the latest update to SmartZone, its versatile data capture SDK for forms processing. The version 6.1 update expands the integration’s already robust language support for Optical Character Recognition (OCR) by adding 65 new languages, with a special emphasis on Central European, Cyrillic, Baltic, and Asian language groups.

“We have a strong customer base in North America with SmartZone, but we’ve received multiple customer requests to expand support to a global range of languages,” said Irina Tabac, Product Manager for Forms and Cloud APIs at Accusoft. “This update will help developers serve their international customers better and offer OCR support for new customers around the world.”

Newly supported languages include: 

  • Turkish
  • Malay
  • Russian
  • Polish
  • Greek
  • Chinese
  • Korean
  • Japanese
  • Thai
  • Vietnamese
  • And over 50 more…

SmartZone v6.1 also expands the SDK’s OCR capabilities to include full page recognition for forms processing. Now developers can give their forms processing applications the ability to capture data using targeted zonal recognition or by extracting text from the entire page.

Available as an individual SDK or as part of the FormSuite Collection, SmartZone’s data capture technology eliminates the need for manual data entry in forms workflows. To learn more about Accusoft’s SmartZone OCR/ICR SDK or download a trial for a first-hand look, please visit our website.

About Accusoft: 

Founded in 1991, Accusoft is a software development company specializing in content processing, conversion, and automation solutions. From out-of-the-box and configurable applications to APIs built for developers, Accusoft software enables users to solve their most complex workflow challenges and gain insights from content in any format, on any device. Backed by 40 patents, the company’s flagship products, including OnTask, PrizmDoc™ Viewer, and ImageGear, are designed to improve productivity, provide actionable data, and deliver results that matter. The Accusoft team is dedicated to continuous innovation through customer-centric product development, new version release, and a passion for understanding industry trends that drive consumer demand. Visit us at www.accusoft.com.

###

On June 25, 2021, Accusoft announced the latest update to PrizmDoc Viewer, its industry-leading HTML5 document viewing and image processing integration. The version 13.17 update improves a number of existing features and adds key functionality, allowing developers to offer even more versatility within their applications.

“Staying still is never an option,” says Mark Fears, Director of Product Management at Accusoft. “We’re always exploring ways to improve our existing products to help software developers do more for their customers. This update incorporates a lot of feedback we’ve received from our partners.”

In addition to improving rendering fidelity and redaction functionality, version 13.17 also makes significant improvements to PrizmDoc Viewer’s Content Conversion Service (CCS):

  • Convert Markup Changes in Microsoft Word: When converting DOCX files into another file format with CCS, you can now choose to convert accepted or rejected markup changes along with the document text.
  • Convert Speaker Notes in Microsoft PowerPoint: When converting PPTX files into another format using CCS, you can elect to convert the slides only or include any speaker notes in the new document.

Incorporated into thousands of applications worldwide to deliver responsive file viewing and conversion, PrizmDoc Viewer supports multiple proprietary and open file formats. Its powerful REST APIs allow developers to integrate the annotation, redaction, and document comparison features today’s businesses are looking for into their software.

To learn more about Accusoft’s PrizmDoc Viewer or download a trial for a first-hand look, please visit our website.


The ability to watermark documents is essential for many businesses. By utilizing watermarks, organizations can quickly brand their assets and take steps to protect their valuable intellectual property. When identifying mission critical document features for their applications, developers need to keep in mind all the ways that watermarks continue to be deployed for a variety of use cases.

What Is a Watermark?

A watermark is a semi-transparent line of text or an image that is embedded into a file to indicate that it belongs to a person or organization and cannot be reproduced or distributed without permission. In most instances, watermarks are visible, but some text-based watermarks are embedded in such a way that they only become visible after the document is scanned or printed.

Digital watermarks are one of the earliest and most effective forms of security for documents and images. They are most commonly used by photographers who want to protect their images from unauthorized usage. Their watermark is superimposed over the image, making it difficult to remove without also damaging the source image in the process. 

The same basic principle applies to documents, especially PDF files that are easy to download, view, and distribute. Many organizations continue to add specialized watermarks to their documents to protect both confidential information and intellectual property. Understanding which types of watermarks to use and why is an important aspect of effective document security.

Why Should You Watermark Documents?

There are a few reasons why an organization may want to watermark documents. Although they’re most often associated with security, they can be used for other purposes as well. Here are a few major use cases involving document watermarks:

Classify Documents More Easily

File management can be a challenge for organizations of all sizes. While steps like naming conventions can help somewhat, it can be difficult to keep track of which files can be shared openly and which ones should only be opened and viewed within a secure application environment. Watermarks can be added to documents to indicate how they should be handled. A contract that’s labeled “Confidential,” for instance, makes it clear to the user that they should follow the proper security precautions when viewing the file, such as not leaving the screen open unattended or not sharing their screen with another participant on a video call. It also removes any ambiguity about the nature of the document.

Track Your Documents

While many free watermarking tools only allow users to add a generic logo or text to a document, dedicated viewing solutions like PrizmDoc Viewer can create watermarks that contain unique identifiers, making it easier to track documents throughout their lifecycle. If a file ends up somewhere unexpected, these identifiers help track down the source of the leak or breach. In addition to the security benefits, customized watermarks that include a user name, time stamp, or revision number help organizations maintain better version control over their documents and avoid any confusion over which file is the most up-to-date.
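
As a simple illustration of an identifying watermark, the Python sketch below assembles a text string from a user name, revision number, and time stamp. The function name and layout are assumptions for this example; it does not call or represent the PrizmDoc Viewer watermark API.

```python
from datetime import datetime, timezone


def tracking_watermark(user_name: str, revision: int, viewed_at: datetime) -> str:
    """Build watermark text that identifies who viewed which revision, and when."""
    stamp = viewed_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return f"{user_name} | rev {revision} | {stamp}"


if __name__ == "__main__":
    when = datetime(2021, 6, 25, 14, 30, tzinfo=timezone.utc)
    print(tracking_watermark("jdoe", 3, when))
```

A string like this, generated per viewing session, is what makes a leaked screenshot traceable to a specific user and time.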

Discourage Unauthorized Distribution

Watermarks are an easy way to indicate ownership of a document and its contents. While many files, and PDF files especially, are protected by some form of digital rights management (DRM) that limits who can open, edit, or copy them, this form of security is often circumvented by inventive tactics like taking a screenshot, capturing video footage, or even photographing a screen with an external device. 

If the captured document has a watermark, it will be quite difficult for someone to distribute the file without revealing the owner. While it’s possible to remove a watermark without damaging the document contents, doing so can be a difficult and time-consuming process that often isn’t worth the effort. More importantly, if the watermark contains identifying information, it’s often possible to find out where a security breach occurred and then locate those responsible for it.

How to Watermark Documents

While many organizations understand the value of adding watermarks to their documents, they often don’t have the tools in place to do so effectively at scale. There are a number of free watermarking web applications available, but these sites typically require a file to be uploaded first, which creates a huge security risk for most businesses. In many cases, this process is also irreversible because it burns the watermark into the document itself. 

Microsoft Word can easily add watermark text or images to a document, but they’re also quite easy to remove. In fact, a quick Google search points to multiple resources for removing Word-based watermarks from both DOCX and PDF files.

PrizmDoc Viewer supports multiple forms of document watermarks natively within an application, including identifying text, diagonal text, and images. More importantly, since the watermark is associated with the document viewing session, it isn’t applied to the source document itself, which remains secure within the main application. Watermark document properties can be specified to include more than one watermark in a viewing session, even supporting mixed types on different areas of the page.

Text Watermarks

The watermark API for PrizmDoc Viewer allows users to adjust the location, size, and style of the text. Information included can be adjusted based on document needs.

Diagonal Text Watermarks

PrizmDoc Viewer can apply a single directional text watermark across the center of the page. By including control characters in the text string, it’s possible to create line breaks in the text.

Image Watermarks

The watermark API can also add a transparent PNG file to the document, which is ideal for displaying a company logo. PrizmDoc Viewer can adjust the opacity of the watermark regardless of whether it’s an image or text.

By adding multiple watermarks, organizations can easily brand their files across the company while also providing effective tracking information for individual documents.


Start Watermarking Documents Today with PrizmDoc Viewer

A fully-featured HTML5 viewer, Accusoft’s PrizmDoc Viewer is easily integrated into web applications in need of comprehensive viewing and document management capabilities. In addition to watermarking, it delivers conversion, OCR, annotation, and redaction support for more than 100 different file types. Get a hands-on view of how easily PrizmDoc Viewer can apply custom watermarks to your documents with our watermarking API demo.


The scalable vector graphic (SVG) format continues to enjoy steady adoption across the web. According to data from W3Techs, SVG now accounts for 25 percent of website images worldwide. But it wasn’t always this way. In 1998, it became apparent that vector-based graphics had a future on the web, and the W3C received six different file format submissions from technology companies that year. Some were mere proposals ready for a complete revamp, while others were proprietary products that W3C wasn’t permitted to modify. Instead of forging a format from one of the submissions, however, W3C’s SVG working group decided to start from the ground up — and SVG was born.

While the file format had lofty ambitions, focusing on common use rather than specific syntax, the original iteration was cumbersome and complex. However, SVG has improved year after year. With increased support came more streamlined functionality and usable features. Now, SVG is often the first choice for meeting the evolving demands of scalable, responsive, and accessible web content.


What is a Scalable Vector Graphic (SVG) and how does it work?

Today, SVG is the de facto standard for vector-based browser graphics. But what exactly is this file format, and how does it work?

Based on XML, SVG supports three broad types of objects: 

  • Vector graphics including paths and outlines that are both straight and curved
  • Bitmap images such as .jpeg, .gif, and .png
  • Text

What sets SVG apart from bitmap-based images is the use of lines and curves along the edges of graphical objects. Because bitmap images use a fixed set of pixels, scaling them up creates blurriness where the edges of pixels meet. In the case of vector images, meanwhile, a fixed-shape approach allows the preservation of smooth lines and curves no matter the image size.

SVG also offers the benefit of interoperability. Because it’s a W3C open standard, SVG plays well with other image formats and with web technologies including JavaScript, the DOM, CSS, and HTML. This allows the format to easily support responsive design approaches that scale websites and web content based on the user device rather than defining standardized size parameters. Thanks to the curves and lines of SVG, scaling presents no problem for responsive designers looking to ensure consistency across device types.
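
Because SVG is plain XML, a small graphic can be produced with any XML library. The Python sketch below builds a minimal SVG containing one vector shape and one text element; the function name and the specific shapes are chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_svg(label: str) -> str:
    """Return a minimal SVG document with a circle and a centered text label."""
    ET.register_namespace("", SVG_NS)
    # viewBox defines the coordinate system; omitting width and height lets
    # the graphic scale to whatever space the page gives it.
    root = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": "0 0 100 100"})
    ET.SubElement(
        root,
        f"{{{SVG_NS}}}circle",
        {"cx": "50", "cy": "50", "r": "40", "fill": "none", "stroke": "black"},
    )
    text = ET.SubElement(
        root, f"{{{SVG_NS}}}text", {"x": "50", "y": "55", "text-anchor": "middle"}
    )
    text.text = label
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_svg("SVG"))
```

The circle is stored as a center and radius rather than as pixels, which is why it stays sharp at any display size, and the label remains real text that can be searched and indexed.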


The Benefits of SVG

While scalability is often cited as the biggest benefit of SVG, this format also offers other advantages, including:

  • Responsiveness: Images can be easily scaled up or down and modified as necessary to meet web design and development demands.
  • Accessibility: Since SVG is text-based, content can be indexed and searched, allowing both users and developers to quickly find what they’re looking for.
  • Performance: Image rendering is quick and doesn’t require substantive resources, allowing sites to load quickly and completely.
  • Use in Web Applications: Browser incompatibilities and missing functions often frustrate web design efforts, forcing developers to use multiple tool sets and spend time checking content and images for potential format conflicts. SVG, meanwhile, offers powerful scripting and event support, in turn allowing developers to leverage it as a platform for both graphically rich applications and user interfaces. The result? Better-looking sites that enhance the overall user experience.
  • Interoperability: Because SVG is based on W3C standards, the format is entirely interoperable, meaning developers aren’t tied to any specific implementation, vendor, or authoring tool. From building their own framework from the ground up to leveraging third-party SVG applications, web developers can choose the approach that fits them best.

SVG in PrizmDoc Viewer

Accusoft’s PrizmDoc Viewer offers multiple ways for developers to make the most of SVG elements at scale, such as:

  • File Transformation: Conversion is critical for effective and efficient web design. If development teams need different file transformation tools for every format, the timeline for web projects expands significantly. PrizmDoc Viewer streamlines this process with support for the conversion of more than 100 file types (including PDFs, Microsoft Office files, HTML, EML, rich text, and images) into browser-compliant SVG outputs. In practice, this permits near-native document and image rendering that’s not only fast, but also accessible anytime, anywhere, and from any device.
  • HTML5 Functionality: Using SVG in PrizmDoc Viewer is made easier thanks to native HTML5 design. The use of a native HTML5 framework not only improves load times with smaller document sizes but also means that PrizmDoc Viewer works in all modern web browsers, while dramatically enhancing document display quality.
  • Pre-Conversion: One of the biggest challenges with viewing large documents in a browser is delay. Pages toward the end of the document may take longer to load and frustrate users looking to quickly find a specific image or piece of information. PrizmDoc Viewer solves this problem with a pre-conversion API that returns the first page as an SVG while the rest of the document is being converted, allowing users to interact with documents as conversion takes place and lowering the chance that files will experience format-based delays.

SVG hasn’t always been the go-to web image format. Despite a promising start based on open, interoperable standards, the lack of early support and specific use cases for vector-based file formats saw SVG sitting on the sidelines for decades. 

The advent of on-demand access requirements and mobile-first development realities has changed the conversation. SVG is now continuously gaining ground as companies see the benefit in this scalable, streamlined, and superior-quality file format. Get the big picture and see SVG in action with our online document viewing demo, or start a free PrizmDoc Viewer trial today!


Effective document management is now a top priority for organizations, but for many, it remains a challenge. As noted by recent AIIM survey data, companies are struggling to handle both the documents they have and the rapid uptake of new information. In fact, 43 percent said their biggest priority is effectively leveraging the structured and unstructured content they already have, while 57 percent are focused on making sense of overwhelming volumes of big data. Optical character recognition (OCR) is a critical component of document management.

For software development firms, this poses a particular challenge. Products are no longer feature complete without critical end-user functions such as advanced optical character recognition and powerful search. However, adding this functionality is not as easy as it sounds. Building it from the ground up requires time, effort, and continued maintenance, which is a large undertaking for any company.

Accusoft’s ImageGear SDK offers a way to bridge the OCR gap with comprehensive image processing and manipulation capabilities that both streamline software development and deliver on end-user expectations.*


What is ImageGear?

ImageGear easily integrates into existing applications to deliver cutting-edge document management functionality at scale. Available for both .NET and C/C++ frameworks, ImageGear allows developers to quickly deploy and white-label key features including image processing, manipulation, conversion, and PDF and document search.

ImageGear’s add-on OCR functionality delivers highly accurate optical character recognition to any .NET (C#) or C/C++ application. It provides full-page character recognition for more than 100 languages, including Western languages and Asian character sets such as Korean, Japanese, and Chinese. It can also recognize multiple languages within a single image for enhanced document management. Other OCR features include:

  • Automatic page segmentation into individual zones for processing
  • Type assignment per zone based on defined flows, tables, or graphics
  • Table detection with advanced technology to enhance data reconstruction output
  • Entire page or individual region image processing
  • Zone definition by the user, from existing files, or automatically by the OCR engine

In addition, software developers can enhance ImageGear OCR functionality by leveraging predefined and customizable dictionaries, along with regular expressions, to validate results.
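
The idea of validating recognized text with regular expressions can be sketched independently of any OCR engine. The example below is an illustration only: it does not use ImageGear's dictionary or checking APIs, and the OcrResultValidator class, the field names, and the patterns are all hypothetical. It relies solely on .NET's System.Text.RegularExpressions to check that a recognized string has the shape an application expects.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OcrResultValidator
{
    // Hypothetical patterns for fields an application might expect in OCR output.
    private static readonly Dictionary<string, Regex> Patterns = new Dictionary<string, Regex>
    {
        { "InvoiceDate", new Regex(@"^\d{4}-\d{2}-\d{2}$") },
        { "PostalCode", new Regex(@"^\d{5}(-\d{4})?$") }
    };

    // Returns true when the recognized text matches the pattern registered for the field.
    // Unknown field names and null text are treated as invalid.
    public static bool IsValid(string fieldName, string recognizedText)
    {
        if (recognizedText == null || !Patterns.TryGetValue(fieldName, out Regex pattern))
        {
            return false;
        }
        return pattern.IsMatch(recognizedText.Trim());
    }

    public static void Main()
    {
        // A letter "O" misread in place of a zero fails validation and can be flagged for review.
        Console.WriteLine(IsValid("InvoiceDate", "2O21-03-15"));
        Console.WriteLine(IsValid("InvoiceDate", "2021-03-15"));
    }
}
```

Values that fail validation could be routed to manual review rather than silently accepted, which is one common way to use pattern checks on OCR output.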


Why Optical Character Recognition (OCR) Matters to End-Users

Advanced OCR integration makes it easier for end-users to find what they’re looking for, when they’re looking for it. Instead of forcing users to find additional apps that deliver specific services, in-app OCR delivers increased satisfaction by streamlining user search functionality.

Common use cases include:

  • Legal eDiscovery: The eDiscovery process is a critical, and often complex, stage of legal case preparation. Firms need to quickly find key terms, phrases, and images within legal documents to ensure they meet both client expectations and compliance obligations. With many forms now scanned and stored in non-standard file formats that contain form fields, text boxes, and digital imagery, OCR is essential to help lawyers streamline the process of eDiscovery at scale.
  • Financial Document Processing: Clients now expect loan applications and credit card applications to be processed at scale and speed. This is especially critical as firms embrace the idea of remote work. Both staff at home and those in the office need end-to-end OCR functionality to deliver complete document management.
  • Insurance Documentation Assessment: Insurance claims are both complex and comprehensive, requiring complete documentation from clients, contractors, and compliance agencies. As insurance firms move to tech-first frameworks to enhance document processing, speed, and accuracy, OCR makes it easy for staff to find specific data and ensure documentation is complete.

Integrating OCR

Advanced OCR functionality won’t deliver expected outcomes if integration is cumbersome and complex. ImageGear streamlines this process with easy SDK implementation for both .NET and C/C++.

ImageGear .NET can be easily deployed on multiple platforms. These .NET deployments include ASP.NET functions such as image display, thumbnail display, annotation support, and cloud capture along with WPF printing and annotation support. ImageGear for C/C++, meanwhile, offers support for several platforms as well. Check out the developer resources section to see an updated list.


How Your Clients Use Optical Character Recognition (OCR)

PDFs remain the go-to file format for many industries, offering both standardized image and text conversion along with the ability to easily set or restrict document permissions. The problem? PDFs are notoriously difficult to search, making it hard for end-users to quickly find the text or data they need.

ImageGear makes it easy to OCR PDFs using the ImGearRecPage.Recognize method, which leverages the zone list of the image to deliver accurate OCR. If this list is empty, the method automatically calls the page-layout decomposition process (auto-zoning) to complete the OCR process.

The following C# example recognizes each page of a multi-page TIFF and writes the recognized text to an output file.


using System.IO;
using ImageGear.Core;
using ImageGear.Formats;
using ImageGear.Evaluation;
using ImageGear.Recognition;

namespace ImageGearTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Initialize evaluation license.
            ImGearEvaluationManager.Initialize();
            ImGearEvaluationManager.Mode = ImGearEvaluationMode.Watermark;

            // Initialize the Recognition Engine.
            ImGearRecognition igRecognition = new ImGearRecognition();

            // ImageGear assemblies require explicit initialization at application startup.
            ImGearCommonFormats.Initialize();

            // Open a FileStream for our output document.
            using (FileStream outputStream = new FileStream(@"c:\temp\outputDoc.txt", FileMode.Create, FileAccess.ReadWrite))
            {
                // Open a FileStream for our source multi-page image.
                using (FileStream multiPageDocument = new FileStream(@"c:\temp\test.tif", FileMode.Open))
                {

                    // Load every page of the multi-page document, starting at page 0 and loading the range of pages specified.
                    // A range of -1 specifies that all pages shall be loaded.
                    ImGearDocument doc = ImGearFileFormats.LoadDocument(multiPageDocument, 0, -1);

                    // Determine the number of pages in the multi-page image.
                    int numPages = ImGearFileFormats.GetPageCount(multiPageDocument, ImGearFormats.UNKNOWN);

                    // Recognize each page of the multi-page document and add the results to outputStream.
                    for (int pageNumber = 0; pageNumber < numPages; pageNumber++)
                    {

                        // Cast the current page to a raster page and import that page.
                        using (ImGearRecPage igRecPage = igRecognition.ImportPage((ImGearRasterPage)doc.Pages[pageNumber]))
                        {

                            // Preprocess the page.
                            igRecPage.Image.Preprocess();

                            // Perform recognition.
                            igRecPage.Recognize();

                            // Add OCR results to the outputStream.
                            igRecognition.OutputManager.WriteDirectText(igRecPage, outputStream);

                        }
                    }
                }

            }
            // Dispose of objects we are no longer using.
            igRecognition.Dispose();
        }
    }
}

 


OCR Access and Analysis

Advanced OCR isn’t enough in isolation — developers must also empower end-users to quickly access and analyze OCR output. ImageGear offers multiple options to help streamline this process, such as:

  • Storage of Output as Code Pages
  • Export to Text Format
  • Export to PDF
  • Export to MRC PDF
  • Export to a Formatted Document

Find Your Best Fit

ImageGear OCR makes it easy for end-users to quickly search critical documents, find the data they need, and analyze optical character recognition output, but don’t take our word for it. Seeing is believing. Test ImageGear in your own environment and discover the difference of advanced OCR. 

*Optical character recognition is an ImageGear add-on and must be requested upon purchase of a license.