Technical FAQs

Question

Sometimes, when redacting an Office or PDF document, redactions drawn over certain content (such as an image or a logo) appear to get burned onto other occurrences of the image on other pages. Why does this happen?

Answer

The duplicate redactions occur because the images are shared images. In PrizmDoc, when a change is made to one instance of a shared image, it is applied to every other instance. Per engineering, this is to mimic the behavior of Adobe Acrobat.
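
As a rough illustration of why this happens, the sketch below models several pages that reference a single shared image object. The SharedImage and Page types are hypothetical and do not reflect PrizmDoc's internal implementation; they only show how a change to one shared resource becomes visible everywhere it is referenced.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model: one image resource referenced by several pages.
public sealed class SharedImage
{
    public string Name { get; }
    public bool IsRedacted { get; private set; }

    public SharedImage(string name) => Name = name;

    // Burning a redaction changes the resource itself, not a per-page copy.
    public void BurnRedaction() => IsRedacted = true;
}

public sealed class Page
{
    public int Number { get; }
    public SharedImage Logo { get; }

    public Page(int number, SharedImage logo)
    {
        Number = number;
        Logo = logo;
    }
}

public static class SharedImageDemo
{
    public static List<Page> BuildDocument(int pageCount)
    {
        var logo = new SharedImage("logo");   // stored once in the document
        var pages = new List<Page>();
        for (int i = 1; i <= pageCount; i++)
        {
            pages.Add(new Page(i, logo));     // every page points at the same object
        }
        return pages;
    }
}
```

Calling pages[0].Logo.BurnRedaction() on a three-page document built this way leaves pages[1].Logo.IsRedacted and pages[2].Logo.IsRedacted set to true as well, because all three pages hold a reference to the same object.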

A feature request currently exists to allow shared images to be treated as individual images so that they can be redacted separately:

https://ideas.accusoft.com/ideas/PDV-I-655

 

Far from just another tech industry buzzword, artificial intelligence (AI) is fast becoming a mainstay of data collection and analysis for many organizations. According to research by Accenture, not only do 84 percent of executives think leveraging AI is critical to meeting growth objectives, but three out of four of them believe they will risk going out of business if they don’t scale those initiatives.

That fear of being left behind is why 88 percent of companies have already invested in AI or machine learning technology or plan to do so in the near future. With some 175 zettabytes of data expected to be created in 2025, organizations without the AI data processing tools necessary to analyze and make sense of that data will struggle to develop effective business strategies and deliver a competitive customer experience.

It’s a tremendous opportunity for independent software vendors building the next generation of applications across various industries. In order to deliver on the promise of AI, however, these software solutions also need to provide the tools that allow users to leverage their capabilities to streamline business processes. After all, a powerful AI solution isn’t of much use if it can’t be integrated with existing workflows.

Getting the Most Out of AI Data Processing

The most successful developers understand that AI data processing is only one piece of the puzzle. Their innovative AI technology is driving the car, but they still need the frame and wheels around it if the application is going to take their customers anywhere. That means building the less glamorous, but equally essential technology that helps AI data processing solve everyday tasks.

Take, for instance, document or image management. Organizations that gather data from physical forms or scanned documents need some way of extracting information so it can be converted into a format AI data processing tools can utilize. Manual data entry is both time-consuming and prone to error, so requiring users to transfer information by hand is simply not viable. By building document and image processing capabilities into their applications, developers can greatly enhance the versatility of AI data processing by automating key aspects of the collection process.

There’s also the question of what can be done with all of that data once it’s been gathered. Legal organizations, for example, often need to apply that information to contract creation, while insurance agents turn to it when assessing risk. By combining AI data processing capabilities with document assembly tools and search functionality, organizations can further automate key business processes to improve efficiency. Why painstakingly draft legal contracts or master service agreements from scratch when applications can use automation tools in conjunction with AI to assemble documents with greater speed and accuracy?

Build vs. Buy?

This often presents a challenge for software developers with limited resources. On the one hand, they need to invest as much time and energy as possible into their innovative AI data processing capabilities in order to meet the collection and analysis needs of their customers. But without also providing some way of interacting with and using that data to improve other key tasks, they will struggle to persuade potential users to adopt their innovative platform.

One solution is to build that functionality in-house. For software developers with substantial resources, this might sound like a good option. Unfortunately, the reality often proves less than ideal. Even something as basic as viewing and converting documents can quickly become a massive undertaking that draws valuable developer resources away from the AI data processing capabilities that are supposed to help the product stand out in a crowded market. 

In many cases, the company ends up having to outsource the work or push back key deadlines. It may also end up creating more problems than it solves by relying on open source toolkits and libraries, where the biggest concern is security vulnerabilities. A recent study found over 2,600 bugs reported in open source projects between 2015 and early 2020. Even worse, many of these vulnerabilities were not formally reported to the National Vulnerability Database (NVD) until well after they were first exposed, giving hackers and other hostile actors time to exploit the security gaps.

The Integration Solution

Developers can avoid delays and security risks by turning to proven SDK and API integrations for their application needs. This is especially effective for complex, but essential functionality like viewing, conversion, compression, editing, and assembly. By relying on code-based integrations that are actively supported, they can ensure that users will be able to leverage their AI data processing solutions securely and effectively.

Rather than building features from the ground up and wasting valuable development resources, independent software vendors can devote more time and energy to the core competencies that will make their application more competitive. That allows them to build more powerful AI data processing capabilities and bring those features to market even faster.

Enhance Your AI Data Processing Application with Accusoft Integrations

Accusoft’s family of SDK and API integrations helps software developers realize the potential of their applications by delivering proven document and image processing functionality. Whether you need document assembly tools to get the most out of your legal AI sifters or powerful HTML5 viewing capabilities to harness the power of risk management automation, our easy-to-implement, code-based integrations can help you realize the full potential of your application’s AI data processing.

Find out more about how the Accusoft development team is incorporating machine learning into their processes or talk to one of our integration specialists today to learn how we can enhance your AI data processing application.

Printers, scanners, and other imaging devices have long been a source of headaches and frustration for developers and users alike. All too often, multiple software tools are required to connect an application to a device and acquire image files from it. This not only slows down workflows, but also creates opportunities for human error. Files can easily be misplaced or imported using the wrong parameters under these conditions.

Thanks to ImageGear’s TWAIN scanning support, however, developers can ensure that their application makes acquiring images from compatible devices both straightforward and mistake-free.

What Is TWAIN?

Developed in 1992 by a consortium of software developers and hardware manufacturers, TWAIN is a standard software protocol and API that facilitates communication between imaging devices and software applications running on a computer. The word itself refers to a famous line in the Rudyard Kipling poem “The Ballad of East and West” that reads “never the twain shall meet.” Although sometimes alleged to stand for “Technology Without An Interesting Name,” the term is not actually an acronym despite being capitalized.

The name is well chosen because the TWAIN standard helped to solve the enduring problem of getting imaging devices and computers to connect and send data between one another. Most commonly used for scanners and digital cameras, TWAIN made it possible to request an image file to be imported into an application without having to utilize additional software or input commands using the physical device.

Implementing TWAIN Scanning with ImageGear

As a versatile image processing SDK, ImageGear fully supports the TWAIN specification, which allows developers to build support for any TWAIN-capable device directly into their applications. In most instances, this will involve adding a “Scan” button or option somewhere in the platform’s interface so that users can quickly and easily instruct their scanner to capture an image and pass it along to the application’s storage or workflow. Developers can also use the integration to adjust device settings directly from their application, such as changing the scanning area, modifying brightness and contrast, or increasing/decreasing the resolution in dots per inch (DPI).
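
To make the resolution setting concrete: the pixel dimensions of a scan are the physical size in inches multiplied by the dots-per-inch value. The helper below is a plain arithmetic sketch and does not call any ImageGear API; the ScanMath name is made up for illustration.

```csharp
using System;

public static class ScanMath
{
    // Pixels along one edge = physical length in inches x dots per inch.
    public static int PixelsFor(double inches, int dpi)
    {
        if (inches <= 0) throw new ArgumentOutOfRangeException(nameof(inches));
        if (dpi <= 0) throw new ArgumentOutOfRangeException(nameof(dpi));
        return (int)Math.Round(inches * dpi);
    }

    // Uncompressed size in bytes for a given bit depth (e.g. 1, 8, or 24 bits
    // per pixel), ignoring any per-row padding a real file format may add.
    public static long UncompressedBytes(double widthInches, double heightInches, int dpi, int bitsPerPixel)
    {
        long pixels = (long)PixelsFor(widthInches, dpi) * PixelsFor(heightInches, dpi);
        return (pixels * bitsPerPixel + 7) / 8;
    }
}
```

For a US Letter page (8.5 x 11 inches) at 300 DPI, PixelsFor returns 2550 and 3300, so a 24-bit color scan is roughly 25 MB before compression. Doubling the DPI quadruples that figure, which is worth keeping in mind when choosing a default.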

ImageGear’s TWAIN scanning feature works with three external elements to facilitate image file transfers:

  • The Device: Usually a scanner or digital camera, this is the primary imaging source. The device must be compliant with the TWAIN protocol. This is typically indicated by the manufacturer.
  • Data Source: Although ImageGear’s TWAIN scanning features can connect an application to a scanner, the device still needs a software driver that allows it to communicate with the computer’s operating system.
  • Data Source Manager: The TWAIN manager software provides a universal mechanism for managing and using data sources from different device manufacturers. Developed by the TWAIN consortium, it can be downloaded for free and installed wherever the application is running.

(Both the device’s data source driver and the TWAIN data source manager should be included with the device’s installation software. They are not provided by the ImageGear SDK.)

Acquiring an Image Using TWAIN Scanning

ImageGear can configure an application to gather an image or set of images from a connected device with a few simple steps.

Step 1: Open the Data Source

Developers can set the application to automatically open a default Data Source. This is typically the best choice when only one scanner is available, as is often the case in a small workplace. They can also use the Data Source Manager to provide a list of all available Data Sources and let the user select the one they need.

Step 2: Adjust Settings

ImageGear’s TWAIN scanning features allow image acquisition parameters to be set through the application. Parameters such as page count and image size can be set to a common default, but developers can also give the option to obtain the various capabilities (listed as “ScanCaps”) and display them for users to select from. ImageGear supports a wide range of TWAIN-related capabilities.

Step 3: Acquire Image

After all settings are configured, the image can be scanned and loaded into an ImGearPage Class object. When acquiring a multi-page image, ImGearPages are loaded into an ImGearDocument Class object instead.

How ImageGear TWAIN Scanning Looks in Code

As an example, here’s what the C# code may look like when using ImageGear to help an application import an image from a TWAIN Data Source:

using System;
using ImageGear.Core;
using ImageGear.TWAIN;

public ImGearPage AcquireImage(IntPtr Handle)
{
    ImGearPage igPage = null;
    ImGearTWAIN igTWAIN = new ImGearTWAIN();

    igTWAIN.WindowHandle = Handle;
    igTWAIN.UseUI = true;

    try
    {
        // Open the data source selection dialog
        igTWAIN.OpenSource(String.Empty);

        // Start the scan and load the result into an ImGearPage
        igPage = igTWAIN.AcquireToPage();
    } 

    catch(ImGearException e)
    {
        // Handle the exception ...
    }

    finally
    {
        if (igTWAIN.DataSourceManagerOpen)
        {
            igTWAIN.CloseSource();
        }
    }

    return igPage;
}

Expand Your Application’s TWAIN Support with ImageGear

Accusoft’s ImageGear SDK provides comprehensive support for a broad range of TWAIN devices, which makes it easier than ever for developers to control the scanning process directly from their applications. Integrating TWAIN scanning can streamline workflows and significantly improve the software user experience by completely eliminating the need to turn to external programs for image acquisition. ImageGear is fully compatible with multiple generations of the TWAIN standard, including TWAIN v1.6, v1.7, v1.8, v1.9, and v2.4.

In addition to TWAIN scanning support, ImageGear provides powerful image and document processing capabilities that can transform your application workflows. With extensive file conversion and compression features, it’s the best way to quickly integrate content management features into your platform. To get a glimpse of what ImageGear can do for your .NET application, download a free trial today and start building.


PDFs are everywhere. Vice calls them “the world’s most important file format,” and that’s not far off the mark. The sheer number of documents converted to, from, and often back to PDFs is astounding. The hard truth? They’re also frustrating to work with. Start a Google search with the word “convert” and three of the top five results involve PDFs. 

While this portable document format lives up to its name by making it easy for users to attach and send documents across their organizations, PDFs often run into problems when it comes to conversion, collaboration, and communication. While many tools offer piecemeal PDF functionality, they lack a complete set of critical capabilities, in turn forcing software engineers to use multiple software solutions for seemingly simple tasks.

ImageGear offers a different take on the standard software development kit (SDK) designed to help developers maximize their PDF potential. Here’s how it works. 


The Value of PDF Conversion

While PDF conversion is one of the top sought-after functionalities, there’s another area that’s often overlooked: modifying the characteristics of PDFs on-screen. With companies now handling PDFs from multiple sources that may include everything from computer-generated form data to handwritten information and images, it’s no surprise that staff encounter a wide variety of viewing issues.

ImageGear PDF helps solve these problems by allowing users to call the shots on PDF content at scale with features such as:

  • Conversion
  • Metadata Management
  • Content and Font Editing
  • Text Extraction
  • PDF Watermarking
  • Container, Dictionary, and Layer Creation
  • 3D Asset Modification

ImageGear PDF also helps improve document processing with document cleanup and advanced optical character recognition (OCR). With the ability to encrypt and decrypt entire images (or part of an image), automatic ImageClean correction of white text blocks, borders, and inverted images, plus intelligent re-sizing, any PDF can be cleaned and made more readable for the user. 

OCR support for almost any document type is also a benefit. This includes documents produced on typewriters, dot-matrix printers, ink-jet printers, and laser printers, as well as photocopied, scanned, and faxed documents. ImageGear PDF helps users control and customize multiple PDF variables, making it a fully functional PDF conversion solution for your application.


PDF Pain Points

One of the biggest PDF frustrations? The inability to break apart and combine PDF documents. Let’s imagine you have a massive legal PDF or in-depth medical file. In these circumstances, professionals only need a portion of the PDF, but without the right tools they’re stuck sending entire files when all they need is a single page. In other cases, employees might have a host of related PDFs that are part of the same project, but can’t be easily combined to save space and time.

ImageGear PDF has you covered with the ability to easily delete or insert PDF pages, render pages in a single PDF, split a PDF, merge two or more PDFs into a single file, or even merge specific pages from two or more PDFs into a single PDF. This not only makes a massive difference in time spent working with PDF documents, but also helps reduce unnecessary storage and transmission of multiple files.
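
The page-level operations described above can be pictured with a toy model in which a document is nothing more than an ordered list of page labels. This is only a conceptual sketch: the PageOps class and its methods are hypothetical and are not part of the ImageGear PDF API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical toy model: a document is just an ordered list of page labels.
public static class PageOps
{
    // Split a document into two parts; the first part gets `firstPartCount` pages.
    public static (List<string> First, List<string> Second) Split(IReadOnlyList<string> doc, int firstPartCount)
    {
        if (firstPartCount < 0 || firstPartCount > doc.Count)
            throw new ArgumentOutOfRangeException(nameof(firstPartCount));
        return (doc.Take(firstPartCount).ToList(), doc.Skip(firstPartCount).ToList());
    }

    // Merge whole documents in the order given.
    public static List<string> Merge(params IReadOnlyList<string>[] docs)
        => docs.SelectMany(d => d).ToList();

    // Select specific pages (zero-based indexes) from one document.
    public static List<string> Pick(IReadOnlyList<string> doc, params int[] pageIndexes)
        => pageIndexes.Select(i => doc[i]).ToList();
}
```

In this model, Pick(contract, 0) extracts a single page to send on its own, and Merge(Pick(a, 0, 2), Pick(b, 1)) combines specific pages from two sources into one document.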


Convert PDF: Multiple File Formats for Conversion

Conversion is critical for PDF success. Instead of creating complexity by forcing end-users to stick with original file formats, implementing an SDK with cutting-edge conversion empowers corporate consistency and saves on storage space. ImageGear PDF supports a host of common file formats for conversion including Microsoft Office, JPEG 2000, CAD, and SVG.

Of course, no feature-forward PDF framework is complete without robust annotation, redaction, and commenting capabilities. These features make it easy for other users to see exactly what’s been changed, when, and why, along with providing a critical, auditable paper trail to meet evolving compliance and regulatory standards.


PDF Functionality for Your Application

Best of all, ImageGear isn’t designed to replace your current software, but to integrate alongside existing workflows. Rather than adding another application to already-overloaded IT arsenals, straightforward SDK integration means everything happens within your own application, making it easy for everyone to find exactly what they’re looking for within familiar territory. Need help jumpstarting your SDK deployment? Check out our full list of ImageGear .NET samples for ASP.NET, CAD, OCR support, and more.

PDFs remain eternally popular and continually frustrating. Solve for document viewing, split and merge, and conversion issues and streamline employee efforts with ImageGear.

OCR API Capabilities

The Accusoft engineering team is always exploring ways to improve PrizmDoc’s document processing capabilities. We regularly consult with our active customers to ensure that we’re focusing on features that will help them push the boundaries of innovation and deliver a better experience to end users.

That’s why we’re excited to talk about PrizmDoc’s new OCR API feature, which allows Independent Software Vendors (ISVs) to tap into the power of Accusoft’s industry-leading optical character recognition technology to enhance their application’s document processing capabilities.

Wait, What Is OCR Again?

Optical Character Recognition (OCR) is a technology that converts different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera, into editable and searchable data. At its core, OCR works by analyzing the graphical elements of a document and recognizing the patterns of characters or symbols present in it.

Initially, the OCR software segments the document into elements like lines or words and then further breaks them down into individual characters. Using machine learning and pattern recognition, it then matches these individual graphical components to their corresponding textual elements in a pre-defined character database. This process allows for the extraction of textual data from images, enabling digital storage and efficient searching, which facilitates streamlined management and utilization of information across various sectors.
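
The matching step can be illustrated with a deliberately tiny example. The sketch below treats each segmented glyph as a 3x3 bitmap and picks the closest entry in a three-character "database" by counting differing pixels. The TinyOcr class, its templates, and the bitmap encoding are all invented for illustration; production OCR engines rely on far richer features and models.

```csharp
using System;
using System.Collections.Generic;

// Toy illustration of pattern matching: glyphs are 3x3 bitmaps stored as
// 9-character strings, row by row ('1' = ink, '0' = background).
public static class TinyOcr
{
    // Hypothetical "character database" with three reference patterns.
    private static readonly Dictionary<char, string> Templates = new Dictionary<char, string>
    {
        ['I'] = "010" + "010" + "010",
        ['L'] = "100" + "100" + "111",
        ['T'] = "111" + "010" + "010",
    };

    // Number of pixels that differ between two equally sized bitmaps.
    private static int Distance(string a, string b)
    {
        int diff = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) diff++;
        }
        return diff;
    }

    // Return the template character closest to the segmented glyph.
    public static char Recognize(string glyph)
    {
        if (glyph == null || glyph.Length != 9)
            throw new ArgumentException("Expected a 3x3 bitmap as 9 characters.", nameof(glyph));

        char best = '?';
        int bestDistance = int.MaxValue;
        foreach (var entry in Templates)
        {
            int d = Distance(glyph, entry.Value);
            if (d < bestDistance)
            {
                bestDistance = d;
                best = entry.Key;
            }
        }
        return best;
    }
}
```

A glyph with one noisy pixel, such as "111010011", still resolves to 'T' because it differs from the T template in a single position and from the other templates in several.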

Benefits of PrizmDoc’s OCR API

Building OCR features into an application is a time-consuming and expensive process. The technology behind OCR is not only quite sophisticated, but it also requires access to complex and evolving language libraries that allow it to identify text accurately. Obtaining the licenses for these libraries, incorporating them into a new OCR solution, and keeping them updated can be a challenge for developers who are unfamiliar with OCR processing.

With PrizmDoc’s OCR API, ISVs can easily incorporate OCR capabilities into their applications with a simple API call. We’re constantly updating our OCR features to add new languages and forms of character recognition, all of which can be rolled directly into software applications as part of the PrizmDoc API integration.

What Makes Accusoft’s OCR Different?

Accusoft has long been an innovator in processing solutions that incorporate OCR technology. Where many solutions offer only full-page recognition, our OCR products support zonal field recognition, which allows applications to focus on predefined form field types to extract key data like names, dates, emails, and identification numbers.

Zonal OCR significantly increases processing speed, allowing applications to extract data from documents more quickly. It also enhances accuracy since the OCR engine is only reading specific areas of the page instead of scanning the entire page.
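
A quick back-of-the-envelope calculation shows where the speed gain comes from. The sketch below defines a hypothetical Zone type (a named rectangle in pixel coordinates) and computes what fraction of the page the engine has to read when only the zones are processed. The names and the non-overlapping assumption are illustrative, not taken from Accusoft's API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical zone definition: a named rectangle in pixel coordinates.
public readonly struct Zone
{
    public string Name { get; }
    public int X { get; }
    public int Y { get; }
    public int Width { get; }
    public int Height { get; }

    public Zone(string name, int x, int y, int width, int height)
    {
        Name = name; X = x; Y = y; Width = width; Height = height;
    }

    public long Area => (long)Width * Height;
}

public static class ZonalOcrMath
{
    // Fraction of the page the engine has to read when only the zones are processed.
    // Assumes the zones do not overlap and lie within the page.
    public static double FractionOfPageRead(int pageWidth, int pageHeight, IEnumerable<Zone> zones)
    {
        long pageArea = (long)pageWidth * pageHeight;
        if (pageArea <= 0) throw new ArgumentOutOfRangeException(nameof(pageWidth));
        return zones.Sum(z => z.Area) / (double)pageArea;
    }
}
```

On a 2550 x 3300 pixel page with three fields measuring 600 x 100, 300 x 100, and 400 x 100 pixels, the zones cover roughly 1.5 percent of the page, so the engine examines only a small slice of the pixels a full-page pass would.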

Of course, if your application needs to OCR an entire page or document, our OCR technology is more than capable of doing so quickly and accurately. We support multiple Western and Eastern languages, including Central European, Cyrillic, Baltic, and Asian characters. You can even set confidence levels for recognition results to incorporate manual reviews into your document process.

Industry Applications of OCR Technology

Fintech Applications

By integrating OCR technology into Fintech applications, financial institutions can automate the extraction of data from physical or digital documents, such as invoices, contracts, and bank statements, eliminating manual entry and reducing errors. This not only saves time but also enhances accuracy and efficiency, facilitating quicker decision-making processes. It can also aid in compliance and auditing tasks by easily retrieving information from a vast array of documents. By incorporating OCR APIs, Fintech applications can significantly enhance the finance industry’s service quality, fostering a more data-driven and customer-centric approach.

Legaltech

When OCR technology is integrated into a Legaltech application, lawyers, paralegals, and other professionals can use it to swiftly convert scanned documents, agreements, and legal briefs into searchable text. This can significantly expedite research and case preparation, allowing legal practitioners to efficiently sift through large volumes of text to locate pertinent information. It also enables the creation of digital databases that can be easily navigated and organized, enhancing the retrieval of case-related documents and fostering a more streamlined approach to legal work, thereby saving time and resources.

Insurtech

For ISVs building solutions to support insurance companies, an OCR API can serve as a pivotal tool in modernizing and streamlining the processing of numerous document types, including claims, policies, and supporting paperwork. It facilitates the quick conversion of scanned documents and images to searchable text formats, which can automate data extraction and reduce manual handling, minimizing the risk of errors and expediting claim processing times. By automating a significant portion of administrative tasks, insurance companies can focus more on developing customer-centric strategies and solutions, fostering greater efficiency and effectiveness within the industry.

Govtech

Governments handle a vast array of documents – from forms and applications to historical records. By implementing OCR technology into a Govtech application, governmental agencies can automate the data extraction process, thereby drastically reducing manual labor and minimizing errors. This makes the archival and retrieval of documents more efficient, fostering transparency and ease of access to public records. Furthermore, OCR can aid in analyzing data from various documents to formulate better policies and decisions based on historical and current data trends. Ultimately, integrating an OCR API can pave the way for more streamlined, cost-effective, and citizen-friendly governmental operations, promoting inclusivity and digital literacy.

Expand Your Application’s Potential with PrizmDoc OCR API

Incorporating advanced OCR capabilities into your application is easier than ever with the release of PrizmDoc’s OCR API feature. To learn more about how you can quickly add full-page and zonal character recognition that supports multiple languages, talk to one of our PrizmDoc experts today.

Automated data capture tools are an essential feature of today’s business applications. Without the ability to quickly extract information from incoming forms and documents, organizations will struggle to keep their records, databases, and customer-facing software up-to-date. While software SDKs like Accusoft’s SmartZone can deliver powerful optical character recognition (OCR) and intelligent character recognition (ICR) to help applications accurately capture the information they need, these tools were not designed to operate in isolation. To get the best performance out of them, they need to be incorporated into a comprehensive and well-designed forms processing workflow.

Building an Efficient and Effective Forms Processing Workflow

Although data capture is often the primary objective of forms processing, a number of elements need to be in place for an application to be able to deploy SmartZone’s powerful OCR/ICR functionality. The first step involves the creation of form templates that can be used both for identifying incoming scanned forms and for defining field regions on the page from which data can be extracted. Building this library of templates provides a road map of sorts for the recognition process.

After form images are acquired, either from pre-existing digital documents or newly scanned images, they may need to be enhanced or cleaned up to ensure the best recognition results. Operations such as binarization, despeckling, deskewing, and line removal can all improve the data capture process, especially in the case of scanned documents. Older documents frequently include a great deal of image noise when scanned into digital format, which can make it difficult for an OCR/ICR engine to properly segment and read characters cleanly.
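
Two of these operations are simple enough to sketch in a few lines. The example below shows a fixed-threshold binarization and a very basic despeckle pass on an in-memory grayscale array. It is a minimal illustration under simplifying assumptions (a global threshold and single-pixel specks only); the Cleanup class is hypothetical and is not how any Accusoft product implements these steps.

```csharp
using System;

public static class Cleanup
{
    // Fixed-threshold binarization: pixels darker than the threshold become ink (true).
    // Input is an 8-bit grayscale image, where 0 is black and 255 is white.
    public static bool[,] Binarize(byte[,] gray, byte threshold = 128)
    {
        int rows = gray.GetLength(0);
        int cols = gray.GetLength(1);
        var ink = new bool[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                ink[r, c] = gray[r, c] < threshold;
            }
        }
        return ink;
    }

    // Very simple despeckle: clear ink pixels that have no ink in their 8-neighborhood.
    public static bool[,] Despeckle(bool[,] ink)
    {
        int rows = ink.GetLength(0);
        int cols = ink.GetLength(1);
        var result = (bool[,])ink.Clone();
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                if (ink[r, c] && !HasInkNeighbor(ink, r, c)) result[r, c] = false;
            }
        }
        return result;
    }

    private static bool HasInkNeighbor(bool[,] ink, int r, int c)
    {
        for (int dr = -1; dr <= 1; dr++)
        {
            for (int dc = -1; dc <= 1; dc++)
            {
                if (dr == 0 && dc == 0) continue;
                int nr = r + dr, nc = c + dc;
                if (nr >= 0 && nr < ink.GetLength(0) && nc >= 0 && nc < ink.GetLength(1) && ink[nr, nc])
                    return true;
            }
        }
        return false;
    }
}
```

Running Despeckle on the output of Binarize removes isolated dark pixels while leaving connected strokes intact, which is the kind of noise reduction that helps a recognition engine segment characters cleanly.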

Once a form image has undergone enhancement, it can be matched and aligned with the correct template to ensure that the SmartZone recognition engine will be able to obtain a clean field clip. Scanned images can be overlaid via an alignment algorithm that performs minor adjustments to match them exactly with the correct template. This step is crucial because the data capture process is set up to read the field areas identified by the template rather than recalibrating for each form. If the alignment is off, the engine will not get a clean read of the characters, which could result in inaccurate recognition results.
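
As a simplified picture of what such an adjustment involves, the sketch below estimates a translation between matching anchor marks on the template and on the scanned form, then maps a template coordinate onto the scan. It handles only a shift; real alignment typically also corrects rotation and scale. The FormAlignment class and the anchor-point approach are assumptions for illustration, not the algorithm used by Accusoft's products.

```csharp
using System;

// Hypothetical registration sketch: translate template coordinates by the offset
// measured between matching anchor marks on the template and the scanned form.
public static class FormAlignment
{
    // Average translation (dx, dy) between matching anchor points.
    public static (double Dx, double Dy) EstimateOffset((double X, double Y)[] templateAnchors,
                                                        (double X, double Y)[] scannedAnchors)
    {
        if (templateAnchors.Length == 0 || templateAnchors.Length != scannedAnchors.Length)
            throw new ArgumentException("Anchor lists must be non-empty and the same length.");

        double dx = 0, dy = 0;
        for (int i = 0; i < templateAnchors.Length; i++)
        {
            dx += scannedAnchors[i].X - templateAnchors[i].X;
            dy += scannedAnchors[i].Y - templateAnchors[i].Y;
        }
        return (dx / templateAnchors.Length, dy / templateAnchors.Length);
    }

    // Where a point defined on the template lands on the scanned image.
    public static (double X, double Y) ToScanned((double X, double Y) templatePoint, (double Dx, double Dy) offset)
        => (templatePoint.X + offset.Dx, templatePoint.Y + offset.Dy);
}
```

If the anchors at (100, 100) and (2400, 100) on the template are found at (104, 97) and (2404, 97) on the scan, the estimated offset is (4, -3), and a field corner defined at (500, 800) should be read at (504, 797).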

After the form is identified and aligned, additional enhancement and cleanup operations can be performed on the specific areas of the form that contain information to be extracted. This typically means individual field areas where text or other characters have been entered. The locations to be cleaned up can be designated during the template creation process when data extraction zones are defined. In some instances, a processing workflow may skip the initial full-page enhancement and instead only perform clean-up on areas where data capture will be carried out. This approach is often more efficient from a processing standpoint, especially when targeted, zonal recognition is being applied.

Form image dropout can also be performed at this stage, which involves the removal of image content like signature lines, text field boxes, comb lines, or other extraneous guiding content. Here again, proper form alignment is crucial. If the form is slightly “off” from the template, valuable character content could be removed, making accurate recognition much more difficult. Good form dropout tools should also be able to reconstruct characters that lose pixel data during the dropout process, which is common for characters that have an element that overlaps form lines (such as the lower half of a “j” or a “y,” which might otherwise be read as an “i” or a “v” if not repaired prior to recognition).

SmartZone’s Role in the Recognition Phase of Application Workflows

After a form is acquired, enhanced, identified, and aligned, it can be passed along to the next stage of the workflow for text recognition using SmartZone OCR/ICR. There are a few options that can be selected at this point to improve recognition accuracy and speed up data capture.

1. Select Character Sets

SmartZone supports a wide variety of languages and alphanumeric character sets. Realistically, only a few of these sets will need to be used at any one time. Selecting only the sets needed for a particular form will improve recognition accuracy and speed. For instance, there’s no need to have support for Cyrillic-script languages (like Russian) enabled if all of the forms being processed are in English.

2. Designate Field Types

SmartZone can designate the expected format of text found in specific fields on a template. Rather than reading each field out of context and extracting the contents without knowing whether or not it’s been filled in correctly, field types can be set to values such as date, email, currency, phone number, or Social Security Number. Regular expressions can also be established for more customizable results. If the character content of the field doesn’t match the designated field type, SmartZone will immediately return an exception and move on rather than trying to recognize and extract the incorrect data. Setting this parameter can greatly improve both accuracy and speed.
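
The idea of checking recognized text against an expected format can be sketched with ordinary .NET regular expressions. The FieldTypes class, the type names, and the patterns below are simplified assumptions for illustration; they are not SmartZone's built-in field type definitions or its API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical field-type check applied to recognized text. The patterns are
// simplified examples (US-style formats), not SmartZone's built-in definitions.
public static class FieldTypes
{
    private static readonly Dictionary<string, Regex> Patterns = new Dictionary<string, Regex>
    {
        ["date"]  = new Regex(@"^\d{2}/\d{2}/\d{4}$"),
        ["email"] = new Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
        ["ssn"]   = new Regex(@"^\d{3}-\d{2}-\d{4}$"),
        ["phone"] = new Regex(@"^\(\d{3}\) \d{3}-\d{4}$"),
    };

    // True when the text matches the expected format for the field type.
    public static bool Matches(string fieldType, string text)
    {
        if (!Patterns.TryGetValue(fieldType, out var pattern))
            throw new ArgumentException($"Unknown field type: {fieldType}", nameof(fieldType));
        return pattern.IsMatch(text ?? string.Empty);
    }
}
```

With these patterns, Matches("date", "07/04/2023") succeeds while Matches("date", "July 4") does not, so a workflow could flag the second value as an exception instead of storing it.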

3. Set Minimum Character Confidence

Every character SmartZone reads is assigned a confidence value, which reflects the OCR/ICR engine’s assessment of its recognition accuracy. A lower value means that there is a higher likelihood that a character was incorrectly identified. Setting a minimum character confidence value ensures that any character result below that value will be rejected and replaced with a designated rejection character. In practice, this control is used to determine which characters require a manual review following recognition. Setting a high confidence value will ensure higher recognition accuracy, but will likely lead to more exceptions that need to be reviewed by a human.
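
The effect of a minimum confidence setting can be sketched as a small post-processing step. In the example below, the result shape (a character paired with a confidence value), the 0 to 100 scale, and the '~' rejection character are all assumptions made for illustration rather than SmartZone's actual types or defaults.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical post-processing step: the result type and the 0-100 confidence
// scale are assumptions for illustration.
public static class ConfidenceFilter
{
    public static (string Text, int Rejected) Apply(
        IEnumerable<(char Character, int Confidence)> results,
        int minimumConfidence,
        char rejectionCharacter = '~')
    {
        var text = new StringBuilder();
        int rejected = 0;
        foreach (var (character, confidence) in results)
        {
            if (confidence < minimumConfidence)
            {
                text.Append(rejectionCharacter);  // flag this position for manual review
                rejected++;
            }
            else
            {
                text.Append(character);
            }
        }
        return (text.ToString(), rejected);
    }
}
```

Given the results ('J', 98), ('0', 62), ('h', 91), ('n', 95) and a minimum of 80, the output text is "J~hn" with one rejected character. Raising the minimum to 96 would reject three of the four characters, which illustrates the trade-off between accuracy and review workload.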

SmartZone Recognition Results

After character recognition is performed, results can be returned at the character, text line, or text block level. This data can then be passed along to the next stage of a business workflow or used to populate databases connected to the application. Operation instructions, identification, and defined image areas can be transferred to other components for additional forms processing or stored in memory for later access using SmartZone’s Read From Stream or Write From Stream functions.

Getting Started with SmartZone

With support for both OCR and ICR data capture, Accusoft’s SmartZone SDK can serve a vital role in high-performance forms processing applications. The powerful OCR engine can recognize multiple languages, including select Asian, African, and Indian characters. Capable of performing full page or zonal text extraction, SmartZone also includes a variety of customization features that can improve accuracy and recognition speed. Learn more about this versatile SDK’s features and use cases in our product fact sheet.

Gain Peace of Mind with a GDPR-Compliant Document Viewing Tool for Secure Collaboration

These days there is a heightened awareness of the risk of a data breach or cyber attack. Whether the spike in attention came from a global pandemic, the brink of international war, or an unknown hacker who set their sights on Elon Musk, there is a general consensus that our personal data is at risk at any point through a breach of security. This is even more pressing for companies, as the cost of such insecurity could potentially put them out of business. According to an IBM Data Breach Report, 2021 had the highest average data breach cost in the report’s 17-year history, at $4.24M. Securing data and maintaining an individual’s privacy is a priority for many organizations throughout the world, but following a strict standard has only been attempted by the European Union (EU) thus far.

The EU has gone a step further than just suggesting that companies and organizations increase data protection. Since 2018, it has mandated and enforced specific requirements through the General Data Protection Regulation (GDPR). Given the broad parameters covered under GDPR compliance, ensuring the standards are met can become a time-consuming, stressful, and ongoing issue if not resourced properly. The requirements go beyond protecting personal data: organizations must also be able to prove that security measures are in place.

Who Needs to Maintain GDPR Compliance?

While often associated only with the European Union, the requirements of the GDPR extend to all “entities who are offering goods or services to anyone residing in the EU (even if those services are provided free of cost). Any global business either has to become compliant for all of its users/customers or be able to accurately identify EU residents and enable compliant systems to handle only that subset of the customer base.”

GDPR requires companies to know the following as related to personal data:

  • What personal data is being shared 
  • Where it is being shared 
  • How it can be deleted at a moment’s notice if necessary

The GDPR also strongly encourages organizations to designate an employee as the point of contact who oversees the data security processes and systems needed to maintain compliance. A first step toward an effective process is choosing the right tools, with security features that protect the data being shared within the company.

Managing Risk through Secure Document Viewing

As risk management becomes an essential part of strategic planning, IT security and data encryption rise to the top of the priority list for most companies. GDPR suggests encryption as a means to manage risk in file sharing but does not give explicit instructions. With PrizmDoc™ Viewer, companies gain added data security, aligned with GDPR compliance, in document viewing and sharing without heavy client-side installations or downloads.

PrizmDoc™ Viewer is created with Multi-Level Data Protection including:

  • 256-bit AES encryption
    • The Advanced Encryption Standard (AES) is an international standard for encrypting and decrypting data. Used with a 256-bit key, it provides a high level of security and has been adopted by the U.S. government and other intelligence organizations across the world.
  • Configurable user permissions add a strong measure of privacy and protection to document content.

A Simple Path to Secure Document Sharing

Remote work or not – collaborating on a project today means sharing documents among many colleagues to finalize a document, project, or presentation. To do that with security in mind, organizations are cobbling together tech stacks to meet their productivity needs along the way, and several different file types can come across their desks in a single day.  

PrizmDoc™ Viewer integrates into your current application to render and display a multitude of file types with high fidelity and speed.  The ease of use features include:

  • Flexible use across many platforms
  • A self-hosted version that resides on any organization’s servers
  • Responsive file viewing that developers can provide to their users
  • Search and redaction that can be easily turned on or off

PrizmDoc Viewer is also designed to run on all devices with a zero-footprint viewer that makes it easy for employees to work where and how they wish. The white label services give an organization the flexibility to brand and customize while gaining peace of mind in data security.

Open and View an Image Securely the First Time

Documents come in a range of formats, including Word files, PDFs, and spreadsheets, but images are often the bigger culprit when it comes to viewing difficulties, let alone downloading, editing, marking up, or saving information as a separate file. Workers often find themselves downloading a media player just to open an image. Having multiple solutions in place is not only confusing, it also contributes to inefficiency and human error, which makes it harder to keep images secure.

Because photographs can constitute personal data under the GDPR, organizations must be able to quickly and easily remove all images in which an individual can be identified.

With ImageGear, an organization is able to add powerful image processing capabilities that enhance secure collaboration such as:

  • PDF manipulation, including managing access with digital signatures for added security
  • An image processing library that offers developers a set of methods for modifying an image, including resize, crop, merge, rotate, and flip
  • An option to add OCR for document search and data capture support

Getting Started 

To quickly gain peace of mind with secure collaboration, contact us today.

OCR segmentation

Today’s high-speed forms processing workflows depend on accurate character recognition to capture data from document images. Rather than manually reviewing forms and entering data by hand, optical character recognition (OCR) and intelligent character recognition (ICR) allow developers to automate the data capture process while also cutting down on human error. Thanks to OCR segmentation, these tools are able to read a wide range of character types to keep forms workflows moving efficiently.

Recognizing Fonts

Deploying OCR to capture data is a complex undertaking due to the immense diversity of fonts in use. Modern character recognition software focuses on identifying the pixel patterns associated with specific characters rather than matching characters to existing libraries. This gives the software the flexibility needed to discern multiple font types, but problems can still arise due to spacing issues that make it difficult to tell where one character ends and another begins.

Fonts generally come in one of two forms that impact how much space each character occupies. “Fixed” or “monospaced” fonts are uniformly spaced so that every character takes up the exact same amount of space on the page. While not quite as popular now in the era of word processing software and digital printing, fixed fonts were once the standard form of typeface due to the technical limitations of printing presses and typewriters. On a traditional typewriter, for example, characters were evenly spaced because each typebar (or striker) was a standardized size.

From an OCR standpoint, fixed fonts are easier to read because they can be neatly segmented. Each segmented character is the same size, no matter what letters, numbers, or symbols are used. In the example below, the amount of space occupied by the characters is determined by the number of characters used, not the shape of the characters themselves. This makes it easy to break the text down into a segmented grid for accurate recognition.

OCR segmentation:  Monospace Font Example
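For readers who want to see why fixed fonts are so easy to segment, here is a minimal Python sketch of a uniform segmentation grid. It is an illustration only, not SmartZone’s internal method; the function name `fixed_grid_cells` and the pixel values are assumptions made for the example.

```python
def fixed_grid_cells(line_width, char_count):
    """Return (left, right) pixel bounds for each cell of a monospaced text line.

    Every character gets the same width, so the grid depends only on the
    width of the line and the number of characters, never on their shapes.
    """
    if char_count <= 0:
        raise ValueError("char_count must be positive")
    cell_width = line_width / char_count
    return [
        (round(i * cell_width), round((i + 1) * cell_width))
        for i in range(char_count)
    ]


# A 120-pixel-wide line holding 10 monospaced characters.
cells = fixed_grid_cells(120, 10)
print(cells[:3])  # [(0, 12), (12, 24), (24, 36)]
```

Because each cell is the same width, the whole line can be cut up without looking at the pixels at all.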

“Proportional” fonts, however, are not uniformly spaced. The amount of space taken up by each character is determined by the shape of the character itself. So while a w takes up the same space as an i in a fixed font, it takes up much more space in a proportional font.

OCR segmentation:  Fixed versus proportional font

The inherent characteristics of proportional fonts make them more difficult to segment cleanly. Since each character occupies a variable amount of space, each segmentation box needs to be a different shape. In the example below, applying a standardized segmentation grid to the text would fail to cleanly separate individual characters, even though both lines feature the exact same character count.

Proportional Font Example
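One general technique for segmenting proportional text is a vertical projection profile: count the ink in each pixel column and cut wherever a column is empty. The Python sketch below illustrates that idea on a tiny hand-made bitmap. It is not a description of how SmartZone works internally, and the `segment_by_blank_columns` name and the `#`/`.` image encoding are assumptions made for the example.

```python
def segment_by_blank_columns(rows):
    """Split a binary text-line image into character segments.

    `rows` is a list of equal-length strings where '#' marks an ink pixel.
    A segment is a run of consecutive columns that each contain some ink.
    Returns a list of (left, right) column bounds, with `right` exclusive.
    """
    width = len(rows[0])
    ink_per_column = [sum(row[x] == "#" for row in rows) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink > 0 and start is None:
            start = x                      # a character begins here
        elif ink == 0 and start is not None:
            segments.append((start, x))    # a blank column ends the character
            start = None
    if start is not None:
        segments.append((start, width))    # the last character touches the edge
    return segments


# A crude "w" (5 columns wide) followed by an "i" (1 column wide).
line = [
    "#...#.#",
    "#.#.#.#",
    ".#.#..#",
]
print(segment_by_blank_columns(line))  # [(0, 5), (6, 7)]
```

The two segments come out at different widths, which is exactly what a uniform grid cannot express. This approach depends on at least one empty column between characters, so it breaks down when characters touch or overlap.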

Yet another font challenge comes from “kerning,” which reduces the space between certain characters to allow them to overlap. Frequently used in printing, kerning makes for an aesthetically pleasing font, but it can create serious headaches for OCR data capture because many characters don’t separate cleanly. In the example below, small portions of the W and the A overlap, which could create confusion for an OCR engine as it analyzes pixel data. While the overlap is very slight in this example, many fonts feature far more extreme kerning.

Example of Kerning

In order to get a clean reading of printed text for more accurate recognition results, OCR engines like the one built into Accusoft’s SmartZone SDK utilize segmentation to take an image and split it into several smaller images before applying recognition. This allows the engine to isolate characters from one another to get a clean reading without any stray pixels that could impact recognition results.

Much of this process is handled automatically by the software. SmartZone, for instance, has OCR segmentation settings and properties that are handled internally based on the image at hand. In some cases, however, those controls may need to be adjusted manually to ensure the highest level of accuracy. If a specific font routinely returns failed or low-confidence recognition results, it may be necessary to use the OCR segmentation properties to adjust for font characteristics like spaces, overlaps (kerning), and blob size (which determines which pixels are classified as noise).
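To illustrate what a blob size setting does in principle, the Python sketch below removes any connected group of ink pixels smaller than a minimum size. It is a generic noise filter written for this article, not SmartZone’s implementation; the `remove_small_blobs` function, the `min_blob_size` parameter, and the `#`/`.` image encoding are all hypothetical.

```python
def remove_small_blobs(rows, min_blob_size):
    """Erase connected ink regions ("blobs") with fewer than min_blob_size pixels.

    `rows` is a list of equal-length strings where '#' marks an ink pixel.
    Pixels are treated as connected if they touch, including diagonally.
    """
    height, width = len(rows), len(rows[0])
    grid = [list(row) for row in rows]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if grid[y][x] != "#" or seen[y][x]:
                continue
            # Flood fill to collect every pixel in this blob.
            stack = [(y, x)]
            seen[y][x] = True
            blob = []
            while stack:
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and grid[ny][nx] == "#" and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            # Blobs below the size limit are classified as noise and erased.
            if len(blob) < min_blob_size:
                for by, bx in blob:
                    grid[by][bx] = "."
    return ["".join(row) for row in grid]


image = [
    "##....",
    "##..#.",
    "##....",
]
print(remove_small_blobs(image, 2))  # ['##....', '##....', '##....']
```

Setting the limit too high would start erasing legitimate marks such as periods or the dot of an i, which is one reason a setting like this may need tuning for a particular font.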

Applying ICR Segmentation

All of the challenges associated with cleanly segmenting printed text are magnified when it comes to hand-printed text. Characters are rarely spaced or even shaped consistently, especially when they’re drawn without the guidance of comb lines that provide clear separation for the person completing a form.

Since ICR engines read characters as individual glyphs, they can become confused if overlapping characters are interpreted as a single glyph. In the example below, there is a slight overlap between the A and the c, while the cross elements of the f and t are merged to form the impression of a single character.

ICR Segmentation Properties

SmartZone’s ICR segmentation properties can be used to pull apart overlapping characters and split merged characters for more accurate recognition results. This is also important for maintaining a consistent character count. If the ICR engine isn’t accounting for overlapped and merged characters, it could return fewer character results than are actually present in the image.
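The character count problem can be shown with a simple heuristic: if a segment is much wider than a typical character, assume it contains several merged characters and divide it evenly. The Python sketch below is a deliberately naive illustration and makes no claim about how SmartZone splits characters; `split_merged_segments`, `expected_width`, and `tolerance` are hypothetical names.

```python
def split_merged_segments(segments, expected_width, tolerance=1.5):
    """Split segments that look too wide to be a single character.

    `segments` is a list of (left, right) column bounds. Any segment wider
    than expected_width * tolerance is divided into equal parts, one per
    character it is estimated to contain.
    """
    result = []
    for left, right in segments:
        width = right - left
        if width > expected_width * tolerance:
            parts = max(2, round(width / expected_width))
            step = width / parts
            for i in range(parts):
                result.append((left + round(i * step),
                               left + round((i + 1) * step)))
        else:
            result.append((left, right))
    return result


# Three segments were found, but the middle one is two merged characters.
found = [(0, 10), (12, 32), (34, 44)]
print(split_merged_segments(found, expected_width=10))
# [(0, 10), (12, 22), (22, 32), (34, 44)]
```

Without the split, the engine would report three characters where four were written. An even split is only a rough guess for handwriting, where character widths vary widely.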

Enhance Your Data Forms Capture with SmartZone

Accusoft’s SmartZone SDK supports both zonal and full page OCR/ICR for forms processing workflows to quickly and accurately capture information from document images. When incorporated into a forms workflow and integrated with identification and alignment tools like the ones found in FormSuite, users can streamline data capture and processing by extracting text and routing it to the appropriate databases or application tools. SmartZone’s OCR supports 77 distinct languages from around the world, including a variety of Asian and Cyrillic characters. For a hands-on look at how SmartZone can enhance your data capture workflow, download a free trial today.

 

digital banking

Banks are in no rush to bring workers back. While some had early plans to restart in-office work, the Wall Street Journal notes that even as Manhattan rushed to restart its physical financial framework, few staff have made the move. Meanwhile, financial firms like JP Morgan are putting return to work strategies on hold indefinitely as pandemic priorities evolve. 

The result is a realization that to generate revenue, firms must embrace digital banking initiatives, even though no remote work roadmap exists. This transition means going beyond simply sending staff home. It means creating a financial framework that addresses key challenges, acknowledges current trends, and embraces the next, new normal of digital banking transformation.

Digital Banking Challenges: The Stay-at-Home Shift

As noted by The Financial Brand, the COVID-19 pandemic has accelerated the urgency for digital banking transformation. But it’s one thing to recognize the gap between current outcomes and new expectations — it’s something else to apply solutions at scale.

Here, it’s critical for banks to avoid the knee-jerk reactions that often come with operational urgency and instead start with a focus on what’s working, what isn’t, and what needs to change. The old “if it’s not broken, don’t fix it” adage applies here; spending on solutions that don’t solve specific problems will only widen the gap between pandemic problems and corporate performance.

To embrace the stay-at-home shift, banks must consider three key challenges:

  • Communication: Nearly 70 percent of professionals say that the current pandemic has been the most stressful time of their career. Not only are staff worried about potential health problems, but they’re also concerned with juggling jobs and families simultaneously with little assurance of security. As a result, communication is critical. For banks, this includes regular team check-ins and staff meetings but also one-on-one conversations that aren’t about performance or productivity but instead prioritize mental health.
  • Collaboration: While new video conferencing tools have empowered virtual face-to-face communication, they don’t always deliver workflow collaboration. Teams now need technology that empowers them to work together on loan processing, credit applications, and investment analysis at scale.
  • Completion: Many tasks are left in limbo due to paper processes. A form could be sitting on someone’s desk or in their email inbox for weeks before processing takes place. As a result, applications get stalled and consumers have to wait. Banks need workflow automation tools that ensure critical tasks aren’t waiting for completion.

Digital Banking Trends: Mid-Pandemic Priorities

As firms respond to evolving client, stakeholder, and even regulatory expectations, it’s critical for firms to realize where digital banking trends are headed and what that means for their bottom line. As noted by Finextra, this starts with the digital banking experience. Research from McKinsey shows that customers who are satisfied with their current digital experience are 2.5 times more likely to open new accounts with their existing bank. This makes digital experience the new banking battlefield. If firms can meet (or exceed) consumer expectations around ease-of-use and data security, they can set the pace of pandemic performance rather than falling behind.

Banks must also embrace moving away from service-based applications to those that actively drive engagement. While transactional apps — such as those that allow customers to check their bank balance or perform simple payments and transfers — are now par for the course, clients who don’t feel comfortable visiting branches in person are now looking for customized and personalized digital banking experiences. This includes everything from the ability to easily connect with financial advisors to comprehensive investing and saving advice based on both historical data and likely outcomes.

For financial firms, tackling new trends requires the right IT framework. This means building out existing infrastructure to support everything from increased informational throughput to in-depth data analysis. In a world where digital client satisfaction can make-or-break financial futures, pre-pandemic platforms simply aren’t enough.

Digital Banking Transformation: The Next, New Normal

With return-to-office plans in limbo, some banks are now taking the next logical step and offering permanent work-from-home options, but as noted by Forbes, there’s a problem. Most banks still aren’t doing enough to embrace digital transformation at scale. 

When asked, 79 percent of business leaders defined digital transformation as the “integration of digital technologies into all areas, fundamentally changing how to operate and deliver value, and a culture change that continually challenges the status quo and gets comfortable with failure.” But despite the widespread impact of current COVID concerns, many banks remain on a digital path that prioritizes incremental change, not complete transformation. For banks backed by legacy tools and aging apps, simply adding small services to existing stacks won’t be enough to support the next, new normal of stay-at-home staffing.

To drive meaningful, substantive change across organizational operations, banks must prioritize three transformative functions:

  • Document Management: Firms are suddenly dealing with a deluge of document formats and file types that must be handled by geographically disparate staff. Time spent searching for conversion, annotation, redaction, and editing tools is wasted time. Agile, adaptable document management tools that deliver end-to-end capabilities are now critical.
  • Solution Security: Banks must comply with regulations that mandate consumer data security and process compliance. FinTech applications must provide secure ways for departments to collaborate on sensitive documents while abiding by industry regulations. By integrating a document viewer inside the application itself, financial institutions are able to programmatically restrict downloading of sensitive documents.
  • Trackable Collaboration: Staff need the ability to quickly locate and remedy process problems. This is especially critical as the volume of digital documents ramps up over time. Bank employees must be able to find, fix, and finish tasks efficiently.

A New, Flexible Roadmap for Digital Banking

While there’s no perfect roadmap for digital banking transformation in the age of COVID-19, the first step is obvious: embrace the realities of work-from-home. Many banks are distracted with incremental change and stuck in pre-pandemic thought processes, hoping the pandemic will end and things will go back to normal. As with every major world event, the world is going to be different after COVID.

Banks must prepare for this change and embrace true evolution. Banks must start by articulating the challenges of remote work, acknowledging the evolving expectations of mid-pandemic trends, and addressing the need for transformative technological change.