Technical FAQs

Question

Is there a way to modify the colors used in PrizmDoc Viewer’s search results? In particular, the currently selected search result has a very similar color to the other results from the same term. Is there a way to increase the contrast?

Answer

Currently, we don’t support the ability to set anything other than the highlight color through the search API.

There is a feature request to enable such modifications:

https://ideas.accusoft.com/ideas/PDV-I-531

One of the more challenging aspects of developing SDKs with machine learning models is deployment and productionization. TensorFlow in particular can be difficult to set up, and requires GPUs to evaluate large models. This post will share my experiences in skirting this process entirely to quickly evaluate a FasterRCNN-based model during a hackathon last year, usable on any office or CI machine.

During this hackathon, I implemented and trained a model from a paper from ICDAR 2017 on one of our physical machine learning-equipped machines. To achieve quick deliverables, rather than try to get the trained model and data off the machine, I simply used a tool called Luminoth running on the machine to expose the model’s prediction functionality. This also allowed anybody on my team to continue developing the model afterward with minimal friction, and required only a small networking shim in our codebase.

Luminoth is a Python-based tool that I like to refer to as “a command line wrapper around TensorFlow.” While the use of a YAML file to quickly set up and train some popular networks such as FasterRCNN is its main use, it also exposes a Flask-based server which allows prediction queries via a web page. As it turns out, it also exposes an (undocumented) API which is usable programmatically.

My codebase is in C++ with a C# assembly wrapping it. That being the case, I had to get my model’s predictions (a number of bounding boxes) into C++ code, and fast. Figuring out TensorFlow’s shaky C++ API (or even using Python-based TensorFlow) wasn’t an option. The model was already trained on our machine-learning computer, and would have required a large setup cost and data duplication by anyone else evaluating the model. I had my eye on a particular C++ networking library, CPR, that I have been meaning to use; so I thought, why not tackle all of these problems at once?

Let’s start by figuring out Luminoth’s API from the source and web page itself.

First, using Luminoth’s server as per the documentation shows requests being made to an endpoint named `api/fasterrcnn/predict`. We can see it’s returning some JSON. Great, we now know it’s probably possible to invoke programmatically!

Digging into Luminoth’s web.py, around line 31 at the time of writing, the method behind the corresponding `/api/<model_name>/predict/` endpoint is our ticket.
The first thing we see is an attempt to retrieve the image data from the request to predict:

try:
  image_array = get_image()
except ValueError:
  return jsonify(error='Missing image'), 400
except OSError:
  return jsonify(error='Incompatible file type'), 400

What is get_image()? Well, it shows an expectation of a POST’ed file by the name of ‘image’.

def get_image():
  image = request.files.get('image')
  if not image:
    raise ValueError
  # Decode the uploaded file and normalize it to RGB.
  image = Image.open(image.stream).convert('RGB')
  return image

This is a Flask web server. The Flask documentation for the files property in the Request object shows that this only appears (for our purposes) in a POST request, with a <form> object, and when an encoding of enctype="multipart/form-data" is given. Right, sounds like we now know how to use the endpoint programmatically. Now, how can we call this from C++ using CPR?

Let’s start with the POST request. Using CPR, this is very straightforward. The required multipart/form-data encoding is handled by the cpr::Multipart object. At the time of writing, there is a bug involving cpr::Multipart and in-memory data buffers; so in order to proceed with the hackathon, the image was first written to a file, reloaded, and then sent. Avoid that workaround if you can.

extern "C" __declspec(dllexport) void* SendImagePostRequest(const char* url, unsigned char* data, int data_size)
{
    std::string filename = WriteTemporaryFile(data, data_size);
    auto response = cpr::Post(
            cpr::Url{ url },
            cpr::Multipart{{ "image", cpr::File{ filename } }});
    std::remove(filename.c_str());
    return ParseJsonResults(response.text);
}

Here, url is the URL of the Luminoth endpoint we found, and data and data_size are the bytes and length of the image we want FasterRCNN to run prediction on. When used, it looks like this:

void* resultsHandle = predictTables("http://beast-pc:5000/api/fasterrcnn/predict/", image.data(), (int)image.size());
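For a quick check of the endpoint outside C++, the same multipart request can be sketched in Python using only the standard library. This is a minimal sketch and not part of the original project: the helper names build_multipart_request and predict and the default filename are hypothetical, while the field name image and the URL shape come from the text above.

```python
import json
import urllib.request
import uuid


def build_multipart_request(url, image_bytes, filename="page.png"):
    """Build a multipart/form-data POST with a single file field named 'image'."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        url,
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def predict(url, image_bytes):
    """Send the image and decode the JSON reply (needs a reachable server)."""
    with urllib.request.urlopen(build_multipart_request(url, image_bytes)) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes the encoding easy to inspect without a running Luminoth server.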

The POST request returns a JSON string that we need to decode. Luckily, there is a superb header-only JSON library, Nlohmann Json (which I think has the potential to be part of the C++ STL; by all means use it), that we can drop right in to get a vector of RECTs and their confidences back:

// Element type inferred from the emplace_back call below: one RECT plus its confidence.
static std::vector<std::pair<RECT, float>>* ParseJsonResults(const std::string& response)
{
    auto parsed = json::parse(response);
    auto* results = new std::vector<std::pair<RECT, float>>();
    for (const auto& object : parsed["objects"])
    {
        const auto& bbox = object["bbox"];
        float confidence = object["prob"];
        results->emplace_back(RECT{ bbox[0], bbox[1], bbox[2], bbox[3] }, confidence);
    }
    return results;
}

Note that the boxes are returned in an X/Y/Right/Bottom format. If you need an X/Y/Width/Height format, it’s easily convertible. From then on, the bounding boxes can be passed on throughout the codebase, and the method’s improvement over current methods can be measured.
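The conversion mentioned above can be sketched in a few lines of Python. This is an illustration rather than code from the project: the function name parse_predictions and the sample values are made up, and the JSON shape (an objects list whose entries carry bbox and prob) is taken from the C++ parser above.

```python
import json


def parse_predictions(response_text):
    """Return (x, y, width, height, confidence) tuples from the JSON reply."""
    results = []
    for obj in json.loads(response_text)["objects"]:
        left, top, right, bottom = obj["bbox"]
        # Width and height are the differences between opposite edges.
        results.append((left, top, right - left, bottom - top, obj["prob"]))
    return results


sample = '{"objects": [{"bbox": [10, 20, 110, 70], "prob": 0.97}]}'
print(parse_predictions(sample))  # [(10, 20, 100, 50, 0.97)]
```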

You’ll have to excuse the use of void pointers, pointers to vector, new, and other frowned-upon items. The use of CPR also introduced an additional problem here. The C++ codebase is in MSVC 11.0, and CPR requires MSVC 14.0 or later. To integrate this, a separate DLL was created and loaded dynamically via LoadLibrary in the main source, so a C API was created. But these are implementation details. And again, it was simply the quickest way to get results.

That’s about it for this post. All-in-all, I believe Luminoth is an underrated, but also unfinished, machine learning tool. It’s a good choice for having a quick way to train, save state, and evaluate neural networks. The API allows high-speed integration of a model into existing code in any language, after which a results analysis can determine whether to productionize the model further.

It’s a business battlefield out there. Not one of munitions and machines, but time and resources. Companies are struggling to provide end-users and consumers with the content they need, when they need it, without breaking the bank. Document management now helps companies make progress without losing productivity.

As noted by the SocioHerald, document management solutions are “booming worldwide” and on track for significant growth over the next five years, but as data volumes increase and connectivity allows simple sharing of more complex and media-rich content, large documents pose a new challenge. How do organizations deliver high-volume content quickly and accurately to drive on-demand end-user interaction?

Accusoft’s PrizmDoc Viewer can help deliver peace of mind — and win the large document loading war — with dual-pronged delivery of document pre-conversion and server-side search.

The Need for Speed

As noted by Forbes, one second is now the “magic number” when it comes to loading webpages — any slower and potential consumers begin to abandon ship. Welcome to the future.

Employees are now used to this kind of rapid retrieval when they search for data online, so they bring these same expectations into the office when it comes to document loading and access times. What does this mean in practice? Both user satisfaction and overall productivity suffer when documents don’t load fast enough.

So how do companies get to the finish line faster? Start with document pre-conversion. PrizmDoc Viewer contains a pre-conversion API that allows companies to create viewing packages for large documents using POST requests and JSON formatted source objects. Combined with the PAS layer of PrizmDoc server, this pre-conversion feature allows massive documents — such as Tolstoy’s 1493-page epic War and Peace — to load in just 0.69 seconds.
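As a rough illustration of what such a pre-conversion call could look like, the sketch below builds a POST request with a JSON source object in Python. The endpoint path /v2/viewingPackageCreators, the field names, the base URL, and the function name build_viewing_package_request are all assumptions made for illustration; verify them against the current PAS REST API reference before relying on them.

```python
import json
import urllib.request


def build_viewing_package_request(pas_base_url, document_id, file_name,
                                  lifetime_seconds=2592000):
    """Build (but do not send) a pre-conversion request for one document."""
    payload = {
        "input": {
            # Assumed shape of the JSON source object describing the document.
            "source": {
                "type": "document",
                "fileName": file_name,
                "documentId": document_id,
            },
            # Assumed field: how long the converted package is kept, in seconds.
            "viewingPackageLifetime": lifetime_seconds,
        }
    }
    return urllib.request.Request(
        f"{pas_base_url.rstrip('/')}/v2/viewingPackageCreators",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the request to a running PAS instance.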

The caveat? Pre-conversion isn’t enough in isolation. To ensure users find what they’re looking for, and fast, organizations also need the benefit of server-side search.

Search and Rescue

Eighty percent of Americans now experience some type of “tech frustration” every day. Spotty connections and smartphone failures top the list, but documents also make the cut. Client-side searches within large documents can put a strain on a browser-based document viewer’s memory. The best case scenario? Massive load times that frustrate staff efforts. Worst case? The viewer crashes completely as the browser overloads.

There’s a better way. With PrizmDoc Viewer’s server-side search feature, you can offload search work to the server, significantly reducing the strain on client-side viewer code. Using PrizmDoc’s Viewer configuration options, developers can also create custom server-side search parameters to reduce the strain on memory-capped browsers or more easily access text-heavy documents. Put simply? Server-side search can help rescue document retrieval speeds and reduce user frustration.

Document Detente

Slow-loading, large documents can ramp up hostilities between staff trying to get their work done and the tech initiatives that supposedly boost productivity. Fortunately, there are ways to reduce loading times and achieve document detente with PrizmDoc Viewer. Accusoft’s pre-conversion APIs and customizable server-side search parameters make this tech treaty even easier to achieve with straightforward in-app integration, providing complete functionality under the banner of in-house applications.

Ready to ramp up productivity and win the war on large document loading? See server-side speed in action with the server-side search demo or enlist the in-app advantage with a free trial today!

Effective document management is now a top priority for organizations, but for many, it remains a challenge. As noted by recent AIIM survey data, companies are struggling to handle both the documents they have and the rapid uptake of new information. In fact, 43 percent said their biggest priority is effectively leveraging the structured and unstructured content they already have, while 57 percent are focused on making sense of an overwhelming volume of big data. Optical character recognition (OCR) is a critical component of document management.

For software development firms, this poses a particular challenge. Products are no longer feature complete without critical end-user functions such as advanced optical character recognition and powerful search. However, adding this functionality is not as easy as it sounds. Building it from the ground up requires time, effort, and continued maintenance, which is a large undertaking for any company.

Accusoft’s ImageGear SDK offers a way to bridge the OCR gap with comprehensive image processing and manipulation capabilities that both streamline software development and deliver on end-user expectations.*


What is ImageGear?

ImageGear easily integrates into existing applications to deliver cutting-edge document management functionality at scale. Available for both .NET and C/C++ frameworks, ImageGear allows developers to quickly deploy and white-label key features including image processing, manipulation, conversion, and PDF and document search.

This add-on OCR functionality delivers highly-accurate optical character recognition to any .NET (C#) or C/C++ application. ImageGear’s OCR add-on provides full-page character recognition for more than 100 languages — including both Western and Asian languages such as Korean, Japanese, and Chinese character sets. It’s capable of recognizing multiple languages within a single image for enhanced document management. Other OCR features include:

  • Automatic page segmentation into individual zones for processing
  • Type assignment per zone based on defined flows, tables, or graphics
  • Table detection with advanced technology to enhance data reconstruction output
  • Entire page or individual region image processing
  • Zone definition by the user, from existing files, or by automatic detection in the OCR engine

In addition, software developers can enhance ImageGear OCR functionality by leveraging both predefined and customizable dictionaries to ensure validated results using regular expressions. 
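The idea of validating recognized text against a regular expression can be illustrated independently of any SDK. The Python sketch below is not ImageGear’s API: the function name validate_tokens, the date pattern, and the sample tokens are hypothetical and only show how a pattern can separate plausible OCR output from likely misreads.

```python
import re

# Hypothetical rule: a field must look like MM/DD/YYYY, digits only.
DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def validate_tokens(tokens, pattern):
    """Split OCR tokens into those that fully match the pattern and those that do not."""
    accepted, rejected = [], []
    for token in tokens:
        (accepted if pattern.fullmatch(token) else rejected).append(token)
    return accepted, rejected


# The second token contains the letter O where a zero was expected.
ocr_tokens = ["03/14/2021", "O3/14/2021", "12/01/1999"]
print(validate_tokens(ocr_tokens, DATE_PATTERN))
# (['03/14/2021', '12/01/1999'], ['O3/14/2021'])
```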


Why Optical Character Recognition (OCR) Matters to End-Users

Advanced OCR integration makes it easier for end-users to find what they’re looking for, when they’re looking for it. Instead of forcing users to find additional apps that deliver specific services, in-app OCR delivers increased satisfaction by streamlining user search functionality.

Common use cases include:

  • Legal eDiscovery: The eDiscovery process is a critical, and often complex, stage of legal case preparation. Firms need to quickly find key terms, phrases, and images within legal documents to ensure they meet both client expectations and compliance obligations. With many forms now scanned and stored in non-standard file formats that contain form fields, text boxes, and digital imagery, OCR is essential to help lawyers streamline the process of eDiscovery at scale.

  • Financial Document Processing: Clients now expect loan applications and credit card applications to be processed at scale and speed. This is especially critical as firms embrace the idea of remote work: both staff at home and those in the office need end-to-end OCR functionality to deliver complete document management.

  • Insurance Documentation Assessment: Insurance claims are both complex and comprehensive, requiring complete documentation from clients, contractors, and compliance agencies. As insurance firms move to tech-first frameworks to enhance document processing, speed, and accuracy, OCR makes it easy for staff to find specific data and ensure documentation is complete.

Integrating OCR

Advanced OCR functionality won’t deliver expected outcomes if integration is cumbersome and complex. ImageGear streamlines this process with easy SDK implementation for both .NET and C/C++.

ImageGear .NET can be easily deployed on multiple platforms. These .NET deployments include ASP.NET functions such as image display, thumbnail display, annotation support, and cloud capture along with WPF printing and annotation support. ImageGear for C/C++, meanwhile, offers support for several platforms as well. Check out the developer resources section to see an updated list.


How Your Clients Use Optical Character Recognition (OCR)

PDFs remain the go-to file format for many industries, offering both standardized image and text conversion along with the ability to easily set or restrict document permissions. The problem? PDFs are notoriously difficult to search, making it hard for end-users to quickly find the text or data they need.

ImageGear makes it easy to OCR PDFs using the ImGearRecPage.Recognize Method, which leverages the zone list of the image to deliver accurate OCR — or, if this list is empty, automatically calls the page-layout decomposition process (auto-zoning) to complete the OCR process.

The following C# example runs OCR on each page of a multi-page TIFF and writes the recognized text to an output file.


using System.IO;
using ImageGear.Core;
using ImageGear.Formats;
using ImageGear.Evaluation;
using ImageGear.Recognition;

namespace ImageGearTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Initialize evaluation license.
            ImGearEvaluationManager.Initialize();
            ImGearEvaluationManager.Mode = ImGearEvaluationMode.Watermark;

            // Initialize the Recognition Engine.
            ImGearRecognition igRecognition = new ImGearRecognition();

            // ImageGear assemblies require explicit initialization at application startup.
            ImGearCommonFormats.Initialize();

            // Open a FileStream for our output document.
            using (FileStream outputStream = new FileStream(@"c:\temp\outputDoc.txt", FileMode.OpenOrCreate, FileAccess.ReadWrite))
            {
                // Open a FileStream for our source multi-page image.
                using (FileStream multiPageDocument = new FileStream(@"c:\temp\test.tif", FileMode.Open))
                {

                    // Load every page of the multi-page document, starting at page 0 and loading the range of pages specified.
                    // A range of -1 specifies that all pages shall be loaded.
                    ImGearDocument doc = ImGearFileFormats.LoadDocument(multiPageDocument, 0, -1);

                    // Determine the number of pages in the multi-page image.
                    int numPages = ImGearFileFormats.GetPageCount(multiPageDocument, ImGearFormats.UNKNOWN);

                    // Recognize each page of the multi-page document and add the results to outputStream.
                    for (int pageNumber = 0; pageNumber < numPages; pageNumber++)
                    {

                        // Cast the current page to a raster page and import that page.
                        using (ImGearRecPage igRecPage = igRecognition.ImportPage((ImGearRasterPage)doc.Pages[pageNumber]))
                        {

                            // Preprocess the page.
                            igRecPage.Image.Preprocess();

                            // Perform recognition.
                            igRecPage.Recognize();

                            // Add OCR results to the outputStream.
                            igRecognition.OutputManager.WriteDirectText(igRecPage, outputStream);

                        }
                    }
                }

            }
            // Dispose of objects we are no longer using.
            igRecognition.Dispose();
        }
    }
}

 


OCR Access and Analysis

Advanced OCR isn’t enough in isolation — developers must also empower end-users to quickly access and analyze OCR output. ImageGear offers multiple options to help streamline this process, such as:

  • Storage of Output as Code Pages
  • Export to Text Format
  • Export to PDF
  • Export to MRC PDF
  • Export to a Formatted Document

Find Your Best Fit

ImageGear OCR makes it easy for end-users to quickly search critical documents, find the data they need, and analyze optical character recognition output, but don’t take our word for it. Seeing is believing. Test ImageGear in your own environment and discover the difference of advanced OCR. 

*Optical character recognition is an ImageGear add-on and must be requested upon purchase of a license.

 

An HTML5 document viewer is a document and image viewing solution that allows users to easily view and collaborate on multiple file types from any desktop or mobile device within a browser. HTML5 document viewers make this possible by converting files from native formats into desired outputs, then presenting the resulting content in a browser using standard HTML5 markup.

Single-interface interaction is the biggest draw of HTML5 viewers. Since everything happens in-browser, they’re ideal for application integration. As a result, they’re often deployed by value-added resellers and software manufacturers to develop specific solutions for client needs. They’re also leveraged by enterprises looking to add functionality to existing document management solutions (DMS) or content management solutions (CMS) that both reduce total costs and improve overall efficiency.

Software developers can embed an HTML5 document viewer into existing websites or applications without building this functionality from the ground up or compromising user security. End-users, meanwhile, get the responsive, dynamic viewing experience they want without the need to download plugins or open other applications.


What are the benefits of an HTML5 document viewer?

Accusoft’s PrizmDoc Viewer offers industry-leading HTML5 functionality, making it the viewer of choice for developers, integrators, and system administrators looking to enhance document viewing, collaboration, and security without increasing their workload.

PrizmDoc Viewer makes this possible by using a collection of REST APIs that support more than 100 file types. It also includes built-in support for conversion, OCR, annotation, redaction, advanced search, large document viewing, and server-side search. PrizmDoc Viewer is a cornerstone within thousands of enterprise software solutions across the globe and users have great things to say about its functionality. Along with simple deployment and development, HTML5 document viewers offer other benefits such as:

  • Support for multiple software platforms and browsers to ensure consistent viewing across enterprise environments.
  • Ability to view a wide variety of file format types used by healthcare, legal, manufacturing, government, and financial firms.
  • Reduced need for other viewing applications and software licenses.
  • Seamless web and mobile-friendly viewing experiences that rely on HTML5 markup rather than device-specific constraints.

How does Accusoft’s HTML5 viewer work?

PrizmDoc Viewer makes it easy to present DOCX, PPT, PDF, TIFF, email, and a host of other file types as part of your existing web application. To achieve this high-speed, high-fidelity document and image viewing, PrizmDoc Viewer leverages three key components:

  • The HTML5 viewer itself, which runs in-browser to display content.
  • The backend, comprised of PrizmDoc Application Services (PAS) and PrizmDoc Server, which handles document processing.
  • Your web application, acting as a reverse proxy, that sits between the HTML5 viewer and the backend to manage content requests.

PrizmDoc Server is the technical heart of the product, the engine that drives document conversion. It takes on the task of converting document pages to SVG using a compute-intensive process and has no permanent storage. 

The PrizmDoc Server handles the heavy lifting but doesn’t hang on to any document pages, while PAS acts much like your own web application. It has privileged access to your document storage solution — such as a file system or database — to deliver key functionality including the long-term caching of pre-converted content and the loading and saving of document annotations. 

Next up: How does this all work? Think of it like a conversation. First your web application POSTs to PAS and asks for a new viewing session. PAS responds with a new ViewingSessionID. This lets your web app render the page HTML and pass it along to the in-browser document viewer, while simultaneously delivering original documents to PAS.

PAS talks to PrizmDoc Server, asking it to start conversion. Meanwhile, the document viewer has its own question for the PAS (via your web proxy): Can I have the first page now? Once available, PAS sends the first page back as an SVG even as other pages are still being converted, letting users view and interact with documents while conversion is underway. 
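The first two turns of that conversation can be sketched as plain HTTP requests in Python. This is a minimal sketch under stated assumptions: the /ViewingSession and /ViewingSession/u{id}/SourceFile paths, the request body, the base URL, and both function names are assumptions for illustration and should be checked against the PAS REST API reference.

```python
import json
import urllib.request


def create_session_request(pas_base_url, display_name):
    """Step 1: ask PAS for a new viewing session (POST)."""
    # Assumed body: tells PAS the document will be uploaded separately.
    payload = {"source": {"type": "upload", "displayName": display_name}}
    return urllib.request.Request(
        f"{pas_base_url.rstrip('/')}/ViewingSession",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload_source_request(pas_base_url, viewing_session_id, document_bytes):
    """Step 2: hand the original document to PAS for that session (PUT)."""
    return urllib.request.Request(
        f"{pas_base_url.rstrip('/')}/ViewingSession/u{viewing_session_id}/SourceFile",
        data=document_bytes,
        method="PUT",
    )
```

In a real application, the web app would send the first request, read the session ID from the JSON reply, render the page for the viewer, and send the second request in the background.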


What can PrizmDoc Viewer do for you?

PrizmDoc Viewer supports 100+ file types using our zero-footprint document viewer and content conversion REST API. The viewer includes an advanced HTML control which allows users to view, search, redact, print, and download documents in many different file formats – from Adobe PDFs and Microsoft Office files to CAD and DICOM – right in their browser. They don’t ever need to leave your application and risk data security. 

Process performance is one thing, but what can PrizmDoc Viewer do in practice? Our interactive demos showcase how PrizmDoc Viewer’s functionality would operate after integration, but there are many use cases that we’ve yet to explore. How does HTML5 document viewing directly impact enterprise workflows? Here are some examples of why PrizmDoc Viewer customers integrate our functionality into their own software:

  • Display All Document Types Quickly and Accurately: By eliminating the need for multiple, file-specific applications, PrizmDoc Viewer allows staff to review and provide feedback on any file using a single, common interface, both improving overall performance and reducing your total licensing costs.
  • Provide High-Fidelity Renderings of Microsoft Office Formats: Accurate, high-fidelity, in-browser renderings of Word, Excel, and PowerPoint documents are often required to ensure regulatory compliance. PrizmDoc Viewer offers true native viewing of these popular file types on-demand.
  • Increase the Performance of Large Document Viewing and Search Functions: PrizmDoc Viewer utilizes both document pre-loading and customizable, server-side search parameters to help your business improve document access and offload the heavy lifting of text-based search to ensure browsers aren’t overloaded.
  • Deliver Secure Collaboration: By integrating PrizmDoc Viewer into existing applications, companies can ensure trust and compliance of critical documentation with key security controls.

It can be extremely time-consuming to keep up with evolving user expectations, regulatory requirements, and file format complexity. That’s why many development teams decide to integrate an HTML5 viewer instead of developing one in-house. PrizmDoc Viewer offers the ability to streamline key functions and enhance security with HTML5-native support to deliver document viewing, annotation, redaction, and conversion on-demand. 

Ready to jump start your PrizmDoc Viewer development? Get started with our Docker evaluation here. To learn more about PrizmDoc Viewer and all of its unique features and functions, download our What is an HTML5 Viewer? whitepaper.

AI Legal Tech startups received more than $700 million in venture funding in 2023. Additionally, 90% of large law firms expect to increase their investment in generative AI in the next five years. And by 2036, 114,000 legal jobs are expected to be automated.

What does all of this mean? Independent software vendors (ISVs) are in an excellent business position. Firms know that their need for AI tools to help them build stronger cases is critical, and ISVs are uniquely positioned to fill that need.

AI Legal Tech Trends and How ISVs Can Respond to Them

#1: Generative AI Capabilities Are in High Demand

There’s been a marked shift from using traditional AI to generative AI. Given that 70% of law firms agree that generative AI adds value, let’s define this key term. Generative AI is an advanced branch of AI that focuses on creating new content, such as text, images, and videos, based on learned patterns and data inputs.

In the legal industry, traditional AI can analyze and process case data to make it easier for teams to develop arguments and documents themselves. Generative AI, however, can generate that content on its own–legal documents, contracts, and briefs–based on learned patterns and data inputs, mimicking the creativity and logic of human attorneys. The demand for generative AI capabilities in law firms is growing, and ISVs must respond accordingly.

Action Items

  • If you’re not already, invest in research and development to enhance generative AI capabilities tailored for your clients. This includes algorithms and models trained on legal data and patterns to generate legal documents accurately and efficiently.
  • Provide case studies, demonstrations, and ROI analysis showcasing the value of generative AI in legal operations. Highlight efficiency gains, cost savings, and improved productivity achieved through AI-driven document generation and analysis.
  • Establish a feedback loop with your clients to gather insights and feedback on the usability, accuracy, and effectiveness of your generative AI legal tech solutions. Improve and enhance AI capabilities over time.

#2: AI Legal Tech Streamlines Document Review

One of the most tedious aspects of legal operations has been manual document review. The number of data sources where client information is stored is only increasing, as is the volume of data stored within those sources. What does that mean for attorneys? They have to spend a massive amount of time and attention on this rote activity, which cuts into time they could be spending on more strategic, client-supporting work.

That’s where ISVs can bring value. AI legal tech tools resolve the pain of manual document review by automating the process. AI-driven technologies can reduce document review time by up to 70%. With the time saved, think about all of the other ways legal teams can support their clients or even take on new ones.

Keep in mind that AI’s impact on document review is not just about efficiency. AI legal tech tools for document review also enhance the quality of legal work. By automating repetitive tasks, AI allows legal professionals to focus on higher-value activities such as legal analysis, strategy development, and client interaction. This not only improves overall productivity but also enhances client satisfaction and loyalty, positioning law firms for long-term success in a competitive marketplace. 

When developing or enhancing AI legal tech tools, ISVs would do well to consider how the tools can help attorneys save time and energy and deliver higher-quality work and service.

Action Items

  • Enhance your software by integrating PrizmDoc, designed to expedite legal teams’ document review process. PrizmDoc enables teams to securely view, annotate, and redact a variety of file formats, and more, within your application.
  • Offer advanced NLP features that extract key information, identify patterns, and improve accuracy in document analysis and review.
  • Develop algorithms that can group similar documents together based on content and context, making it easier for legal professionals to navigate and manage large volumes of documents efficiently.

#3: Identification of PII is Easier in High Volumes of Data

Protecting PII is a paramount concern in the legal sector, especially given the number of data sources that exist and the amount of data that is exchanged within those sources. There are so many opportunities for PII to be exposed or for a document to be tampered with. If either of those things happens, a document will likely be thrown out of a case. Make case preparation easier for your clients by offering them AI-powered technologies that keep their documents inside of a secure software application environment. 

Action Items

  • Explore integrating your software with PrizmDoc, which helps teams safeguard documents in several ways. The tool’s AI capabilities help teams identify and redact PII automatically. Additionally, PrizmDoc enables teams to view and collaborate on a variety of file formats in a single, secure environment. 
  • Develop data encryption and access control features to strengthen your software’s security. Offer granular access control mechanisms that allow law firms to define user roles, permissions, and restrictions to safeguard sensitive information.
  • Offer the capability to track and record actions related to PII handling within documents to create transparency and accountability among legal teams.

#4: Legal Services Are Poised to Become Less Cost Prohibitive

A much-needed transformation that AI has brought to the field is making legal services more financially accessible. AI is making this happen in several different ways.

First, AI legal tech tools help legal teams be more efficient on certain tasks, which results in less billable time for clients. For example, as noted earlier, automating document review can cut review time by up to 70%, significantly reducing the hours billed to a client.

Second, thanks to AI, the nature of client billing itself is changing. By embracing AI, law firms can offer alternative fee arrangements instead of traditional hourly billing. Because AI legal tech tools free up attorneys to focus on more strategic work, firms can offer transparent, value-based pricing models that prioritize the value provided rather than the hours spent on manual tasks.

There are no signs of firms slowing their adoption of AI. 82% of attorneys agree that generative AI can be applied to legal work, and more than half believe that AI should be applied to legal work. More and more teams are beginning to use AI within their firms, setting a foundation for taking on more clients and/or serving clients at a more competitive rate.

Action Items

  • Integrate predictive analytics capabilities that help legal teams forecast budgets, identify cost-saving opportunities, and make data-driven decisions in resource allocation.
  • Build platforms that utilize AI to streamline document review processes (through automatic PII identification and redaction, as well as the ability to view and annotate documents in one platform) so that this service is more affordable.
  • Develop AI-powered virtual assistants that can handle routine inquiries, provide legal guidance, and assist in basic legal tasks, reducing the need for human resources and overhead costs.

#5: AI Legal Tech Tools Can Be Reliable Assistants on Research and Due Diligence

Another trend that ISVs should note is that AI legal tech tools can serve as true assistants on certain tasks, especially for research and due diligence. For example, the research process often takes days. With an AI legal tech tool, research can take mere hours. This translates into substantial cost savings and improved productivity for legal professionals and law firms. With AI helping to expedite the research and due diligence process, legal teams have more time and energy to build the strongest cases possible for their clients.

Action Items

  • Offer AI legal tech tools that can analyze case law, identify legal precedents, and provide contextual insights to support legal research and due diligence efforts.
  • Develop features that enable visual representation of legal data, trends, and insights to facilitate decision-making and enhance the quality of legal research outputs.
  • Develop AI models that can analyze and extract key clauses, terms, and risks from contracts. This would enable faster due diligence processes and better risk management for law firms.

How to Bring the Benefits of AI Legal Tech to Your Clients

ISVs and technology leaders who thrive are those who grasp the rapid growth of AI while tailoring their tools to accommodate the varying paces at which law firms navigate this space. As the AI software market in the legal sector is projected to grow by over 10% annually, from $2.19 billion in 2024 to an estimated $3.64 billion by 2029, adaptability and responsiveness to the evolving needs of law firms on their AI journey become paramount for long-term success.
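The "over 10% annually" figure can be checked against the two market sizes quoted above. The short sketch below computes the compound annual growth rate they imply, assuming the five-year span from 2024 to 2029; the function and variable names are ours.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above, in billions of US dollars.
market_2024 = 2.19
market_2029 = 3.64

growth = cagr(market_2024, market_2029, years=2029 - 2024)
print(f"Implied annual growth: {growth:.1%}")  # prints "Implied annual growth: 10.7%"
```

The result, roughly 10.7% per year, is consistent with the projection cited in the text.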

By integrating AI-driven solutions like Accusoft’s PrizmDoc into your offerings, you can help legal teams streamline document management, safeguard documents, and build airtight cases for their clients. 

We invite you to explore how Accusoft can help you provide better AI legal tech tools to your clients. Check out our Legal Tech Fact Sheet to learn more about PrizmDoc and discover how our solutions can elevate your AI legal tech capabilities.

 

The industry-wide push to digitize documents and minimize the use of physical paperwork has made PDF one of the most ubiquitous file formats in use today. Business and government organizations use PDFs for a variety of document needs because they can be viewed by so many different applications. When it comes to archiving information, however, PDFs have a few limitations that make them unsuitable for long-term storage. That’s why many organizations require such files to be converted into the more specialized PDF/A format. Learn how easy it is to convert PDF to PDF/A with ImageGear.

What Is PDF/A?

Originally developed for archival purposes, the PDF/A format is utilized for long-term preservation that ensures future readability. It has become the standard format for the archiving of digital documents and files under the ISO 19005-1:2005 specification. Government organizations are increasingly utilizing PDF/A to digitize existing archival material as well as new documents.

The distinctive feature of PDF/A format is its universality. Although PDFs are well entrenched as the de facto standard for digital documents, there are many different ways of assembling a PDF. This results in different viewing experiences and sometimes makes it impossible for certain PDF readers to even open or render a file. Because PDF/A documents need to be accessible in the indeterminate future, there are strict requirements in place to ensure that they will always be readable.

PDF vs PDF/A

While PDF and PDF/A are based upon the same underlying framework, the key difference has to do with the information used to render the document. A standard PDF has many different elements that make up its intended visual appearance. This includes text, images, and other embedded elements. Depending upon the application and method used to create the file, the information needed to render those elements may be more or less accessible for a viewing application.

When a PDF viewer cannot access the necessary data to render elements correctly, the document may not display correctly. Common problems include switched fonts (because the original font information isn’t available), missing images, and misplaced layers.

A PDF/A file is designed to avoid this problem by including everything necessary to display the document accurately. Fonts and images are embedded into the file so that they will be available to any viewer on any device. In effect, a PDF/A doesn’t rely on any external dependencies and leaves nothing to chance when it comes to rendering. The document will look exactly the same no matter what computer or viewing application is used to open it. This level of accuracy and authenticity is important for archival storage, which is why more organizations are turning to PDF/A for long-term file preservation.

How to Convert PDF to PDF/A

ImageGear supports a broad range of PDF functionality, which includes converting PDF format to a compliant PDF/A format. It can also evaluate the contents of a PDF file to verify whether or not it was created in compliance with the established standards for PDF/A format. This is an important feature because it will impact what method is used to ultimately convert a PDF file into a PDF/A file.

Verifying PDF/A Compliance

By checking the PDF against a preflight profile, ImageGear can detect incompliant elements of the file and produce a verifier report. The report is generated using the ImGearPDFPreflight.VerifyCompliance method.

It’s important to remember that this feature does NOT change the PDF document itself. The report also will not verify annotations that have not been applied to the final document itself. Once the report is generated, a status code will be provided for each incompliant element flagged during the analysis. 

These codes can have two values:

  • Fixable: Indicates an incompliance that can be fixed automatically during the PDF/A conversion process.
  • Unfixable: Indicates a more substantial incompliance that will need to be addressed manually before the document is converted into PDF/A.

Converting PDF to PDF/A

After running the verification, it’s time to actually convert the PDF to PDF/A. The ImGearPDFPreflight.Convert method will automatically perform the conversion provided there are no unfixable incompliances. This process will change the PDF document into a PDF/A file and automatically address any incompliances flagged as “Fixable” during the verification process.

While it is not necessary to verify a PDF before attempting conversion, doing so is highly recommended. Otherwise, a document with unfixable incompliances will fail to convert and return an INCOMPLIANT_DOCUMENT code. The output report’s Records property will then provide a detailed list of incompliant elements. Since any “Fixable” incompliances would have been addressed during conversion, the document’s remaining issues will need to be handled manually.

This method is best used when manual changes need to be made to the PDF file prior to conversion. One of the most common changes, for example, is making the PDF searchable. Once the alterations are complete, the new file can be saved using the ImGearPDFDocument.Save method.
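The verify-then-convert decision described above can be sketched in a few lines. This is a language-neutral model of the logic only, not ImageGear's .NET API: the `Status`, `Record`, and `plan_conversion` names are hypothetical, and the sample records are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    FIXABLE = "Fixable"      # repaired automatically during conversion
    UNFIXABLE = "Unfixable"  # must be addressed manually first


@dataclass
class Record:
    description: str
    status: Status


def plan_conversion(records):
    """Decide the next step from a verification report's records."""
    unfixable = [r for r in records if r.status is Status.UNFIXABLE]
    if unfixable:
        # Converting now would fail, so surface the blocking records instead.
        return "fix manually, then convert", unfixable
    return "convert", []


# Made-up sample report; real records would come from the verifier.
report = [
    Record("uncalibrated color space", Status.FIXABLE),
    Record("placeholder for an issue the converter cannot repair", Status.UNFIXABLE),
]
step, blockers = plan_conversion(report)
print(step)
for record in blockers:
    print(" -", record.description)
```

Checking for unfixable records before converting avoids a failed conversion attempt and gives the user the list of items to correct up front.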

Other ImageGear PDF to PDF/A Conversion Methods

Raster to PDF/A

ImageGear can save any PDF file produced directly from a raster file as a PDF/A during the initial conversion. A series of automatic fixes is performed during this process to ensure compliance.

  • Uncalibrated color spaces are replaced with either an RGB or CMYK color profile. This could change the file size.
  • Any LZW and JPEG 2000 streams are recompressed since PDF/A standards prohibit LZW and JPEG 2000 compression.
  • All document header and metadata values are automatically filled in to comply with PDF/A requirements.
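To make the recompression step more concrete, the sketch below flags streams that use the two prohibited compression schemes. It assumes the filter names have already been extracted from each stream by some PDF parser; `LZWDecode` and `JPXDecode` are the filter names the PDF specification uses for LZW and JPEG 2000, while the function name and sample data are hypothetical. It does not show how ImageGear performs the fix internally.

```python
# PDF stream filter names for the two compression schemes named above.
PROHIBITED_FILTERS = {"LZWDecode", "JPXDecode"}


def streams_needing_recompression(stream_filters):
    """Return the ids of streams that use a prohibited filter.

    stream_filters maps a stream id to the list of filter names applied to it.
    """
    return sorted(
        stream_id
        for stream_id, filters in stream_filters.items()
        if PROHIBITED_FILTERS.intersection(filters)
    )


# Hypothetical sample: filter lists as a parser might report them.
sample = {
    "image-1": ["DCTDecode"],
    "image-2": ["JPXDecode"],
    "font-1": ["ASCII85Decode", "LZWDecode"],
    "page-1": ["FlateDecode"],
}
print(streams_needing_recompression(sample))  # prints ['font-1', 'image-2']
```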

Quick PDF to PDF/A Conversion

For quick conversions in workflows that don’t require displaying or working with a file in any way, the ImGearFileFormats.SaveDocument method is another useful option. This process loads the original file, converts it, and saves the new version all at once. It’s important to set the PreflightOptions property in the save options. Otherwise, the new document will not save as a PDF/A compliant file.

Take Control of PDF/A Conversion with ImageGear

Accusoft’s versatile ImageGear SDK provides enterprise-grade document and image processing functions for .NET applications. With support for multiple file formats, ImageGear allows developers to easily convert, compress, and optimize documents for easier viewing and storage.

ImageGear takes your application’s PDF capabilities to a whole new level, delivering annotation, compliant PDF to PDF/A conversion, and other manipulation tools to meet your workflow needs. Learn more about how ImageGear can save you time and resources on development by accessing our detailed developer resources.

Financial institutions are spending on technology. As noted by IDG Connect, solutions such as AI-driven analysis and investment tools could boost revenue by 34 percent. In addition, 72 percent of senior management view artificial intelligence and machine learning (ML) as critical market advantages.

It makes sense. Banks, credit unions, and fintech firms must now meet evolving consumer expectations and satisfy emerging compliance legislation. The challenge? Ensuring existing processes — such as check image handling at ATMs and data verification during loan applications — are both streamlined and secure.

Fortunately, there’s a simple starting point: image processing.

Bridging the Data Divide

According to a recent Accenture survey, several emerging trends now inform the consumer landscape in finance. What’s the most important to data-driven organizations? Trust. While 67 percent of clients will now permit banks access to more personal data, 43 percent cite trust as the biggest driver of long-term loyalty. What’s more, 63 percent want banks’ use of personal data to drive more individualized, value-added services.

ATMs provide a key component of this data-driven strategy. For example, many ATMs use the X9.100-181 standard to store and secure .tif files. To ensure customers and bank staff have access to the right data at the right time, companies need image software capable of capturing, processing, and manipulating these images in real time, which in turn underpins the development of agile web-based and mobile applications that engender consumer trust.

Processing, Permission, and Potential

Also critical for banks? Compliance. Consider the evolving standards of GDPR. As noted by Forbes, the regulation includes provisions for the right to access, which entitles consumers to information about how and why their data is processed by organizations.

Given the sheer volume of data now processed by financial institutions — and the growing risk of network data breaches — meeting compliance expectations is both time and resource intensive. Add in the increasing number of consumers now submitting checks via ATMs or mobile deposit software, and companies face the problem of accidental data misuse. What happens if check or loan data is shared across departments but customers haven’t specifically given their permission?

Redaction can provide the security you need to keep sensitive information secure. By combining ease of capture with straightforward redaction services, it’s possible for banks to ensure that check and application X9.100-181 .tif data is natively secured, in turn limiting potential compliance pitfalls.

Controlling Complexity: Not Always Black and White

In the years following 2008’s nationwide financial collapse, many financial firms drafted long-term plans designed to reduce complexity and streamline operations. According to a recent Reuters piece, however, despite ambitious plans “the level of complexity remains high for U.S. banks.”

Here, consumer expectations and compliance demands conspire to increase total complexity. From cloud-based technologies to mobile initiatives and ongoing compliance evaluations, streamlining processes often takes a back seat to mission-critical operations. Check imaging and recognition is no exception. Companies need tools capable of handling color, black and white, and multi-layered images. The solution? Powerful software development kits (SDKs) that integrate with existing apps to provide on-demand functionality.

Piece by Piece

Meeting consumer expectations, satisfying compliance requirements, and reducing complexity is a multi-faceted, ongoing process for financial organizations.

Accusoft’s ImagXpress SDK provides a critical piece of this long-term goal with support for 99 percent of common financial image types, optimized compression/decompression technology to minimize wait times, and enhanced redaction and editing capabilities to empower ATM, loan application, and mobile app image processing. Learn more about Accusoft’s SDKs and APIs here.

This is a hot topic for many software vendors. They often have an enterprise content management system that brings data together, but they lack document viewing and processing functionality. They want to make document management seamless without requiring users to download and open documents outside the ECM. 

Sound familiar? If so, you’re likely wondering whether you should use costly developer time or turn to a ready-made software integration. 

In this article, we’ll explore the pros and cons of each option to help you decide if you should build your own solution or work with a software partner who specializes in content processing, conversion, and automation features. 

Pros of Building Your Own Software Features

Control and Customization

Many software vendors gravitate toward the build-your-own model because it gives them full autonomy. They control the design, development, and functionality of the features. With greater control, ISVs can ensure their solution addresses their document management needs. 

As the business evolves, software vendors can easily make updates or changes without the constraints of existing software limitations. You don’t have to compromise to get exactly what you want. 

Competitive Advantage

Custom features offer capabilities most off-the-shelf solutions can’t match. Software vendors looking to sharpen their competitive edge may opt to build their own features so they can differentiate their solutions. 

Long-Term Cost Savings

Building your own features can be more cost-effective in the long term. The initial development costs might be higher, but over time, you won’t have to pay for ongoing licensing fees or subscriptions.

Cons of Building Your Own Software Features

Longer Time to Market

Building extended functionality from scratch takes considerable time. It can take in-house teams years to build out extensive functionality and document support. For software vendors needing a quick solution to keep up with the speed of tech, the build-your-own route isn’t an ideal option. 

Resource Intensive

Not only does the build process take time, but it requires the right resources. A software partner can reduce the workload significantly so your in-house teams aren’t working at full capacity on a single project. 

Technical Expertise Required

Even if you’re able to dedicate the time and resources, you need the right experts on your team. Developing extended functionality for your document viewing and processing needs requires skills that may not be readily available in your organization. 

Higher Upfront Costs

Developing custom software features in-house generally requires a significant upfront investment. You’ll need the resources to hire developers, purchase the right development tools and infrastructure, and cover other related expenses. 

Ongoing Maintenance and Support

After you’ve developed the software features, you have to keep up with the maintenance and support. This includes fixing bugs, performing updates, and making sure the software remains compatible with new technologies. 

Recap: Building Your Own Software Features 

Pros: 

  • Ability to customize 
  • Can differentiate features 
  • Cost-effective over time

Cons: 

  • Takes longer
  • Requires more resources 
  • Requires dedicated expertise 
  • More up-front costs needed
  • Ongoing maintenance and support fall on your team

Pros of Working with a Software Partner 

Specialized Expertise

Working with a software partner, you benefit from their knowledge of specific technologies, frameworks, and development methodologies. This specialized expertise is critical when you’re building complex or niche features that might be just outside your core team’s skill set. 

Faster Time to Market

Most software partners can ramp up quickly. They often have established processes and dedicated teams to jump in right away, reducing development time so you can bring new features to market faster.

Reduced Risk

Reputable software partners have established quality assurance processes and rigorous security standards to protect users. They can also provide ongoing maintenance and support. This can mean fewer risks during software development and more long-term stability for your solution. 

When you work with a company like Accusoft that’s been around for decades, you get the reassurance of knowing your partner will be there when you need them. 

Greater Focus on Core Business

Time spent away from your core business can impact productivity and innovation. By delegating extended feature development to a partner, you can focus your team’s energy on your primary business activities and initiatives.  

Cons of Working with a Software Partner 

Higher Costs

Outsourcing development to a software partner often comes with higher costs than handling development in-house. There may also be ongoing fees, licensing costs, and potential price increases to consider. 

Recap: Working with a Software Partner 

Pros: 

  • More specialization 
  • Faster delivery 
  • Fewer risks and greater stability
  • More time to focus on your core business 

Cons: 

  • Can be more costly

Open Source: A Hybrid Option? 

The decision to build or buy isn’t always black or white. Software vendors may choose to incorporate open source into their software solutions because it’s often free to use.

However, open source has several major disadvantages. Security is a significant concern. Open-source software may contain security vulnerabilities that are publicly disclosed, and these vulnerabilities are targets for malicious actors. In 2023, 74% of codebases assessed in Synopsys’ Open Source Security and Risk Analysis (OSSRA) report included high-risk open source vulnerabilities.

You’ll also encounter challenges in keeping open-source software up to date. If you don’t have the dedicated resources and expertise to make updates, you may put your users’ security at risk. In the Synopsys report, 91% of codebases analyzed contained components that were 10 or more versions out of date. 

Plus, open source often lacks consistent support. This means you may have difficulty getting timely assistance when you run into problems. 

Why Choose Accusoft 

As your software partner, Accusoft can quickly deliver a document viewing and processing solution to give your app greater functionality while reducing your development costs. Through a collection of APIs, PrizmDoc allows for efficient and secure document viewing, conversion, and annotations within your ECM application. 

With IBM watsonx.ai technology, PrizmDoc offers AI-powered functionality for automated summarization, Q&A, PII detection and redaction, and intelligent tagging and classification. You can bring AI to market faster without hiring a dedicated AI team. 

Ready to learn more and take PrizmDoc for a spin? Schedule a demo today! 

Accusoft is proud to be a company that develops and goes to market with mature, standards-based APIs and SDKs. One of the most compelling benefits of our structure is that our customers and partners often independently purchase, download, and extend their business-critical applications with minimal support.

However, that makes it challenging to have meaningful discussions with the people who purchase, use, and integrate our code. In an effort to better understand our clients and their use cases, we decided to host a Customer Advisory Board (CAB) to have substantive conversations with them. The event and its members were named the Board of Connectors (BoC). During the meeting, the BoC discussed:

  • The specific use cases they are addressing with our SDKs and APIs
  • The business challenges they are overcoming by developing innovative solutions with our tools
  • What they like about Accusoft’s content processing, conversion, and automation toolkits, and what functional gaps exist based on comparable tools they’ve tested and used from the marketplace
  • The value they see in our existing products and their opinions on new products in our portfolio

Pragmatic Marketing teaches us that CABs are extremely important to a company’s success, especially a company like Accusoft, which is focused on long-term customer relationships, license subscription renewals, and integrated solutions which scale.

Launching our Board of Connectors

We’ve always strived to map our product roadmaps to customer input across multiple channels. Yet to enhance these efforts, we recently launched our CAB with a select group of clients in our home city of Tampa, Florida. The event was an excellent opportunity for Accusoft executives, our Customer Relationship team, our engineering team, and our clients to have thought-provoking, engaging face-to-face discussions.

We are confident that our BoC will enable our product management team to introduce products and functionality which align with our customers’ priorities for:

  • Information worker productivity and effectiveness
  • Data and content security and integrity
  • Application encryption for PrizmDoc Cloud
  • Regulatory compliance
  • Discoverability and viewability of files and text across mobile devices, computing, and browser platforms

“The BoC will create quick and long-term wins for Accusoft, and for our partner and customer ecosystem as well. It will create wins for our clients too, and ensure we develop solutions and product features which address real-world problems,” says Steve Wilson, VP of Product. “We’ll continue to enable customers to build document management functionality which complements their core applications. Finally, it will help us tailor our messaging about our products to speak to their priorities and interests.”

Long-Term Benefits for All Accusoft Customers

Even though Accusoft has been in business for over a decade, we are continually looking for new ways to align our product roadmap with our customers’ needs. Our customer relationships extend well beyond the initial purchase transaction into license renewals, training, and incremental add-ons. We designed the board to provide better networking opportunities amongst the attendees and open an ongoing dialogue that can extend far beyond the launch event.

“I came to know the people at Accusoft and got a sense of their professionalism and passion for their products and company. The event helped solidify our trust, knowing we chose the right partner who will support us during both good and more challenging times,” says Christopher Benes, VP of Software Engineering at Donnelley Financial Solutions, a member of the first BoC.

Customers outside of our founding board undoubtedly experience many of the same challenges that our focus group revealed, and we always welcome more input from our customers and prospects. We love to create and share case studies and testimonials about how our products deliver value, and the BoC will go a long way toward helping us craft compelling success stories. In addition, our clients found substantial value in the event as well.

“There were multiple benefits of this event. We had the ability to connect with the Accusoft team to understand the products, the company strategy, and the product roadmap. We gained a better relationship with Accusoft and feel we can now work together to help our clients,” reports Imran Aziz, Senior Technical Product Manager, HighQ Solutions and a member of the first BoC. “We also connected with other Accusoft clients to understand how they are using the products. We gained knowledge of when and whom to connect with in case there is an urgent requirement. Plus, we had the opportunity to explore Tampa for the first time.”

We’ve built our BoC with representation from a cross-section of the industries in our customer roster. This is to ensure our meetings, and the actions we take as a result of them, address the priorities of customers on our BoC and beyond. That includes SaaS companies that have CABs of their own. We are following best practices as we build and nurture our BoC in the months and years ahead.

Are you an Accusoft client and interested in participating in our next CAB, either to submit product feedback or learn about our product roadmap?

Contact us today!