Technical FAQs

Question

In ImageGear .NET, I am receiving the error “API_HARDTIMEOUT_ERR” when using Recognize() to OCR a document. What is happening and how can I fix it?

Sample case: I had a large PDF that I was processing page by page. The first 15 pages took six minutes, but the 16th page took three minutes and produced that error.

Answer

API_HARDTIMEOUT_ERR can occur when ImageGear takes too long to process your document. This tends to happen when the OCR process spends too much time on marks it mistakes for characters, which is very common in bitonal documents. Typical culprits are scan artifacts in damaged documents and visual noise (e.g., the distortion in a camera picture of a computer monitor). See the bottom of this page for an example.

For scanned bitonal documents, running a Despeckle operation on the page can help reduce the amount of noise obstructing the OCR process.

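// `p` is the page loaded earlier (not shown here). Rasterize it before despeckling.
ImGearRasterPage igRasterPage = p.Rasterize(1, 300, 300);
// Despeckle only if the operation is supported for this page.
if (ImGearRasterProcessing.Verifier.CanApplyDespeckle(igRasterPage))
    ImGearRasterProcessing.Despeckle(igRasterPage, 3, 3);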

Also, if converting documents to bitonal is part of your document workflow, ImageGear .NET has color-reduction methods that may produce a less damaged document, such as our Reduce method with configurable parameters. Alternatively, you could OCR the color document instead, likely with better results.

In the past, some users have had success adjusting the time-related parameters of the recognition engine. ImGearRecTradeoff and DecompMethod can be modified to trade accuracy for speed during the actual OCR process, and Locate can be used to identify existing text before recognition.

First Google Images result for "Badly Scanned Document"

Question

The logging for the ImageGear C & C++ Deployment Packaging Wizard (DPW) has shown different output for some components since v19.3. Why is this?

In ImageGear C & C++ v19.2 and prior, the DPW had additional logging information for the ARTX component in its deployment.log:

Deploying an application that uses the ARTXGUI library of ImageGear
ARTX Component requires the following merge modules to be installed:

Microsoft_VC90_CRT_x86_x64.msm

Microsoft_VC90_MFC_x86_x64.msm

But since v19.3, the logs are no longer telling me to install these modules. Is this a mistake, or are they no longer necessary?

Answer

This was an intentional change on our end, and the Deployment Packaging Wizard (DPW) is working as intended. We made some updates to the DPW in the latest release; one of them is that the CRM requirements for CORE (which is required in every project) now also cover the ARTX component. If the DPW does not say you need additional components to use the ARTX component, then you’ll be fine.

Question

When using OCR in ImageGear .NET, is there any way to distinguish between a capital/uppercase letter O and the number 0?

Answer

Not without context or a font that makes the difference clear (such as one with a slashed 0). ImageGear will properly recognize Oliver and 1530 as containing O and 0, respectively, but cannot reliably distinguish them when letters and numbers are mixed. That is, ImageGear may not reliably distinguish between 1ABO0F3 and 1AB0OF3.
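
Where the surrounding characters do provide context, a post-processing pass over the recognized text can apply it. The sketch below is a hypothetical helper and not part of ImageGear; the class and method names are assumptions made for illustration. It rewrites O to 0 only when every other character in the token is a digit, and 0 to O only when every other character is a letter. Mixed tokens such as 1ABO0F3 are left untouched, because no rule of this kind can settle them.

```java
public class OcrTokenFixer {
    // Hypothetical post-processing helper; not an ImageGear API.
    // Uses the unambiguous characters in a token to decide how to
    // treat the ambiguous ones ('O' and '0').
    public static String fixToken(String token) {
        boolean hasDigit = false;   // saw a digit other than '0'
        boolean hasLetter = false;  // saw a letter other than 'O'
        for (char c : token.toCharArray()) {
            if (c == 'O' || c == '0') {
                continue; // ambiguous characters carry no context
            }
            if (Character.isDigit(c)) {
                hasDigit = true;
            } else if (Character.isLetter(c)) {
                hasLetter = true;
            }
        }
        if (hasDigit && !hasLetter) {
            return token.replace('O', '0'); // numeric context
        }
        if (hasLetter && !hasDigit) {
            return token.replace('0', 'O'); // alphabetic context
        }
        return token; // mixed or no context: leave unchanged
    }

    public static void main(String[] args) {
        System.out.println(fixToken("153O"));
        System.out.println(fixToken("0liver"));
        System.out.println(fixToken("1ABO0F3"));
    }
}
```

A rule this simple only helps with purely numeric or purely alphabetic tokens; identifiers that mix both still need a slashed-zero font or field-level validation.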

Question

For ImageGear .NET, what are the feature differences between an OCR Standard license, an OCR Plus license, and an OCR Asian license?

https://www.accusoft.com/products/imagegear/pricing/

Answer

ImageGear’s OCR library has three different functionality options that you can choose for your website or application. The primary difference between the three options is the output formats created by the OCR engine. The options are as follows:

  1. OCR Standard:
    The Standard edition creates output for Western languages such as English. It produces text-only output; the file formats it includes are searchable text PDFs and text documents.

  2. OCR Plus:
    The Plus edition creates formatted output for Western languages such as English. The formatted output is created with recognition technology that identifies font detail, locates image zones, and recognizes table structure in order to create a representation of the original document. The file formats it includes are Word, Excel, HTML, searchable PDF, and text documents.

  3. OCR Asian:
    The Asian edition creates formatted output for Asian languages such as Chinese, Japanese, and Korean. This formatted output is created with the same recognition technology as OCR Plus, which identifies font detail, locates image zones, and recognizes table structure in order to create a representation of the original document. The file formats it includes are Word, Excel, HTML, searchable PDF, and text documents.

Question

How do I use a Network Drive path for Image and ART storage in my ImageGear .NET web application?

Answer

In an ImageGear .NET web application, you define the locations of the image and annotation directories in the storageRootPath and artStorageRootPath configuration properties.
In the current version of ImageGear .NET, storageRootPath and artStorageRootPath do not work with a network drive path such as \\SERVER-NAME\sharefilename.

The workaround is to create a symbolic link from a local directory to the network drive directory.

  • To create a symbolic link, open Command Prompt as Administrator and run: mklink /d "local path" \\SERVER-NAME\sharefilename
  • Pass the path of the symbolic link as the image or ART storage root path in your web.config: storageRootPath="local path" artStorageRootPath="local path"
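
Because a network path in either property does not work, it can help to catch one before deployment. The sketch below is a hypothetical stand-alone check, written in Java purely for illustration; it is not part of ImageGear, and the class and method names are assumptions. It reports whether a configured path string has the \\SERVER-NAME\share form.

```java
public class StoragePathCheck {
    // Hypothetical helper; not an ImageGear API.
    // Returns true when the path starts with two backslashes
    // (or two forward slashes), the form used by network share paths.
    public static boolean isUncPath(String path) {
        if (path == null) {
            return false;
        }
        String trimmed = path.trim();
        return trimmed.startsWith("\\\\") || trimmed.startsWith("//");
    }

    public static void main(String[] args) {
        // Example values only; substitute the paths from your web.config.
        String[] candidates = {
            "\\\\SERVER-NAME\\sharefilename",
            "C:\\ImageStore"
        };
        for (String candidate : candidates) {
            String verdict = isUncPath(candidate)
                ? "network path, use a symbolic link instead"
                : "local path";
            System.out.println(candidate + " -> " + verdict);
        }
    }
}
```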

Question

I encounter an Unhandled Exception error, as shown below, in ImageGear when trying to load a page into the recognition engine.

Error Message: An unhandled exception of type ‘ImageGear.Core.ImGearException’ occurred in ImageGear22.Core.dll

Additional information: IMG_DPI_WARN (0x4C711): Non-supported resolution. Value1:0x4C711

What is causing this and how can I fix it?

Answer

This is probably because the original image used to create the page didn’t have a Resolution Unit set.

To fix this, check whether the page has a Resolution Unit set. If it does not, set it to inches. You should also set the DPI of the image, because those values were probably not carried over from the original image when the Resolution Unit wasn’t set. The following code demonstrates how to do this.

// Open file and load page.
using (var inStream = new FileStream(@"C:\Path\To\InputImage.jpg", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    // Load first page. Page indexes are assumed to be zero-based here.
    int firstPage = 0;
    ImGearPage igPage = ImGearFileFormats.LoadPage(inStream, firstPage);

    if (igPage.DIB.ImageResolution.Units == ImGearResolutionUnits.NO_ABS)
    {
        igPage.DIB.ImageResolution.Units = ImGearResolutionUnits.INCHES;
        igPage.DIB.ImageResolution.XNumerator = 300;
        igPage.DIB.ImageResolution.XDenominator = 1;
        igPage.DIB.ImageResolution.YNumerator = 300;
        igPage.DIB.ImageResolution.YDenominator = 1;
    }

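    // Recognition output is text, so write it to a text file; FileMode.Create truncates any existing file.
    using (var outStream = new FileStream(@"C:\Path\To\OutputText.txt", FileMode.Create, FileAccess.ReadWrite))
    {
        // Import the page into the recognition engine.
        // recognitionEngine is assumed to be an already-initialized recognition engine instance (setup not shown).
        using (ImGearRecPage recognitionPage = recognitionEngine.ImportPage((ImGearRasterPage)igPage))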
        {
            // Preprocess the page.
            recognitionPage.Image.Preprocess();

            // Perform recognition.
            recognitionPage.Recognize();

            // Write the page to the output file.
            recognitionEngine.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.SimpleText;
            recognitionEngine.OutputManager.WriteDirectText(recognitionPage, outStream);
        }
    }
}

Question

ImageGear .NET v24.6 added support for viewing PDF documents with XFA content. I’m using v24.8, and upon trying to open an XFA PDF, I get a SEHException for some reason…

Why might this be happening?

Answer

One possible reason is that the following lines must be executed after initializing the PDF component and before loading an XFA PDF:

// Allow opening of PDF documents that contain XFA form data.
IImGearFormat pdfFormat = ImGearFileFormats.Filters.Get(ImGearFormats.PDF);
pdfFormat.Parameters.GetByName("XFAAllowed").Value = true;

This will enable XFA PDFs to be opened by the ImageGear .NET toolkit.

Question

I want to re-arrange the page order of a PDF. I’ve tried the following…

var page = imGearDocument.Pages[indx].Clone();

imGearDocument.Pages.RemoveAt(indx); // Exception: "One or more pages are in use and could not be deleted."

imGearDocument.Pages.Insert(newIndx, page);

But an exception is thrown. Somehow, even though the page was cloned, the exception states that the page can’t be removed because it’s still in use.

What am I doing wrong here?

Answer

If you’re using ImageGear .NET v23 (and possibly earlier versions), you’ll run into this exception when you clone the page. I believe some of the resources are still shared between the original and the clone, which is why this happens.

Starting with ImageGear .NET v24.8, this no longer happens, and your code should work fine.

If you still need to use the earlier version, you can use the InsertPages method instead.
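
Whichever approach you use, keep in mind that with standard list semantics, removing an item shifts the index of every later item, so newIndx is interpreted against the collection as it stands after the removal. The sketch below illustrates this on a plain java.util.List standing in for the page collection. It uses no ImageGear API, and the move helper is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageReorder {
    // Hypothetical helper operating on an ordinary list, not on an
    // ImageGear document. Removes the item at 'from' and re-inserts it
    // at 'to', where 'to' is an index in the list after the removal.
    public static <T> void move(List<T> pages, int from, int to) {
        T page = pages.remove(from);
        pages.add(to, page);
    }

    public static void main(String[] args) {
        List<String> pages = new ArrayList<>(Arrays.asList("A", "B", "C", "D"));
        move(pages, 0, 2);
        System.out.println(pages); // prints [B, C, A, D]
    }
}
```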

Java PDF Project with ImageGear
In this tutorial, you’ll learn how to configure a Java project for a console application. You’ll also learn how to open a PDF and save it as a new file.

    1. Make sure that you have installed the JDK and ImageGear for Java PDF properly. See System Requirements and Installation. You will need to copy a sample.pdf file into the directory where you will be creating the tutorial sample.
    2. Create a new Java file, and name it (e.g., MyFirstIGJavaPDFProject.java). Insert the following code there:
import com.accusoft.imagegearpdf.*;
public class MyFirstIGJavaPDFProject
{
   private PDF pdf;
   private Document document;
   static
   {
       System.loadLibrary("IgPdf");
   }
   // Application entry point.
   public static void main(String[] args)
   {
       boolean linearized = false;
       String inputPath = "sample.pdf";
       String outputPath = "sample_output.pdf";
       MyFirstIGJavaPDFProject app = new MyFirstIGJavaPDFProject();
       app.loadAndSave(inputPath, outputPath, linearized);
   }
   // Load and save the PDF file.
   private void loadAndSave(String inputPath, String outputPath, boolean linearized)
   {
       try
       {
           this.initializePdf();
           this.openPdf(inputPath);
           this.savePdf(outputPath, linearized);
       }
       catch (Throwable ex)
       {
           System.err.println("Exception: " + ex.toString());
       }
       finally
       {
           this.terminatePdf();
       }
   }
   // Initialize the PDF session.
   private void initializePdf()
   {
       this.pdf = new PDF();
       this.pdf.initialize();
   }
   // Open input PDF document.
   private void openPdf(String inputPath)
   {
       this.document = new Document();
       this.document.openDocument(inputPath);
   }
   // Save PDF document to the output path.
   private void savePdf(String outputPath, boolean linearized)
   {
       SaveOptions saveOptions = new SaveOptions();
       // Set LINEARIZED attribute as provided by the user.
       saveOptions.setLinearized(linearized);
       this.document.saveDocument(outputPath, saveOptions);
   }
   // Close the PDF document and terminate the PDF session.
   private void terminatePdf()
   {
       if (this.document != null)
       {
           this.document.closeDocument();
           this.document = null;
       }
       if (this.pdf != null)
       {
           this.pdf.terminate();
           this.pdf = null;
       }
   }
}
  3. Now, let’s go over some of the important areas in the sample code in more detail. The com.accusoft.imagegearpdf package:
    • Allows you to load and save native PDF documents
    • Allows rasterization of PDF pages by converting them to bitmaps and adding raster pages to a PDF document
    • Provides multi-page read and write support for the entire document

    To use the com.accusoft.imagegearpdf package in your project, specify the following import statement:

    import com.accusoft.imagegearpdf.*;

    To initialize the PDF session and support processing of PDF files, we need:

    // Initialize the PDF session.
    private void initializePdf()
    {
       this.pdf = new PDF();
       this.pdf.initialize();
    }

  4. There is one main object used in this sample code: the Document, which holds the entire loaded document.
    private Document document;
    …
    this.document = new Document();
    this.document.openDocument(inputPath);

  5. You can save the loaded document using:
    SaveOptions saveOptions = new SaveOptions();
    // Set LINEARIZED attribute as provided by the user.
    saveOptions.setLinearized(linearized);
    this.document.saveDocument(outputPath, saveOptions);

    See About Linearized PDF Files for more information.

  6. Now, you can build and run your sample. Please make sure that you have a PDF file named sample.pdf in the same directory where your sample source resides, or change the inputPath in the sample code so that it points to any existing PDF file.
  7. Now, open the terminal in the directory containing your source file and run the following commands:
    1. Compile
      javac -classpath $HOME/Accusoft/ImageGearJavaPDF1-64/java/IgPdf.jar MyFirstIGJavaPDFProject.java
      

      After running this command, you should see a file named MyFirstIGJavaPDFProject.class in your current directory.

    2. Build
      jar cfe MyFirstIGJavaPDFProject.jar MyFirstIGJavaPDFProject MyFirstIGJavaPDFProject.class
      

      After running this command, you should see a file named MyFirstIGJavaPDFProject.jar in your current directory.

    3. Set the environment variable (you only have to do this once per terminal session)
      export LD_PRELOAD=$HOME/Accusoft/ImageGearJavaPDF1-64/lib/libIGCORE18.so
      
    4. Run
      java -classpath $HOME/Accusoft/ImageGearJavaPDF1-64/java/IgPdf.jar:. MyFirstIGJavaPDFProject
      

After running your sample, you should see a new PDF file, named sample_output.pdf, in your current directory.
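
A missing input file is an easy mistake at this point, so a small pre-check can make the failure clearer. The sketch below uses only the standard java.nio.file API; the class and method names are hypothetical and are not part of the ImageGear for Java PDF library. It takes the input path from the first command-line argument when one is given, falls back to sample.pdf otherwise, and verifies that the file is readable before any PDF session would be initialized.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class InputPathCheck {
    // Hypothetical helper: use the first argument if present,
    // otherwise the tutorial's default file name.
    public static String chooseInputPath(String[] args) {
        return (args != null && args.length > 0) ? args[0] : "sample.pdf";
    }

    // Returns true only for an existing, readable regular file.
    public static boolean isReadableFile(String path) {
        Path p = Paths.get(path);
        return Files.isRegularFile(p) && Files.isReadable(p);
    }

    public static void main(String[] args) {
        String inputPath = chooseInputPath(args);
        if (!isReadableFile(inputPath)) {
            System.err.println("Input PDF not found or not readable: "
                + Paths.get(inputPath).toAbsolutePath());
            System.exit(1);
        }
        System.out.println("Using input file: " + inputPath);
    }
}
```

The same two checks could be placed at the top of main in MyFirstIGJavaPDFProject, before loadAndSave is called.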