Technical FAQs for "ImageGear.NET"

Question

My service is crashing in IIS, and I see an error saying w3wp.exe has crashed in my Event Viewer. Why is my service crashing?

Answer

Normally, when there is an error in the Event Viewer saying that w3wp.exe has crashed, this means that your application pool has crashed. So, by extension, anything running on IIS will stop working. There are many reasons why this could happen, so to find out the root cause, there are a few things that you can do:

  1. Look at the dump file generated by the Event Viewer
    that is tied to this specific error.
  2. Run this file through WinDbg to get a stack trace of the
    error.

These two things should give you the information you need to determine the root cause of the problem. Below is an online article explaining in more detail how to get these two things and possible causes of the crash.

http://blog.whitesites.com/Debugging-Faulting-Application-w3wp-exe-Crashes__634424707278896484_blog.htm

Question

How do I ensure temp files are deleted when closing ImageGear .NET?

Answer

All PDF objects are based on underlying low-level PDF objects that are not controlled by .NET resource manager and garbage collector. Because of this, each PDF object that is created from scratch should be explicitly disposed of using that object’s Dispose() method.

Also, any ImGearPDEContent object obtained from ImGearPDFPage should be released using the ImGearPDFPage.ReleaseContent() in all cases.

This should cause all temp files to be cleared when the application is closed.

Question

How do I use a Network Drive path for Image and ART storage in my ImageGear .NET web application?

Answer

In an ImageGear .NET web application, you have to define the location of the images and annotations directory in the storageRootPath and artStorageRootPath configuration property.
In the current version of ImageGear .NET, the storageRootPath and artStorageRootPath do not work with a network drive path \\SERVER-NAME\sharefilename.

The workaround for this would be to create a Symbolic link from a local directory to the network drive directory.

  • To create a symbolic link: Open “Command Prompt” as Administrator and type in > mklink /d "local path" \\SERVER-NAME\sharefilename
  • Pass in the path of the symbolic link as image or art storage root path in your web.config: storageRootPath="local path" artStorageRootPath="local path"
Question

Why do I get a “File Format Unrecognized” exception when trying to load a PDF document in ImageGear .NET?

Answer

You will need to set up your project to include PDF support if you want to work with PDF documents. Add a reference to ImageGear24.Formats.Pdf (if you’re using another version of ImageGear, make sure you’re adding the correct reference). Add the following line of code where you specify other resources:

using ImageGear.Formats.PDF;

Add the following lines of code before you begin working with PDFs:

ImGearFileFormats.Filters.Insert(0, ImGearPDF.CreatePDFFormat());
ImGearPDF.Initialize();

The documentation page linked here shows how to add PDF support to a project.

Question

How do I remove XMP Data from my image using ImageGear .NET?

Answer

When removing XMP data in ImageGear, the simplest way to do this is to set the XMP Metadata node to null, like so:

ImGearSimplifiedMetadata.Initialize(); 
doc.Metadata.XMP = new ImGearXMPMetadataRoot();

Or, you can traverse through the metadata tree and remove each node from the tree:

// Example code. Not thoroughly tested
private static void RemoveXmp(ImGearMetadataTree tree)
{
ArrayList toRemove = new ArrayList();
foreach (ImGearMetadataNode node in tree.Children)
{
    if (node is ImGearMetadataTree)
        RemoveXmp((ImGearMetadataTree)node);

    if (node.Format != ImGearMetadataFormats.XMP)
        continue;

    toRemove.Add(node);
}

foreach (ImGearMetadataNode node in toRemove)
    tree.Children.Remove(node);
}
Question

We are saving files to the PDF/A standard and are running into a few cases where the file cannot be saved as PDF/A by ImageGear .NET. Why is this, and how do we do it properly?

Answer

First, determine whether a PDF document can be converted to PDF/A by creating an ImGearPDFPreflight object from your document, and generating an ImGearPDFPreflightReport object from it:

using (ImGearPDFPreflight preflight = new ImGearPDFPreflight((ImGearPDFDocument)igDocument))
{
    report = preflight.VerifyCompliance(ImGearPDFPreflightProfile.PDFA_1A_2005, 0, -1);
}

The first argument of the VerifyCompliance() method is the standard of PDF/A you want to use. ImageGear .NET is currently able to convert documents to adhere to the PDF/A-1A and PDF/A-1B standards:

PDF/A-1 Standard

ImageGear and PDF/A

There are parts of the PDF/A-2 and PDF/A-3 standards which may allow for more documents to be converted, but ImageGear .NET currently does not support those. This could possibly be why your document cannot be converted in ImageGear .NET.

Once the report is generated, you can access its Status, which will tell you if the document is fixable. You can also access its Code which will let you know if it’s a fixed page or if it has issues; it will return Success if fixed, or some error code otherwise. You can check these conditions to determine whether it’s worth attempting to convert the document:

// If the document is not already PDFA-1a compliant but can be converted
if ((report.Code == ImGearPDFPreflightReportCodes.SUCCESS) ||
(report.Status == ImGearPDFPreflightStatusCode.Fixable))
{
    ImGearPDFPreflightConvertOptions pdfaOptions = new ImGearPDFPreflightConvertOptions(ImGearPDFPreflightProfile.PDFA_1A_2005, 0, -1);
    ImGearPDFPreflight preflight = new ImGearPDFPreflight((ImGearPDFDocument)igDocument);
    preflight.Convert(pdfaOptions);
    saveFile(outputPath, igDocument);
}

// Create error message if document was not converted.
else if (report.Status != ImGearPDFPreflightStatusCode.Fixed)
{
    printAllRecordDescriptions(report);
    throw new ApplicationException("Given PDF document cannot be converted to PDFA-1a standard.");
}

If you want more information on why a document may not be convertible, you can access the preflight report for its records and codes. A preflight’s "Records" member is a recursive list of preflight reports. A preflight report will have a list of reports under Records, and each of those reports may have more reports, etc. You can recursively loop through them as seen below to output every reason a document is not convertible:

    private static void printAllRecordDescriptions(StreamWriter file, ImGearPDFPreflightReport report)
    {
        foreach (ImGearPDFPreflightReport rep in report.Records)
        {
            file.WriteLine(rep.Description);
            file.WriteLine(rep.Code.ToString() + "\r\n");
            printAllRecordDescriptions(file, rep);
        }
    }

Ultimately, the failure of a document to convert to PDF/A is non-deterministic. While some compliance failures can be corrected, in combination they may not be correctable. Therefore, the unfortunate answer is that to determine if it can be converted, conversion must be attempted.

Question

In ImageGear, why am I running into AccessViolationExceptions when I run my application in parallel?

Answer

This issue can sometimes occur if ImGearPDF is being initialized earlier in the application. In order to use ImGearPDF in a multi-threaded program, it needs to be initialized on a per-thread basis. For example, if you have something like this:

ImGearPDF.Initialize();
Parallel.For(...)
{ 
    // OCR code
}
ImGearPDF.Terminate();

Change it to this:

Parallel.For(...)
{
    ImGearPDF.Initialize();
    // OCR code
    ImGearPDF.Terminate();
}

The same logic applies to other ImageGear classes, such as ImGearPage instances or the ImGearRecognition class – you should create one instance of each class per thread, rather than creating a single instance and accessing it across threads. In the case of the ImGearRecognition class, you’ll have to use the createUnique parameter to make that possible e.g.:

ImGearRecognition recEngine = ImGearRecognition(true);

instead of

ImGearRecognition recEngine = ImGearRecognition();
Question

I want to re-arrange the page order of a PDF. I’ve tried the following…

var page = imGearDocument.Pages[indx].Clone();

imGearDocument.Pages.RemoveAt(indx); //// Exception: "One or more pages are in use and could not be deleted."

imGearDocument.Pages.Insert(newIndx, page);

But an exception is thrown. Somehow, even though the page was cloned, the exception states that the page can’t be removed because it’s still in use.

What am I doing wrong here?

Answer

If you’re using an older version of ImageGear .NET, you may run into this exception when you clone the page. Some of the resources between the original and the clone are still shared, which is why this happens.

Starting with ImageGear .NET v24.8, this no longer happens, and the above code should work fine.

If you still need to use the earlier version, you can use the InsertPages method instead.

Question

ImageGear .NET v24.6 added support for viewing PDF documents with XFA content. I’m using v24.8, and upon trying to open an XFA PDF, I get a SEHException for some reason…

SEHException

Why might this be happening?

Answer

One reason could be because you need to execute the following lines after initializing the PDF component, and prior to loading an XFA PDF:

// Allow opening of PDF documents that contain XFA form data.
IImGearFormat pdfFormat = ImGearFileFormats.Filters.Get(ImGearFormats.PDF);
pdfFormat.Parameters.GetByName("XFAAllowed").Value = true;

This will enable XFA PDFs to be opened by the ImageGear .NET toolkit.

Question

For ImageGear .NET, what are the feature differences between an OCR Standard license, an OCR Plus license, and an OCR Asian license?

https://www.accusoft.com/products/imagegear-collection/imagegear-dot-net/#pricing

Answer

ImageGear’s OCR library has three different functionality options that you can choose for your website or application. The primary difference between the three options is the output formats created by the OCR engine. The options for your development are as follows:

  1. OCR Standard:
    The standard edition creates output formats for Western languages such as English. The standard edition outputs text only files and generates a PDF. The file formats it includes are searchable text PDFs and text documents.

  2. OCR Plus:
    The standard plus edition creates formatted outputs for Western languages like English. The formatted output is created with recognition technology that identifies font detail, locates image zones, and recognizes table structure in order to create a representation of the original document. The file formats it includes are Word, Excel, HTML, searchable PDF, and text documents.

  3. OCR Asian:
    The Asian edition creates a formatted output for Asian languages like Chinese, Japanese, and Korean. This formatted output is created with the same recognition technology as the Standard Plus that identifies font detail, locates image zones, and recognizes table structure. It also creates a representation of the original file. Formats include Word, Excel, HTML, searchable PDF, and text documents.

Question

I encounter an Unhandled Exception error, as shown below, in ImageGear when trying to load a page into the recognition engine.

Error Message: An unhandled exception of type
‘ImageGear.Core.ImGearException’ occurred in ImageGear22.Core.dll

Additional information: IMG_DPI_WARN (0x4C711): Non-supported
resolution. Value1:0x4C711

What is causing this and how can I fix it?

Answer

This is probably because the original image used to create the page didn’t have a Resolution Unit set.

Resolution unit not set in original image

To fix this, check if the page has a Resolution Unit set. If it does not, set it to inches. You should also set the DPI of the image as those values were probably not carried over from the original image since the Resolution Unit wasn’t set. The following code demonstrates how to do this.

// Open file and load page.
using (var inStream = new FileStream(@"C:\Path\To\InputImage.jpg", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    // Load first page.
    ImGearPage igPage = ImGearFileFormats.LoadPage(inStream, firstPage);

    if (igPage.DIB.ImageResolution.Units == ImGearResolutionUnits.NO_ABS)
    {
        igPage.DIB.ImageResolution.Units = ImGearResolutionUnits.INCHES;
        igPage.DIB.ImageResolution.XNumerator = 300;
        igPage.DIB.ImageResolution.XDenominator = 1;
        igPage.DIB.ImageResolution.YNumerator = 300;
        igPage.DIB.ImageResolution.YDenominator = 1;
    }

    using (var outStream = new FileStream(@"C:\Path\To\OutputImage.jpg", FileMode.OpenOrCreate, FileAccess.ReadWrite))
    {
        // Import the page into the recognition engine.
        using (ImGearRecPage recognitionPage = recognitionEngine.ImportPage((ImGearRasterPage)igPage))
        {
            // Preprocess the page.
            recognitionPage.Image.Preprocess();

            // Perform recognition.
            recognitionPage.Recognize();

            // Write the page to the output file.
            recognitionEngine.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.SimpleText;
            recognitionEngine.OutputManager.WriteDirectText(recognitionPage, outStream);
        }
    }
}
Question

When using OCR in ImageGear .NET, is there any way to distinguish between a capital/uppercase letter O and the number 0?

Answer

Not without context or a font that makes the difference clear (such as one with a slashed 0). ImageGear will properly recognize Oliver and 1530 as containing O and 0, respectively, but cannot reliably distinguish it when letters and numbers are mixed. That is, ImageGear may not reliably distinguish between 1ABO0F3 and 1AB0OF3.