Technical FAQs

Question

I have an evaluation license for PrizmDoc. Can I evaluate MSO features with this evaluation license?

Answer

No, regular PrizmDoc evaluation licenses do not have MSO functionality. They will instead use LibreOffice to convert documents. Contact an Accusoft Support Technician or your Account Representative to discuss evaluating PrizmDoc with MSO enabled.

Question

I am combining multiple PDF documents together, and I need to create a new bookmark collection, placed at the beginning of the new document. Each bookmark should go to a specific page or section of the new document.
Example structure:

  • Section 1
    • Document 1
  • Section 2
    • Document 2

How might I do this using ImageGear .NET?

Answer

The approach is to add a divider (header) page at the start of each section of the result document. For example, merging two documents into two sections, each containing a single document, gives this outline…

  • Section 1
    • Document 1
  • Section 2
    • Document 2

…The first page will be the first header page, followed by the pages of Document 1, then another header page, then the pages of Document 2. So, the first header page is at index 0, the first page of Document 1 is at index 1, the second header is at index 1 + firstDocumentPageCount, and the first page of Document 2 is at index 2 + firstDocumentPageCount.
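The same index arithmetic extends to any number of documents. The following is a minimal sketch, assuming exactly one divider page precedes each document; MergeLayout and ComputeStartPages are hypothetical helper names and are not part of ImageGear.

```csharp
using System.Collections.Generic;

public static class MergeLayout
{
    // For each source document, returns the zero-based index of its divider
    // (header) page and of its first content page in the merged document,
    // assuming exactly one divider page precedes each document.
    public static List<(int HeaderPage, int FirstPage)> ComputeStartPages(IReadOnlyList<int> pageCounts)
    {
        var result = new List<(int HeaderPage, int FirstPage)>();
        int nextIndex = 0;
        foreach (int pageCount in pageCounts)
        {
            int headerPage = nextIndex;        // divider page for this section
            int firstPage = headerPage + 1;    // first page of the document itself
            result.Add((headerPage, firstPage));
            nextIndex = firstPage + pageCount; // skip past this document's pages
        }
        return result;
    }
}
```

For two documents with 3 and 5 pages, this yields header/first-page pairs of (0, 1) and (4, 5), which are the values that would be used as bookmark targets for "Section X" and "Document X".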

The following code demonstrates adding some blank pages to igResultDocument, inserting pages from other ImGearPDFDocuments, and modifying the bookmark tree such that it matches the outline above, with "Section X" pointing to the corresponding divider page and "Document X" pointing to the appropriate starting page number…

// Create new document, add pages
ImGearPDFDocument igResultDocument = new ImGearPDFDocument();
igResultDocument.CreateNewPage((int)ImGearPDFPageNumber.BEFORE_FIRST_PAGE, new ImGearPDFFixedRect(0, 0, 300, 300));
igResultDocument.InsertPages((int)ImGearPDFPageNumber.LAST_PAGE, igFirstDocument, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearPDFInsertFlags.DEFAULT);
igResultDocument.CreateNewPage(igFirstDocument.Pages.Count, new ImGearPDFFixedRect(0, 0, 300, 300));
igResultDocument.InsertPages((int)ImGearPDFPageNumber.LAST_PAGE, igSecondDocument, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearPDFInsertFlags.DEFAULT);

// Add first Section
ImGearPDFBookmark resultBookmarkTree = igResultDocument.GetBookmark();
resultBookmarkTree.AddNewChild("Section 1");
var child = resultBookmarkTree.GetLastChild();
int targetPageNumber = 0;
setNewDestination(igResultDocument, targetPageNumber, child);

// Add first Document
child.AddNewChild("Document 1");
child = child.GetLastChild();
targetPageNumber = 1;
setNewDestination(igResultDocument, targetPageNumber, child);

// Add second Section
resultBookmarkTree.AddNewChild("Section 2");
child = resultBookmarkTree.GetLastChild();
targetPageNumber = 1 + igFirstDocument.Pages.Count;
setNewDestination(igResultDocument, targetPageNumber, child);

// Add second Document
child.AddNewChild("Document 2");
child = child.GetLastChild();
targetPageNumber = 2 + igFirstDocument.Pages.Count;
setNewDestination(igResultDocument, targetPageNumber, child);

// Save
using (FileStream stream = File.OpenWrite(@"C:\path\here\test.pdf"))
{
    igResultDocument.Save(stream, ImGearSavingFormats.PDF, 0, 0, igResultDocument.Pages.Count, ImGearSavingModes.OVERWRITE);
}

...

private ImGearPDFDestination setNewDestination(ImGearPDFDocument igPdfDocument, int targetPageNumber, ImGearPDFBookmark targetNode)
{
    ImGearPDFAction action = targetNode.GetAction();
    if (action == null)
    {
        action = new ImGearPDFAction(
            igPdfDocument,
            new ImGearPDFDestination(
                igPdfDocument,
                igPdfDocument.Pages[targetPageNumber] as ImGearPDFPage,
                new ImGearPDFAtom("XYZ"),
                new ImGearPDFFixedRect(), 0, targetPageNumber));
        targetNode.SetAction(action);
    }
    return action.GetDestination();
}

(The setNewDestination method is a custom method that abstracts the details of adding the new destination.)

Essentially, the GetBookmark() method returns an instance representing the root of the bookmark tree, whose children are themselves subtrees. Thus, we can add a new child to an empty tree, then retrieve it with GetLastChild(). We then set the action for that node to a new "GoTo" action that navigates to the specified destination. When saved to the file system, this should produce a PDF whose bookmark panel shows the Section 1 / Document 1 / Section 2 / Document 2 structure outlined above.

Note that you may need to use the native Save method (NOT SaveDocument), described in the product documentation, in order to save a PDF file with the bookmark tree included. You can read more about Actions in the PDF Specification.

Question

Using ScanFix Xpress (as illustrated in the ImageCleanUp sample) I can deskew an image, but the leftover blank space is filled with a user-specified pad color, which might clash horribly with the edges of the original image. Is it possible to automatically detect a matching pad color before executing a deskew operation?

Answer

A simple approach is to crop off the four edges of the image, each sized as a percentage of the width or height with a minimum pixel count as a floor. For each edge, use the RGBColorCount method from ImagXpress to generate a histogram for each color channel, and take the most frequent intensity, the average intensity, or some combination of the two. Then average the result across all four edges. The resulting color can be used as the pad color when the image is deskewed.

For example, you can crop out portions of an image using the Crop method of the Processor class…

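// Crop out the top edge of the image currently assigned to _processor.Image.
// inputImg is the source image; verticalSliceSize is the strip height in pixels.
Rectangle cropRectangle = new Rectangle(0, 0, inputImg.Width, verticalSliceSize);
_processor.Crop(cropRectangle);
return _processor.Image;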

We can do this for all four edges of the image. Then, for each edge, we can determine the frequencies at which each intensity occurs in the image’s pixel grid using the RGBColorCount Method…

int[] redHistogram, greenHistogram, blueHistogram;
_processor.Image = edge;
_processor.RGBColorCount(out redHistogram, out greenHistogram, out blueHistogram);

…now, redHistogram, greenHistogram, and blueHistogram will contain the frequencies of red, green, and blue intensities (0 to 255), respectively. We can use this data to derive either the most frequent or the average intensity (or some combination of the two) in each channel. We can then construct an RGB triplet representing the detected border color for each edge, and average the triplets across the four edges to get the overall pad color.

For example (using an average intensity)…

public int[] DetectEdgeAverageColor(ImageX edge)
{
    int[] averageRGB = new int[] { 0, 0, 0 };
    int[] redHistogram, greenHistogram, blueHistogram;
    _processor.Image = edge;
    _processor.RGBColorCount(out redHistogram, out greenHistogram, out blueHistogram);

    int numPixels = edge.Width * edge.Height;
    averageRGB[0] = findAverageIntensity(redHistogram, numPixels);
    averageRGB[1] = findAverageIntensity(greenHistogram, numPixels);
    averageRGB[2] = findAverageIntensity(blueHistogram, numPixels);

    return averageRGB;
}

private int findAverageIntensity(int[] frequencies, int numPixels)
{
    // Weighted sum of intensities; the multiplication is done in double
    // to avoid int overflow on large strips.
    double averageIntensity = 0;
    for (int intensityValue = 0; intensityValue < 256; intensityValue++)
    {
        int frequencyOfThisIntensity = frequencies[intensityValue];
        averageIntensity += (double)intensityValue * frequencyOfThisIntensity;
    }
    averageIntensity /= numPixels;
    return (int)Math.Round(averageIntensity);
}

This should produce an RGB triplet representing a color similar to the edges of the image to be deskewed.
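The two remaining pieces mentioned above, picking the most frequent intensity instead of the average and combining the four per-edge triplets, can be sketched in plain C# without any ImagXpress calls. PadColorMath and its method names are hypothetical, and the histograms are assumed to be the 256-entry arrays returned by RGBColorCount.

```csharp
using System;
using System.Collections.Generic;

public static class PadColorMath
{
    // Returns the intensity (array index) with the highest count in a histogram.
    // Ties resolve to the lowest intensity.
    public static int FindMostFrequentIntensity(int[] frequencies)
    {
        int best = 0;
        for (int i = 1; i < frequencies.Length; i++)
        {
            if (frequencies[i] > frequencies[best])
            {
                best = i;
            }
        }
        return best;
    }

    // Averages per-edge RGB triplets channel by channel into one pad color.
    public static int[] AverageEdgeColors(IReadOnlyList<int[]> edgeColors)
    {
        var sum = new double[3];
        foreach (int[] rgb in edgeColors)
        {
            for (int c = 0; c < 3; c++)
            {
                sum[c] += rgb[c];
            }
        }

        var average = new int[3];
        for (int c = 0; c < 3; c++)
        {
            average[c] = (int)Math.Round(sum[c] / edgeColors.Count);
        }
        return average;
    }
}
```

The most frequent intensity tends to ignore small amounts of dark content that touch the border, while the average is smoother on noisy scans; which works better depends on the images being processed.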

Question

Is it possible to automatically annotate a document, similar to the Auto-Redaction feature, using PrizmDoc?

Answer

PrizmDoc does not provide auto-annotation out of the box, but it can be built with some work: create a searchTask, then use its results to programmatically generate markup XML that can be used in the MarkupBurner.

To do this you would need to create a searchTask for the pattern you would like to annotate. You can then get the results of the searchTask as JSON which will contain all occurrences of that pattern/search. Each search result will include the selected text, the page on which it occurs, the starting index of the result, and the dimensions and coordinates of the bounding rectangles for that search result.

All this information can be used to construct the markup XML to add the annotations with the markup burner.

Once you have constructed the XML, POST it to the MarkupBurner as the request body to burn the annotations into the document.
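As an intermediate step before writing the XML, it can help to collect the bounding rectangles from the search results by page. The sketch below covers only that grouping step. HitRectangle and AnnotationPlanner are hypothetical names, and the property names are illustrative rather than PrizmDoc's actual JSON field names; the markup XML format itself is not shown and should be taken from the product documentation.

```csharp
using System.Collections.Generic;

// Hypothetical shape for one bounding rectangle taken from a search result;
// the property names are illustrative, not PrizmDoc's JSON field names.
public sealed class HitRectangle
{
    public int PageIndex { get; set; }
    public double X { get; set; }
    public double Y { get; set; }
    public double Width { get; set; }
    public double Height { get; set; }
}

public static class AnnotationPlanner
{
    // Groups rectangles by page (in ascending page order) so the annotations
    // for each page can be written into the markup together.
    public static SortedDictionary<int, List<HitRectangle>> GroupByPage(IEnumerable<HitRectangle> rectangles)
    {
        var byPage = new SortedDictionary<int, List<HitRectangle>>();
        foreach (HitRectangle rect in rectangles)
        {
            if (!byPage.TryGetValue(rect.PageIndex, out var pageList))
            {
                pageList = new List<HitRectangle>();
                byPage[rect.PageIndex] = pageList;
            }
            pageList.Add(rect);
        }
        return byPage;
    }
}
```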

Question

When licensing my PrizmDoc server, I get the error “Unable to write licensing information to the properties file.” Why is this happening?

Answer

To resolve this issue, please try the following:

  1. Re-run the Prizm Licensing Utility as an administrator.

  2. The Prizm Licensing Utility is writing to Prizm/prizm-services-config.yml. See whether you have permissions to edit this file.

  3. Check whether Prizm/prizm-services-config.yml is locked by another process. If you have it open in some text editing software, PrizmDoc may not be able to write to it.

Additionally, if you have an OEM key, you can just manually enter this key into the file by placing the following at the top:

license.solutionName: ENTER_YOUR_SOLUTION_NAME_HERE

license.key: 2.0…rest_of_the_key_goes_here

Question

For ImageGear .NET, what are the feature differences between an OCR Standard license, an OCR Plus license, and an OCR Asian license?

https://www.accusoft.com/products/imagegear-collection/imagegear-dot-net/#pricing

Answer

ImageGear’s OCR library has three different functionality options that you can choose for your website or application. The primary difference between the three options is the output formats created by the OCR engine. The options for your development are as follows:

  1. OCR Standard:
    The Standard edition creates output for Western languages such as English. It produces text-only output: searchable text PDFs and text documents.

  2. OCR Plus:
    The Plus edition creates formatted output for Western languages such as English. The formatted output is created with recognition technology that identifies font detail, locates image zones, and recognizes table structure in order to create a representation of the original document. The file formats it includes are Word, Excel, HTML, searchable PDF, and text documents.

  3. OCR Asian:
    The Asian edition creates formatted output for Asian languages such as Chinese, Japanese, and Korean. The formatted output is created with the same recognition technology as the Plus edition, which identifies font detail, locates image zones, and recognizes table structure in order to create a representation of the original document. The file formats it includes are Word, Excel, HTML, searchable PDF, and text documents.

Question

I encounter an Unhandled Exception error, as shown below, in ImageGear when trying to load a page into the recognition engine.

Error Message: An unhandled exception of type
‘ImageGear.Core.ImGearException’ occurred in ImageGear22.Core.dll

Additional information: IMG_DPI_WARN (0x4C711): Non-supported
resolution. Value1:0x4C711

What is causing this and how can I fix it?

Answer

This is probably because the original image used to create the page didn’t have a Resolution Unit set.

To fix this, check if the page has a Resolution Unit set. If it does not, set it to inches. You should also set the DPI of the image as those values were probably not carried over from the original image since the Resolution Unit wasn’t set. The following code demonstrates how to do this.

// Open file and load page.
using (var inStream = new FileStream(@"C:\Path\To\InputImage.jpg", FileMode.Open, FileAccess.Read, FileShare.Read))
{
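    // Load the first page. firstPage is the zero-based page index (0 for the first page),
    // and recognitionEngine (used below) is an already-initialized recognition engine.
    ImGearPage igPage = ImGearFileFormats.LoadPage(inStream, firstPage);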

    if (igPage.DIB.ImageResolution.Units == ImGearResolutionUnits.NO_ABS)
    {
        igPage.DIB.ImageResolution.Units = ImGearResolutionUnits.INCHES;
        igPage.DIB.ImageResolution.XNumerator = 300;
        igPage.DIB.ImageResolution.XDenominator = 1;
        igPage.DIB.ImageResolution.YNumerator = 300;
        igPage.DIB.ImageResolution.YDenominator = 1;
    }

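    // The recognized text is written to this stream, so use a text file;
    // FileMode.Create truncates any existing file instead of leaving stale bytes behind.
    using (var outStream = new FileStream(@"C:\Path\To\OutputText.txt", FileMode.Create, FileAccess.ReadWrite))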
    {
        // Import the page into the recognition engine.
        using (ImGearRecPage recognitionPage = recognitionEngine.ImportPage((ImGearRasterPage)igPage))
        {
            // Preprocess the page.
            recognitionPage.Image.Preprocess();

            // Perform recognition.
            recognitionPage.Recognize();

            // Write the page to the output file.
            recognitionEngine.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.SimpleText;
            recognitionEngine.OutputManager.WriteDirectText(recognitionPage, outStream);
        }
    }
}

Question

When using OCR in ImageGear .NET, is there any way to distinguish between a capital/uppercase letter O and the number 0?

Answer

Not without context or a font that makes the difference clear (such as one with a slashed 0). ImageGear will properly recognize Oliver and 1530 as containing O and 0, respectively, but cannot reliably distinguish the two when letters and numbers are mixed. That is, ImageGear may not reliably distinguish between 1ABO0F3 and 1AB0OF3.

Question

I am creating a viewing session from a local document on my server and providing an absolute path “C:\Users\Public\Documents\Accusoft\Common\Images\PdfDemoSample.pdf” as the fileName but I receive a 404 error. What could be the reason for this and how can I fix it?

Answer

For security reasons, PAS (PrizmDoc Application Services) rejects absolute paths to documents outside the directory specified by documents.path in the pcc.win.yml config file. Providing a path to any file outside that directory results in a 404 error.

We recommend that you set documents.path to the directory in which you store your documents. When you create a viewing session using a local document, you should set fileName to the path of the document relative to the documents directory.

If you prefer absolute paths, you can also set fileName to the absolute path of the document, provided the document is inside the documents directory specified in pcc.win.yml.
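To check ahead of time whether a given fileName would fall inside the documents directory, a containment test along these lines can be used in your own application. This is a sketch and not PAS's actual implementation: DocumentPathCheck is a hypothetical helper, and it assumes a runtime where Path.GetRelativePath is available (.NET Core 2.0 or later).

```csharp
using System;
using System.IO;

public static class DocumentPathCheck
{
    // Returns true when candidatePath stays inside documentsDirectory.
    // Relative candidate paths are resolved from documentsDirectory.
    public static bool IsInsideDocumentsDirectory(string documentsDirectory, string candidatePath)
    {
        string root = Path.GetFullPath(documentsDirectory);
        string full = Path.GetFullPath(Path.Combine(root, candidatePath));
        string relative = Path.GetRelativePath(root, full);

        // A path that escapes the root is expressed with a leading ".." segment,
        // or stays rooted when it lives on a different drive.
        return relative != ".."
            && !relative.StartsWith(".." + Path.DirectorySeparatorChar, StringComparison.Ordinal)
            && !Path.IsPathRooted(relative);
    }
}
```

With documents.path set to a directory such as C:\docs, a fileName of sample.pdf or C:\docs\sub\sample.pdf passes this check, while ..\secret.pdf or a path in another directory does not.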

Question

After applying a new license/evaluation license through the license utility on Linux, the following error appears in the logs:

{"gid":"","name":"OCS","time":"2019-01-3T18:26:39.368Z","pid":36875,"level":50,"tid":36875,"taskid":8,"FATAL ERROR":"MSO feature is active, but 'fidelity.msOfficeCluster.host' and 'fidelity.msOfficeCluster.port' are not configured, going to 'Unhealthy' state"}

What could cause this issue to occur, and how can it be fixed?

Answer

As you are running on Linux, the MSO switch on the license assumes that there are additional settings configured:

fidelity.msOfficeCluster.host and fidelity.msOfficeCluster.port

These settings are meant to point to a Windows server which has Microsoft Office 2013 or 2016 installed alongside PrizmDoc with MSO enabled. This is required for MSO functionality to be enabled.

If you wish to use the license with MSO enabled but do not have a separate Windows server, you can do the following to set the PrizmDoc service to run using LibreOffice:

  1. Make a backup of the /usr/share/prizm/prizm-services-config.yml file.
  2. Open the file in the text editor of your choice and find the line containing fidelity.msOfficeDocumentsRenderer: auto
  3. If the line is commented out, remove the hash and leading space in front of it, then change auto to libreoffice:
    fidelity.msOfficeDocumentsRenderer: libreoffice
  4. Restart the service by running /usr/share/prizm/scripts/pccis.sh restart

Question

Some of our users using Google Chrome have been reporting that PDF document loading and page rendering is extraordinarily slow. This is making the workflow unusable. What could have caused this issue to start occurring?

Answer

A bug in Google Chrome 71 caused this slowdown. It was resolved in Google Chrome 72 (released in Jan 2019).

If you are experiencing this PDF loading issue with PrizmDoc, and you are using the Google Chrome browser, please verify that you are using the latest stable version here:
https://www.google.com/chrome/

Question

How can I determine what version of PrizmDoc Viewer my server is running?

Answer

To check the server version, make a GET request to:

http://localhost:18681/PCCIS/V1/Service/Current/Info

You can make a GET request by navigating to the URL in your browser. The JSON response will have a "pccisVersion" property, which is the version number you are looking for. A similar GET request to the following URL returns the PAS version:

http://localhost:3000/info

The JSON response’s "version" property is what you are looking for. Keep in mind that differing version numbers don’t necessarily indicate a mismatch, as long as the major and minor version numbers match. For example, PCCIS version 13.5.33.5696 and PAS version 13.5.0000.1816 are from the same release (13.5).
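Once both version strings have been read from the two responses, the major and minor comparison can be automated. The following is a minimal sketch; ReleaseCheck and SameRelease are hypothetical names, and the inputs are assumed to be dotted numeric strings like the ones shown above.

```csharp
using System;

public static class ReleaseCheck
{
    // Returns true when two dotted version strings share the same
    // major and minor components (the first two numbers).
    public static bool SameRelease(string versionA, string versionB)
    {
        string[] a = versionA.Split('.');
        string[] b = versionB.Split('.');
        if (a.Length < 2 || b.Length < 2)
        {
            throw new ArgumentException("Expected at least major.minor in each version string.");
        }

        return int.Parse(a[0]) == int.Parse(b[0])
            && int.Parse(a[1]) == int.Parse(b[1]);
    }
}
```

Calling ReleaseCheck.SameRelease("13.5.33.5696", "13.5.0000.1816") returns true, since both strings begin with 13.5.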