I have an evaluation license for PrizmDoc. Can I evaluate MSO features with this evaluation license?
No, regular PrizmDoc evaluation licenses do not have MSO functionality. They will instead use LibreOffice to convert documents. Contact an Accusoft Support Technician or your Account Representative to discuss evaluating PrizmDoc with MSO enabled.
I am combining multiple PDF documents together, and I need to create a new bookmark collection, placed at the beginning of the new document. Each bookmark should go to a specific page or section of the new document. Example structure:
How might I do this using ImageGear .NET?
The approach is to add section divider pages to the result document. For example, if you merge two documents, you might have two sections, each containing a single document, like so:
Section 1
    Document 1
Section 2
    Document 2
…The first page will be the first header page, and then the pages of Document 1, then another header page, then the pages of Document 2. So, the first header page is at index 0, the first page of Document 1 is at index 1, the second header is at 1 + firstDocumentPageCount, etc.
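The index arithmetic above generalizes to any number of source documents. The following is a minimal sketch in Python (independent of the ImageGear API), assuming exactly one divider page is placed immediately before each source document. The function name `bookmark_targets` is hypothetical and used only for illustration.

```python
def bookmark_targets(page_counts):
    """Return (section_page, document_page) index pairs, one per source document.

    Assumes one divider page sits immediately before each source document,
    so each earlier document contributes its own pages plus one divider.
    """
    targets = []
    next_index = 0  # zero-based index of the next page to be added
    for count in page_counts:
        section_page = next_index        # divider page ("Section N")
        document_page = next_index + 1   # first page of the source document
        targets.append((section_page, document_page))
        next_index = document_page + count
    return targets


# Two source documents with 3 and 5 pages:
print(bookmark_targets([3, 5]))  # [(0, 1), (4, 5)]
```

With a 3-page first document, the second divider lands at index 4, which is 1 + firstDocumentPageCount, and the second document starts at index 5.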
The following code demonstrates adding some blank pages to igResultDocument, inserting pages from other ImGearPDFDocuments, and modifying the bookmark tree such that it matches the outline above, with "Section X" pointing to the corresponding divider page and "Document X" pointing to the appropriate starting page number…
    // Create new document, add pages
    ImGearPDFDocument igResultDocument = new ImGearPDFDocument();
    igResultDocument.CreateNewPage((int)ImGearPDFPageNumber.BEFORE_FIRST_PAGE, new ImGearPDFFixedRect(0, 0, 300, 300));
    igResultDocument.InsertPages((int)ImGearPDFPageNumber.LAST_PAGE, igFirstDocument, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearPDFInsertFlags.DEFAULT);
    igResultDocument.CreateNewPage(igFirstDocument.Pages.Count, new ImGearPDFFixedRect(0, 0, 300, 300));
    igResultDocument.InsertPages((int)ImGearPDFPageNumber.LAST_PAGE, igSecondDocument, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearPDFInsertFlags.DEFAULT);

    // Add first Section
    ImGearPDFBookmark resultBookmarkTree = igResultDocument.GetBookmark();
    resultBookmarkTree.AddNewChild("Section 1");
    var child = resultBookmarkTree.GetLastChild();
    int targetPageNumber = 0;
    setNewDestination(igResultDocument, targetPageNumber, child);

    // Add first Document
    child.AddNewChild("Document 1");
    child = child.GetLastChild();
    targetPageNumber = 1;
    setNewDestination(igResultDocument, targetPageNumber, child);

    // Add second Section
    resultBookmarkTree.AddNewChild("Section 2");
    child = resultBookmarkTree.GetLastChild();
    targetPageNumber = 1 + igFirstDocument.Pages.Count;
    setNewDestination(igResultDocument, targetPageNumber, child);

    // Add second Document
    child.AddNewChild("Document 2");
    child = child.GetLastChild();
    targetPageNumber = 2 + igFirstDocument.Pages.Count;
    setNewDestination(igResultDocument, targetPageNumber, child);

    // Save
    using (FileStream stream = File.OpenWrite(@"C:\path\here\test.pdf"))
    {
        igResultDocument.Save(stream, ImGearSavingFormats.PDF, 0, 0, igResultDocument.Pages.Count, ImGearSavingModes.OVERWRITE);
    }

    ...

    private ImGearPDFDestination setNewDestination(ImGearPDFDocument igPdfDocument, int targetPageNumber, ImGearPDFBookmark targetNode)
    {
        ImGearPDFAction action = targetNode.GetAction();
        if (action == null)
        {
            action = new ImGearPDFAction(
                igPdfDocument,
                new ImGearPDFDestination(
                    igPdfDocument,
                    igPdfDocument.Pages[targetPageNumber] as ImGearPDFPage,
                    new ImGearPDFAtom("XYZ"),
                    new ImGearPDFFixedRect(),
                    0,
                    targetPageNumber));
            targetNode.SetAction(action);
        }
        return action.GetDestination();
    }
(The setNewDestination method is a custom method that abstracts the details of adding the new destination.)
Essentially, the GetBookmark() method returns an instance representing the root of the bookmark tree, whose children are subtrees themselves. Thus, we can add a new child to an empty tree, then retrieve it with GetLastChild(). Then, we can set the action for that node to a new "GoTo" action that navigates to the specified destination. Once saved to the file system, this should produce a PDF whose bookmark tree has a "Section" entry for each divider page, with the matching "Document" entry nested beneath it.
Note that you may need to use the native Save method (NOT SaveDocument) described in the product documentation in order to save a PDF file with the bookmark tree included. You can also read more about Actions in the PDF Specification.
Using ScanFix Xpress (as illustrated in the ImageCleanUp sample) I can deskew an image, but the leftover blank space is filled with a user-specified pad color, which might clash horribly with the edges of the original image. Is it possible to automatically detect a matching pad color before executing a deskew operation?
A simple approach is to crop a thin strip from each of the four edges of the image, sized as a percentage of the width or height with a minimum pixel count as a floor. Then use the RGBColorCount method from ImagXpress on each strip to generate a histogram for each color channel, and take the most frequent intensity, the average intensity, or some combination of the two. Averaging those values across all four edges gives a color that can be used as the pad color when the image is deskewed.
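The strip sizing rule can be sketched as follows. This is a minimal Python illustration that does not use the ImagXpress API. The function name `edge_rectangles` and the default values (5 percent, 4-pixel floor) are assumptions chosen for the example, not recommended settings.

```python
def edge_rectangles(width, height, percent=5, min_pixels=4):
    """Return (x, y, w, h) crop rectangles for the four edges of an image.

    Strip thickness is a percentage of the relevant dimension, never less
    than min_pixels and never more than the dimension itself.
    """
    strip_h = min(height, max(min_pixels, height * percent // 100))
    strip_w = min(width, max(min_pixels, width * percent // 100))
    return {
        "top": (0, 0, width, strip_h),
        "bottom": (0, height - strip_h, width, strip_h),
        "left": (0, 0, strip_w, height),
        "right": (width - strip_w, 0, strip_w, height),
    }


rects = edge_rectangles(1000, 800)
print(rects["top"])    # (0, 0, 1000, 40)
print(rects["right"])  # (950, 0, 50, 800)
```

On very small images the minimum pixel count takes over. For a 40 x 30 image, 5 percent would be only 1 or 2 pixels, so each strip is 4 pixels thick instead.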
For example, you can crop out portions of an image using the Crop method of the Processor class…
    // Crop out the top edge of the image held in _processor.Image.
    // inputImg is the source image; verticalSliceSize is the strip height in pixels.
    Rectangle cropRectangle = new Rectangle(0, 0, inputImg.Width, verticalSliceSize);
    _processor.Crop(cropRectangle);
    return _processor.Image;
We can do this for all four edges of the image. Then, for each edge, we can determine the frequencies at which each intensity occurs in the image’s pixel grid using the RGBColorCount Method…
    int[] redHistogram, greenHistogram, blueHistogram;
    _processor.Image = edge;
    _processor.RGBColorCount(out redHistogram, out greenHistogram, out blueHistogram);
…now, redHistogram, greenHistogram, and blueHistogram will contain the frequencies of red, green, and blue intensities (0 to 255), respectively. We can use this data to extrapolate either the most frequent or the average intensity (or some combination of the two) in each channel. We can then construct RGB triplets representing the detected border color for that edge, and then average the values for each edge to get the appropriate overall pad color.
For example (using an average intensity)…
    public int[] DetectEdgeAverageColor(ImageX edge)
    {
        int[] averageRGB = new int[] { 0, 0, 0 };
        int[] redHistogram, greenHistogram, blueHistogram;
        _processor.Image = edge;
        _processor.RGBColorCount(out redHistogram, out greenHistogram, out blueHistogram);
        int numPixels = edge.Width * edge.Height;
        averageRGB[0] = findAverageIntensity(redHistogram, numPixels);
        averageRGB[1] = findAverageIntensity(greenHistogram, numPixels);
        averageRGB[2] = findAverageIntensity(blueHistogram, numPixels);
        return averageRGB;
    }
…
    private int findAverageIntensity(int[] frequencies, int numPixels)
    {
        // Weighted sum of intensities, divided by the pixel count.
        double averageIntensity = 0;
        for (int intensityValue = 0; intensityValue < 256; intensityValue++)
        {
            int frequencyOfThisIntensity = frequencies[intensityValue];
            averageIntensity += (intensityValue * frequencyOfThisIntensity);
        }
        averageIntensity /= numPixels;
        return (int)Math.Round(averageIntensity);
    }
This should produce an RGB triplet representing a color similar to the edges of the image to be deskewed.
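The two remaining pieces, picking the most frequent intensity instead of the average and combining the four per-edge colors into one pad color, can be sketched as follows. This is plain Python for illustration, independent of ImagXpress. The function names are hypothetical, and the histograms are assumed to be 256-entry frequency lists like those described above.

```python
def most_frequent_intensity(histogram):
    """Return the intensity (list index) with the highest frequency.

    On a tie, the lowest intensity wins, because max() keeps the first
    maximum it encounters.
    """
    return max(range(len(histogram)), key=histogram.__getitem__)


def pad_color(edge_colors):
    """Average per-edge (R, G, B) triplets channel by channel."""
    count = len(edge_colors)
    return tuple(
        round(sum(color[channel] for color in edge_colors) / count)
        for channel in range(3)
    )


# One channel histogram: 10 pixels at intensity 200, 3 pixels at intensity 10.
histogram = [0] * 256
histogram[200] = 10
histogram[10] = 3
print(most_frequent_intensity(histogram))  # 200

# Detected colors for the top, bottom, left and right edges.
edges = [(250, 248, 240), (254, 252, 244), (252, 250, 242), (252, 250, 242)]
print(pad_color(edges))  # (252, 250, 242)
```

The most frequent intensity tends to ignore stray dark marks near the border, while the average is smoother on noisy scans. Which one works better depends on the source images.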
Is it possible to automatically annotate a document, similar to the Auto-Redaction feature, using PrizmDoc?
Auto-annotation isn’t available out of the box, but it can be done with some work. It involves creating a searchTask and using its results to programmatically create markup XML that can be used with the MarkupBurner.
To do this you would need to create a searchTask for the pattern you would like to annotate. You can then get the results of the searchTask as JSON which will contain all occurrences of that pattern/search. Each search result will include the selected text, the page on which it occurs, the starting index of the result, and the dimensions and coordinates of the bounding rectangles for that search result.
All this information can be used to construct the markup XML to add the annotations with the markup burner.
Once you have constructed the XML, POST it to the MarkupBurner as the request body to burn the markup into the document.
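As a starting point for turning search results into markup, the following Python sketch groups the bounding rectangles of each search hit by page. The JSON field names used here (`results`, `pageIndex`, `text`, `boundingRectangle` with `x`, `y`, `width`, `height`) are assumptions made for illustration and should be checked against the JSON your searchTask actually returns. The markup XML schema itself is defined in the PrizmDoc documentation and is not reproduced here.

```python
import json
from collections import defaultdict


def rectangles_by_page(search_results_json):
    """Group the bounding rectangle of every search hit by page index.

    Field names are assumed for illustration; verify them against the
    JSON returned by your own search task.
    """
    pages = defaultdict(list)
    for result in json.loads(search_results_json)["results"]:
        rect = result["boundingRectangle"]
        pages[result["pageIndex"]].append(
            (rect["x"], rect["y"], rect["width"], rect["height"])
        )
    return dict(pages)


# Hypothetical search output with two hits on page 0 and one on page 2.
sample = json.dumps({"results": [
    {"pageIndex": 0, "text": "123-45-6789",
     "boundingRectangle": {"x": 72, "y": 100, "width": 90, "height": 12}},
    {"pageIndex": 0, "text": "987-65-4321",
     "boundingRectangle": {"x": 72, "y": 140, "width": 90, "height": 12}},
    {"pageIndex": 2, "text": "111-22-3333",
     "boundingRectangle": {"x": 300, "y": 50, "width": 90, "height": 12}},
]})
print(rectangles_by_page(sample)[0])  # [(72, 100, 90, 12), (72, 140, 90, 12)]
```

Each page's rectangle list can then be written out as one annotation per rectangle in whatever markup format the MarkupBurner expects.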
When licensing my PrizmDoc server, I get the error “Unable to write licensing information to the properties file.” Why is this happening?
To resolve this issue, please try the following:
Re-run the Prizm Licensing Utility as an administrator.
The Prizm Licensing Utility writes to Prizm/prizm-services-config.yml. Check that you have permission to edit this file.
Check whether Prizm/prizm-services-config.yml is locked by another process. If you have it open in some text editing software, PrizmDoc may not be able to write to it.
Additionally, if you have an OEM key, you can just manually enter this key into the file by placing the following at the top:
license.solutionName: ENTER_YOUR_SOLUTION_NAME_HERE
license.key: 2.0…rest_of_the_key_goes_here
For ImageGear .NET, what are the feature differences between an OCR Standard license, an OCR Plus license, and an OCR Asian license?
https://www.accusoft.com/products/imagegear-collection/imagegear-dot-net/#pricing
ImageGear’s OCR library has three different functionality options that you can choose for your website or application. The primary difference between the three options is the output formats created by the OCR engine. The options for your development are as follows:
OCR Standard: The Standard edition creates output for Western languages such as English. It produces text-only output; the supported file formats are searchable text PDFs and text documents.
OCR Plus: The Plus edition creates formatted output for Western languages such as English. The formatted output is created with recognition technology that identifies font detail, locates image zones, and recognizes table structure in order to create a representation of the original document. The supported file formats are Word, Excel, HTML, searchable PDF, and text documents.
OCR Asian: The Asian edition creates formatted output for Asian languages such as Chinese, Japanese, and Korean. It uses the same recognition technology as OCR Plus to identify font detail, locate image zones, and recognize table structure, and it likewise creates a representation of the original document. The supported file formats are Word, Excel, HTML, searchable PDF, and text documents.
I encounter an Unhandled Exception error, as shown below, in ImageGear when trying to load a page into the recognition engine.
Error Message: An unhandled exception of type ‘ImageGear.Core.ImGearException’ occurred in ImageGear22.Core.dll
Additional information: IMG_DPI_WARN (0x4C711): Non-supported resolution. Value1:0x4C711
What is causing this and how can I fix it?
This is probably because the original image used to create the page didn’t have a Resolution Unit set.
To fix this, check if the page has a Resolution Unit set. If it does not, set it to inches. You should also set the DPI of the image as those values were probably not carried over from the original image since the Resolution Unit wasn’t set. The following code demonstrates how to do this.
    // Open file and load page.
    // firstPage and recognitionEngine are assumed to be defined elsewhere.
    using (var inStream = new FileStream(@"C:\Path\To\InputImage.jpg", FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        // Load first page.
        ImGearPage igPage = ImGearFileFormats.LoadPage(inStream, firstPage);

        // If no Resolution Unit is set, use inches and 300 x 300 DPI.
        if (igPage.DIB.ImageResolution.Units == ImGearResolutionUnits.NO_ABS)
        {
            igPage.DIB.ImageResolution.Units = ImGearResolutionUnits.INCHES;
            igPage.DIB.ImageResolution.XNumerator = 300;
            igPage.DIB.ImageResolution.XDenominator = 1;
            igPage.DIB.ImageResolution.YNumerator = 300;
            igPage.DIB.ImageResolution.YDenominator = 1;
        }

        using (var outStream = new FileStream(@"C:\Path\To\OutputImage.jpg", FileMode.OpenOrCreate, FileAccess.ReadWrite))
        {
            // Import the page into the recognition engine.
            using (ImGearRecPage recognitionPage = recognitionEngine.ImportPage((ImGearRasterPage)igPage))
            {
                // Preprocess the page.
                recognitionPage.Image.Preprocess();

                // Perform recognition.
                recognitionPage.Recognize();

                // Write the page to the output file.
                recognitionEngine.OutputManager.DirectTextFormat = ImGearRecDirectTextFormat.SimpleText;
                recognitionEngine.OutputManager.WriteDirectText(recognitionPage, outStream);
            }
        }
    }
When using OCR in ImageGear .NET, is there any way to distinguish between a capital/uppercase letter O and the number 0?
Not without context or a font that makes the difference clear (such as one with a slashed 0). ImageGear will properly recognize Oliver and 1530 as containing O and 0, respectively, but cannot reliably tell the two characters apart when letters and numbers are mixed. That is, ImageGear may not reliably distinguish between 1ABO0F3 and 1AB0OF3.
I am creating a viewing session from a local document on my server and providing an absolute path “C:\Users\Public\Documents\Accusoft\Common\Images\PdfDemoSample.pdf” as the fileName but I receive a 404 error. What could be the reason for this and how can I fix it?
For security reasons, PAS does not accept absolute paths to documents outside the directory specified by the documents.path setting in the pcc.win.yml config file. Providing a path to any file outside of that directory results in a 404 error.
We recommend that you set documents.path to the directory in which you store your documents. When you create a viewing session using a local document, you should set fileName to the path of the document relative to the documents directory.
If you prefer absolute paths, you can also set fileName to the absolute path of the document, provided the document is contained in the documents directory specified in pcc.win.yml.
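The containment rule can be checked before creating a viewing session. Below is a minimal Python sketch, assuming a purely lexical comparison of Windows paths. It does not reproduce how PAS itself validates paths. The function name `relative_file_name` and the example directories are hypothetical.

```python
from pathlib import PureWindowsPath


def relative_file_name(documents_path, document):
    """Return the document path relative to documents_path, or None.

    None means the document lies outside the documents directory (or the
    path contains '..'), which is the case that would produce a 404.
    """
    docs = PureWindowsPath(documents_path)
    doc = PureWindowsPath(document)
    if ".." in doc.parts:
        return None  # refuse paths that could climb out of the directory
    if not doc.is_absolute():
        return str(doc)  # already relative to the documents directory
    try:
        return str(doc.relative_to(docs))
    except ValueError:
        return None


sample = r"C:\Users\Public\Documents\Accusoft\Common\Images\PdfDemoSample.pdf"

# documents.path points at the folder that contains the file:
print(relative_file_name(r"C:\Users\Public\Documents\Accusoft\Common\Images", sample))
# PdfDemoSample.pdf

# documents.path points somewhere else, so the file is out of reach:
print(relative_file_name(r"C:\Some\Other\Folder", sample))
# None
```

A `None` result indicates that either documents.path needs to change or the document needs to be moved into the documents directory.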
After applying a new license/evaluation license through the license utility on Linux, the following error appears in the logs:
{"gid":"","name":"OCS","time":"2019-01-3T18:26:39.368Z","pid":36875,"level":50,"tid":36875,"taskid":8,"FATAL ERROR":"MSO feature is active, but 'fidelity.msOfficeCluster.host' and 'fidelity.msOfficeCluster.port' are not configured, going to 'Unhealthy' state"}
What could cause this issue to occur, and how can it be fixed?
As you are running on Linux, the MSO switch on the license assumes that there are additional settings configured:
fidelity.msOfficeCluster.host and fidelity.msOfficeCluster.port
These settings are meant to point to a Windows server which has Microsoft Office 2013 or 2016 installed alongside PrizmDoc with MSO enabled. This is required for MSO functionality to be enabled.
If you wish to use the license with MSO enabled but do not have a separate Windows server, you can do the following to set the PrizmDoc service to run using LibreOffice:
Open /usr/share/prizm/prizm-services-config.yml and change the line
fidelity.msOfficeDocumentsRenderer: auto
to
fidelity.msOfficeDocumentsRenderer: libreoffice
Some of our users using Google Chrome have been reporting that PDF document loading and page rendering is extraordinarily slow. This is making the workflow unusable. What could have caused this issue to start occurring?
A bug in Google Chrome 71 caused this slowdown. It was resolved in Google Chrome 72 (released in January 2019).
If you are experiencing this PDF loading issue with PrizmDoc in Google Chrome, please verify that you are running the latest stable version, available at https://www.google.com/chrome/
How can I determine what version of PrizmDoc Viewer my server is running?
To check the server version, make a GET request to:
http://localhost:18681/PCCIS/V1/Service/Current/Info
You can make a GET request by navigating to the URL in your browser. The JSON response will have a "pccisVersion" property, which is the version number you are looking for. A similar GET request to the following URL returns the PAS version:
http://localhost:3000/info
The "version" property of the JSON response is the PAS version. Keep in mind that differing version numbers don’t necessarily indicate a mismatch, as long as the major and minor version numbers match. For example, PCCIS version 13.5.33.5696 and PAS version 13.5.0000.1816 are from the same release (13.5).
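The major/minor comparison can be sketched in a few lines of Python. The function name `same_release` is hypothetical. The two arguments are assumed to be the strings read from the "pccisVersion" and "version" properties of the two responses.

```python
def same_release(pccis_version, pas_version):
    """Return True when both versions share the same major.minor release."""
    def major_minor(version):
        # Compare numerically so that "05" and "5" count as equal.
        return tuple(int(part) for part in version.split(".")[:2])

    return major_minor(pccis_version) == major_minor(pas_version)


print(same_release("13.5.33.5696", "13.5.0000.1816"))  # True
print(same_release("13.5.33.5696", "13.6.0000.1816"))  # False
```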