After searching a document, an error icon appears in the search results panel. Clicking on it displays the following error message: “x page(s) cannot be searched.” Why does this occur and how can I find out which specific pages couldn’t be searched?
When the PrizmDoc Viewer text service cannot find any text on a given page of the document, the response from the searchTasks results includes an array listing every page without text.
In short, the document is fine and simply contains pages without text. If you look at the pagesWithoutText array contained within the response data from searchTasks, you’ll see something like this:
[0, 1, 7, 17, 43, 45, 65, 67, 77, 79,…]
The values reported are pages that do not contain any text but instead are either blank or contain an image. This data can then be used to inform the user of how many pages are not searchable.
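As a rough illustration of how that data could be turned into a message for the user, here is a minimal sketch. It assumes the results response has already been parsed into an object with a pagesWithoutText array; the function name describeUnsearchablePages is hypothetical and not part of PrizmDoc.

```javascript
// Hypothetical helper: build a user-facing notice from a parsed search
// results payload. Assumes the payload carries a `pagesWithoutText` array.
function describeUnsearchablePages(results) {
  const pages = Array.isArray(results.pagesWithoutText)
    ? results.pagesWithoutText
    : [];
  if (pages.length === 0) {
    return 'All pages were searched.';
  }
  // Page values are passed through exactly as the service reported them.
  return pages.length + ' page(s) cannot be searched: ' + pages.join(', ');
}
```

Calling it with the parsed response gives both the count shown in the error message and the specific pages behind it.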
In PrizmDoc, my document appears to be small on the page relative to the viewer. How can I fix this?
By default, PrizmDoc renders a PDF file according to its MediaBox. The MediaBox is normally the same as the CropBox, but not always, and when the two differ, the larger area you see in the PrizmDoc Viewer is the size of the MediaBox. The product provides the fileTypes.pdf.pageBoundaries control option (or useCropBox in older versions) to change the default behavior. Try setting the option to cropBox in the Central Configuration File so that the PDF content is rendered according to the CropBox. You can read more about configuring image frame rendering in our documentation.
For additional reading, see 7.7.3.3 on “User Space” of Adobe’s PDF 1.7 specification:
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
Note: In some older versions of PrizmDoc, setting the pageBoundaries field to cropBox can cause slight blurring or distortion on the page. This issue was addressed in version 13.4.
How can I get a document’s dimensions with PrizmDoc?
There are two methods you can use to do this with PrizmDoc:
The first method is to call requestPageAttributes() on ViewerControl, which returns the width and height of a page in the document in pixels. The sample code below uses requestPageAttributes() to get the attributes of page 1 of a document:
viewerControl.requestPageAttributes(1).then(function(attributes) {
    var pageWidth = attributes.width;
    var pageHeight = attributes.height;
});
The second method is to make a GET request to the PrizmDoc server for the metadata of a page of the source document in a viewing session. The request is:
GET /PCCIS/V1/Page/q/{{PageNumber}}/Attributes?DocumentID=u{{viewingSessionId}}&ContentType={{ContentType}}
The contentType must be set to “png” for raster content or “svgb” for SVG content. The request returns a JSON object containing the image’s width and height. The width and height are in pixels when contentType is set to “png” and in unspecified units when contentType is set to “svgb”.
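To make the request format concrete, here is a minimal sketch that assembles the path from its three placeholders. The function name buildPageAttributesPath is hypothetical, and the host the path is sent to depends on your deployment, so it is left out.

```javascript
// Hypothetical helper: fill in the placeholders of the page attributes
// request shown above. Only the two content types named in the text
// are accepted.
function buildPageAttributesPath(pageNumber, viewingSessionId, contentType) {
  if (contentType !== 'png' && contentType !== 'svgb') {
    throw new Error('contentType must be "png" or "svgb"');
  }
  return '/PCCIS/V1/Page/q/' + encodeURIComponent(pageNumber) +
    '/Attributes?DocumentID=u' + encodeURIComponent(viewingSessionId) +
    '&ContentType=' + contentType;
}
```

Rejecting other content types up front keeps a typo from being sent to the server as a malformed request.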
The request also returns the horizontal and vertical resolution of raster content when contentType is set to “png”. This value is similar to pixels per inch, but its units are unspecified. To calculate the size of the document, divide the width by the horizontal resolution or the height by the vertical resolution. The resolution is hard-coded to 90 when contentType is set to “svgb”.
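The division described above can be sketched as follows. The property names horizontalResolution and verticalResolution are assumptions about the shape of the JSON response, not confirmed field names, and pageSizeFromAttributes is a hypothetical helper. The result is in whatever unit the resolution implies, since the units are unspecified.

```javascript
// Hypothetical helper: derive a page size from the attributes response.
// ASSUMPTION: the parsed JSON exposes `width`, `height`,
// `horizontalResolution`, and `verticalResolution` under these names.
function pageSizeFromAttributes(attrs) {
  if (!(attrs.horizontalResolution > 0) || !(attrs.verticalResolution > 0)) {
    throw new Error('resolution must be a positive number');
  }
  return {
    width: attrs.width / attrs.horizontalResolution,
    height: attrs.height / attrs.verticalResolution
  };
}
```

For example, a width of 765 and a height of 990 at a resolution of 90 in both directions work out to 8.5 by 11.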