Technical FAQs

Question

Why do I get a “File Format Unrecognized” exception when trying to load a PDF document in ImageGear .NET?

Answer

To work with PDF documents, your project must include PDF support. Add a reference to ImageGear24.Formats.Pdf (if you’re using another version of ImageGear, add the reference that matches your version). Then add the following using directive alongside your other using directives:

using ImageGear.Formats.PDF;

Add the following lines of code before you begin working with PDFs:

ImGearFileFormats.Filters.Insert(0, ImGearPDF.CreatePDFFormat());
ImGearPDF.Initialize();

The ImageGear .NET documentation shows how to add PDF support to a project.

Question

My document has Asian characters (CJK, etc.), which are not displaying correctly in PrizmDoc Viewer; what steps can I take to view them?

Answer

In some cases, the cause is that the required fonts are not installed on the operating system. The commands below install fonts on select operating systems:

On CentOS 6:

yum groupinstall "Chinese Support"
yum groupinstall "Japanese Support"
yum groupinstall "Korean Support"
yum groupinstall "Kannada Support"
yum groupinstall "Hindi Support"

On CentOS 7:

yum groupinstall "fonts"

On Ubuntu:

sudo apt-get install language-pack-ja
sudo apt-get install 'japan*'
sudo apt-get install 'language-pack-zh*'
sudo apt-get install 'chinese*'
sudo apt-get install language-pack-ko
sudo apt-get install 'korean*'
sudo apt-get install fonts-arphic-ukai fonts-arphic-uming fonts-ipafont-mincho fonts-ipafont-gothic fonts-unfonts-core
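
After installing the packages, it can help to confirm that the system actually sees fonts for each language. The following is a minimal sketch, assuming fontconfig's fc-list utility is available; the function names and the FC_LIST override variable are illustrative and not part of PrizmDoc.

```shell
# Command used to query fontconfig. FC_LIST is an illustrative override
# (not a PrizmDoc or fontconfig setting); it defaults to the real fc-list.
FC_LIST=${FC_LIST:-fc-list}

# Succeeds when fontconfig reports at least one font family covering the
# given language tag (for example ja, ko, zh-cn, zh-tw).
has_font_for_lang() {
    [ -n "$("$FC_LIST" ":lang=$1" family 2>/dev/null)" ]
}

# Prints one status line per CJK language tag.
# Returns 0 if all are covered, 1 if any is missing, 2 if fc-list is absent.
check_cjk_fonts() {
    if ! command -v "$FC_LIST" >/dev/null 2>&1; then
        echo "$FC_LIST not found; install fontconfig first"
        return 2
    fi
    missing=0
    for lang in ja zh-cn zh-tw ko; do
        if has_font_for_lang "$lang"; then
            echo "$lang: ok"
        else
            echo "$lang: no font found"
            missing=1
        fi
    done
    return "$missing"
}
```

Run check_cjk_fonts after the install commands finish. Any language reported as "no font found" still needs a font package before documents in that language are likely to render correctly.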

Question

When should I apply image cleanup operations on my document images?

Answer

There are a number of cleanup operations that you can use to make an image more suitable for a particular application. The best guide is what you observe in the image and how those defects affect your project. For example, if you notice many random specks on an image that you plan to run through OCR, try a despeckle or blob removal operation first. If the content looks slanted, try a deskew or rotate operation. In some cases, a line removal operation on forms that have grid fields can also help. The amount of cleanup you need can vary from project to project, and no single cleanup operation works for all images. Instead, observe the nature of the noise and interference in your images to determine which general parameters provide the best results.

Native Excel Support

Despite the explosive growth of big data and sophisticated analytics platforms, a 2019 study by Deloitte found that 67 percent of business leaders are not quite comfortable using them to inform decision making. For many organizations, spreadsheets remain the preferred tool for managing data and evaluating trends. Developers looking to build the next generation of business applications can accommodate those tendencies by integrating native spreadsheet support for Microsoft Excel workbooks.

Excel Worksheets vs Excel Workbooks

Although sometimes referred to interchangeably or described broadly as spreadsheets, there is a key distinction between an Excel worksheet and an Excel workbook. A worksheet consists of only one spreadsheet while a workbook contains multiple different spreadsheets separated by tabs.

The difference may not be very important when viewing or sharing XLSX files natively in Microsoft Excel, but it can create serious challenges when rendering those files in another application. Without some way of accurately rendering dynamic spreadsheet data, viewers are often forced to resort to a static print preview image. This process makes the file viewable, but also leaves it “flattened” because all interactive elements are removed from the spreadsheet cells.

If the workbook contains worksheets with linked data (that is, cell data from one sheet is affected by cell data from another sheet), it’s critical that a viewing solution preserves the dynamic aspects of the file. The advantage of a spreadsheet is that it can serve as a working document. Without the ability to interact with it, users might as well simply copy and paste the data into a text document.

Managing Excel Workbooks with PrizmDoc Cells

PrizmDoc Cells provides several options for managing Excel workbooks, making it easy to transition back and forth between XLSX format and web browser viewing. Once a proxy route is set up within the application to send API calls to the PrizmDoc Cells server, three different commands can be used to manage Excel workbooks.

Upload Workbook

This API call adds a new XLSX file for viewing and editing. When a document is uploaded to the system, the server assigns a unique workbook ID to it so it can be found and rendered in the application’s viewer in the future. After uploading a workbook, a new session can be created using the workbook ID for viewing and editing purposes. 

Download Workbook

When PrizmDoc Cells displays a spreadsheet, it renders the XLSX file itself, but it doesn’t make any alterations to that file. As each session makes edits to the workbook, those changes are associated with the document ID rather than the original XLSX file, which preserves the integrity of the original spreadsheet. At some point, however, those edits may need to be saved into a new Excel workbook. 

The download API call converts the current session document so it can be downloaded as an XLSX file. File availability can be set during the download process to control who will have access to the new workbook.

Delete Workbook

Old versions of workbooks often need to be deleted for security reasons, usually because they contain confidential data. Since the original XLSX file remains safely within application storage, there often isn’t much sense in retaining workbook IDs that aren’t being used. The delete API call removes a workbook ID from the server. Once removed in this way, the workbook cannot be viewed, edited, or downloaded by PrizmDoc Cells.
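
The three calls described above could be wrapped in small shell helpers along these lines. This is only a sketch under stated assumptions: the base URL, the /workbooks route names, and the HTTP methods are hypothetical placeholders chosen for illustration, so check the PrizmDoc Cells API reference for the actual routes, headers, and request bodies before using anything like this.

```shell
# Base URL of the proxy route that forwards to the PrizmDoc Cells server.
# Hypothetical value; substitute the route your application exposes.
CELLS_BASE_URL=${CELLS_BASE_URL:-http://localhost:8888/api/v1}

# Build a workbook URL. With no argument this is the collection URL used for
# uploads; with a workbook ID it addresses a single workbook.
# The "/workbooks" path segment is an assumption, not a documented route.
workbook_url() {
    if [ -n "${1:-}" ]; then
        printf '%s/workbooks/%s\n' "$CELLS_BASE_URL" "$1"
    else
        printf '%s/workbooks\n' "$CELLS_BASE_URL"
    fi
}

# Upload an XLSX file; the server is expected to respond with a workbook ID.
upload_workbook() {
    curl -sS -X POST --data-binary "@$1" "$(workbook_url)"
}

# Download workbook $1 and save it to local file $2.
download_workbook() {
    curl -sS -o "$2" "$(workbook_url "$1")"
}

# Remove workbook ID $1 from the server.
delete_workbook() {
    curl -sS -X DELETE "$(workbook_url "$1")"
}
```

Keeping URL construction in one function means that only workbook_url needs to change once the real routes are known.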

Preserving Workbook Functionality

Since PrizmDoc Cells natively renders information contained in an XLSX file, it retains the dynamic elements that make spreadsheet workbooks so useful to organizations. Not only does it preserve proprietary business logic and formulas, but it also maintains the integrity of this information across multiple worksheets. Cell content can still be searched to quickly locate important text or data throughout the workbook.

For situations where proprietary formulas need to be protected, PrizmDoc Cells allows users to upload XLSX workbooks as values-only files, with all spreadsheet formulas removed. Also, any cells locked in an uploaded XLSX file will remain locked in PrizmDoc Cells to preserve workbook security.

True Spreadsheet Workbook Support for Your Applications

Many organizations continue to depend upon spreadsheet workbooks to manage their business. By providing feature-rich workbook support within their applications, developers can help them retain control over their proprietary spreadsheet formulas without sacrificing the functionality they expect from Excel. 

PrizmDoc Cells makes it easier than ever to share spreadsheet workbooks without having to rely upon Microsoft Excel dependencies. Shared XLSX files can remain safely within a secure application environment to prevent unauthorized downloads or troublesome version confusion. Get a first-hand look at how PrizmDoc Cells can enhance your application in our extensive online demo.

Question

In PrizmDoc, why do I fail to load/convert Excel documents with the error “Exception from HRESULT: 0x800AC472”?

Answer

The error message Exception from HRESULT: 0x800AC472 appears in MsOfficeConverter.log and usually indicates a failure involving an Excel document. Known triggers include the following:

If the service runs as “SYSTEM”, “LocalSystem”, or any other non-user-account variant, PrizmDoc fails when using MSO services. This is expected behavior when working with Microsoft Office documents in PrizmDoc. See step 6 of the Windows Installation documentation:

http://help.accusoft.com/PrizmDoc/latest/HTML/webframe.html#windows-installation.html

“Specify the login account (account name and password) that PrizmDoc Server will run under. If you are using the Microsoft Office (MSO) Conversion add-on, please make sure that the “login account” is a real user account with Administrator rights. Running PrizmDoc under the LocalSystem user or another Microsoft Windows integrated service account is not supported for this option.”

It’s also crucial that the copy of Microsoft Office on the system has been activated. An unlicensed, unactivated, expired, or trial copy of Microsoft Office will not work with PrizmDoc.

More information: https://help.accusoft.com/PrizmDoc/latest/HTML/windows-requirements.html

“The installed copy of Microsoft Office must be activated in order for PrizmDoc’s Microsoft Office Conversion Service to work properly. Not licensed, not activated, an expired or trial version of Microsoft Office will not work with PrizmDoc.”

Your default printer must be the Microsoft XPS Document Writer when working with Excel documents in PrizmDoc. Setting another printer as the default can lead to this exception.

More information: http://help.accusoft.com/PrizmDoc/latest/HTML/natively-render-mso-documents.html

“The Microsoft Office Conversion Service requires the Microsoft XPS Document Writer printer driver to be installed for the best conversion performance and rendering fidelity of MS Excel documents”

Ensure the Print Spooler service is started and the Microsoft XPS Document Writer is the default printer.

There is a known issue with version 13.3 of PrizmDoc where completely blank Excel files are not loadable in the Viewer. They will fail to load and throw the aforementioned HRESULT exception. This has been fixed in PrizmDoc version 13.6.

In short, please set up the PrizmDoc service correctly to run with a real user account, ensure the copy of Microsoft Office has been activated, and make sure the default printer is set to “Microsoft XPS Document Writer”, then restart the service. This should fix this particular issue in most cases.


For more reading on considerations that Microsoft recommends when running their client-side MSO applications on the server, see this article:

Considerations for server-side Automation of Office

Embedding PDFs in HTML

As digital processes become more commonplace, it’s more important than ever for organizations to have the tools in place to manage electronic documents effectively. The evolution of PDF viewing technology continues to provide new levels of flexibility for software applications. Now that HTML5 is capable of rendering PDF data within a conventional browser, developers are looking for new ways to make the viewing experience even more seamless. By embedding PDFs in HTML, they can continue to streamline document viewing and reduce the need for external software.

Why Embed a PDF in HTML?

Sharing a PDF online is far easier to do today than it was just a decade ago. For many years, the two most commonly used options were providing a link to download the file directly from a server or sending it as an attachment in an email. Once the file was downloaded, it could be opened and viewed with PDF reader software installed on a computer. This, of course, introduced numerous security risks that are associated with downloadable files and email attachments.

The widespread adoption of cloud storage has made it very convenient to share a PDF file and even manage who has access to it. And since most modern browsers can view PDFs without needing to download the file, providing a link is typically all that’s necessary to pass the file along.

While this solution is usually sufficient for the personal needs of an individual user, it’s not a practical option for even a small-scale business when it comes to public-facing document management. Organizations want to retain control over their files with respect to how they’re accessed and displayed. By embedding PDFs in HTML, they can keep their documents within their secure application environment where they have full control over how they’re managed, shared, and viewed. For developers looking to provide a seamless user experience, building options for embedded PDFs into their software is critically important.

The Value of an Integrated PDF Viewer

Since most modern browsers can utilize HTML5 to render PDF files, developers could lean on those capabilities without building a dedicated PDF viewer for their application. That decision will very quickly lead to some unpleasant complications, however. In the first place, they are leaving a lot to chance in terms of the viewing experience. Not every browser renders PDF files the same way, so it’s very possible that two different users could have two very different experiences when viewing a document. In some cases, that could mean nothing more than a missing font that’s replaced with an alternative. But in other cases, it could mean that the document doesn’t open at all or is missing important graphical elements.

This approach also forces users to make do with whatever PDF functionality is incorporated into their browser’s viewer. In most cases, that will mean subpar search performance, a lack of responsive mobile controls, and no annotation features. The browser may also have trouble with some of the less common PDF specifications, making it impossible for some users to even view a document.

By embedding a JavaScript-based PDF viewer into their application, developers can ensure that documents will display the correct way every time. Since the viewing is handled through a viewer embedded into the web application by default, it will be the same no matter what kind of browser or operating system is being used. A customizable viewer also allows developers to adjust the interface to permit or hide certain features, such as downloading or markup tools.

The open-source PDF.js library is a popular choice for many web applications, but it comes with a number of well-documented shortcomings. In addition to lacking key features like annotation, it also doesn’t support the entire PDF standard and does not provide a responsive UI for mobile devices. For developers looking to add more robust features, working with PDF.js often entails quite a bit of additional coding and engineering to build those capabilities from the ground up.

Embed PDFs in HTML with Accusoft PDF Viewer

Accusoft PDF Viewer takes the foundation of PDF.js and provides robust enhancements to meet the viewing needs of today’s applications. In addition to incredibly fast text search, expanded PDF standard support, and optimization for high-resolution displays, this lightweight SDK is also equipped with a responsive UI that adapts automatically to mobile screens. Developers can integrate essential mobile features like pinch to zoom quickly and easily, with no additional integrations or engineering required.

With no external dependencies or complicated server configurations, Accusoft PDF Viewer integrates into a web-based application with fewer than 10 lines of code. Once the viewer is in place, developers can embed PDFs in HTML and easily render them to provide a state-of-the-art PDF viewing experience regardless of the browser or device users have at their disposal. And since the UI can be customized to your application’s needs, there’s no reason to sacrifice control for the sake of viewing convenience.

Accusoft PDF Viewer is a JavaScript SDK that you can incorporate into your application environment quickly and easily to provide much greater viewing control and functionality than is possible with a standard browser viewer or base PDF.js library. If you’re planning to embed PDFs in HTML as part of your software solution, taking just a few moments to integrate versatile and responsive viewing tools can ensure a high-quality viewing experience. Download Accusoft PDF Viewer Standard Version today at no cost to see how easily it can transform your application’s HTML5 viewing potential.

For additional features like annotation, eSignature, and UI customization, contact one of our solutions experts to upgrade to Professional Version.

Question

I installed PrizmDoc on a clean CentOS 7/RedHat 7 system by following the documentation. The Prizm services start and report as healthy. However, one of two issues occurs:

  1. I cannot view HTML or picture files but can view PDF files.
  2. I cannot view PDF, Excel, or Word documents but can view HTML and picture files.

Answer

If you cannot view HTML or picture files but can view PDF files, it is often due to specific required libraries not being installed. The following procedure can be executed on CentOS/RedHat 7 to ensure all required PrizmDoc libraries are installed.

  1. Stop the Prizm service: sudo /usr/share/prizm/scripts/pccis.sh stop

  2. Run the following command in a terminal to install the required libraries, and wait for it to finish:

    yum install -y libbz2* libc* libcairo* libcups* libdbus-glib-1* libdl* libexpat* libfontconfig* libfreetype* libgcc_s* libgif* libGL* libjpeg* libm* libnsl* libopenjpeg* libpixman-1* libpng12* libpthread* librt* libstdc++* libthread_db* libungif* libuuid* libX11* libXau* libxcb* libXdmcp* libXext* libXi* libXinerama* libxml2* libXrender* libXtst* libz* linux-vdso*
    
  3. Restart the server.
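
If the problem persists after the steps above, one way to see which shared libraries are still unresolved is to run ldd against the affected binary. The following is a minimal sketch, assuming a glibc-based system where ldd marks unresolved dependencies with "not found"; the function names are illustrative, and the path to the binary is something you need to supply from your own installation.

```shell
# Read `ldd` output on stdin and print the name of each library that the
# dynamic linker reported as "not found".
parse_missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# List unresolved shared libraries for one binary.
# Prints nothing when every dependency resolves.
missing_libs() {
    ldd "$1" 2>/dev/null | parse_missing_libs
}
```

For example, missing_libs /path/to/binary prints one library name per line. Each name printed points to a package that still needs to be installed.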

If you cannot view PDF, Excel, or Word documents but can view HTML and picture files, this is often due to installing with the generic PrizmDoc installer, whose file name ends in either client_x86_64.tar.gz or server_x86_64.tar.gz. To resolve this issue, reinstall using the packages whose names end in client_x86_64.rpm.tar.gz and server_RHEL7.tar.gz.
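
To check which kind of archive you downloaded, you can match its file name against the suffixes named above. This is an illustrative helper only; the function name is hypothetical, and the RHEL-specific patterns are tested first because they are the more specific match.

```shell
# Classify a PrizmDoc installer archive by its file-name suffix.
# Prints "rhel", "generic", or "unknown".
installer_kind() {
    case "$1" in
        *client_x86_64.rpm.tar.gz|*server_RHEL7.tar.gz) echo "rhel" ;;
        *client_x86_64.tar.gz|*server_x86_64.tar.gz)    echo "generic" ;;
        *)                                              echo "unknown" ;;
    esac
}
```

If the helper prints "generic" for either archive, download the RHEL-specific package instead and reinstall.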

Understanding the Value of Third-Party Software Integrations
 

Today’s customers expect more of software applications than ever before. Piecemeal solutions that provide only a few noteworthy features are quickly being overtaken by more comprehensive platforms that deliver an end-to-end experience for users. This has prompted developers to incorporate more capabilities, while also building innovative features that set their solutions apart from the competition. Thanks to third-party software integrations, they’re able to meet both demands.

What is Third-Party Software Integration?

Third-party software integrations typically come in the form of SDKs or APIs that provide applications with specialized capabilities. Rather than building complex features like optical character recognition (OCR), PDF features, or image cleanup from scratch, developers can instead incorporate the necessary features directly into their software via an SDK or use an API call to access capabilities without expanding their application’s footprint.

From a user experience standpoint, third-party software integrations allow developers to build more cohesive software solutions that provide all the essential features a customer may require. Instead of pushing them into a separate application to interact with documents, provide a signature, or fill out a digital form, they can instead deliver an unbroken experience that’s easier to navigate and manage from start to finish.  

4 Key Third-Party Software Benefits

There are a number of important benefits organizations can gain from using third-party software integrations, but four stand out in particular:

1. Reduce Development Costs

When evaluating whether it makes sense to build functionality for an application in-house or buy a third-party software integration, cost is frequently one of the key considerations. There is often a tendency to think that it would be more cost-effective to have developers already working on the project simply build the capabilities they need on their own. After all, there’s no shortage of open-source SDKs and other tools that are available without having to pay licensing or product fees.

In practice, however, this approach usually ends up being more expensive in the long run. That’s because the developers working on the project often lack the experience needed to build those capabilities quickly. A software engineer hired to help build AI software, for instance, probably doesn’t know a lot about file conversion or annotation. While they might be able to find an open-source tool to build those features, they still need to do quite a bit of development work and on-the-job learning to get the new capabilities stood up and thoroughly tested. 

Focusing on these features means they’re not focusing on the more innovative aspects of their application. From a cost standpoint, that means they’re being paid to build something that’s already readily available in the market. When these internal development costs are taken into account, it’s almost always more cost effective to buy ready-to-implement software features built by an experienced third party. As the saying goes, there’s no reason to reinvent the wheel. 

2. Get to Market Faster

Software developers are always working against the clock. With new applications hitting the market faster than ever, there’s tremendous pressure to keep development timelines on track and avoid missing important deadlines. This helps projects stay within their expected budgets and prevents potential competitors from getting to market faster. Any steps that can be taken to accelerate development and potentially shorten the timeline to releasing a product could mean the difference between becoming an industry innovator or being labeled as an also-ran.

Third-party software integrations allow developers to quickly and seamlessly integrate essential capabilities into applications without compromising their project timeline. Rather than building features like forms processing, document annotation, and image conversion from scratch, teams can instead use third-party SDKs and APIs to add proven, reliable, and secure features in a fraction of the time. By keeping projects on or ahead of schedule, they can focus on delivering a better, more robust product that exceeds customer expectations. 

3. Expand Application Features & Functionality

Software development teams typically possess the experience and expertise needed to build the core architecture and innovative features of a new application. In many cases, they’re designing something novel that will provide a point of differentiation in the market. The more time they can spend on refining and expanding those capabilities, the more likely the application is to make an impact and win over customers.

What these developers often lack, however, are the skills needed to implement a variety of other features that will enhance the application’s functionality. Features like document conversion, OCR, PDF support, digital forms, eSignature, and image compression are complex and difficult to build from scratch. By integrating third-party software, developers can leverage proven, feature-rich technology to expand their application’s capabilities. This not only allows them to improve their solution’s versatility but also enhance the overall user experience by eliminating the need for external programs or troublesome plug-ins. 

4. Access Specialized Engineering Support

Incorporating features like PDF support, image conversion, and document redaction into an application poses several challenges. Some of those challenges don’t show up right away; instead, they become evident long after a software product launches. If the developers don’t have a lot of experience with the technology behind those features, minor issues can quickly escalate into serious problems that leave customers unhappy and willing to look elsewhere for alternatives. No organization wants to be caught in a situation where a bug embedded in an open-source tool renders a client’s valuable assets unusable.

By leveraging proven, tested, and secure third-party software integrations, developers gain access to support from experienced engineering teams with deep knowledge of their solutions. In addition to documentation and code samples, they can also speak directly with developers who can provide guidance on how to best integrate features and resolve issues when they emerge. The best integration providers will even work with organizations to customize their solutions to meet specific application needs, which helps create even smoother user experiences and enhances reliability.

Integrating Third-Party Software with Accusoft

For over 30 years, Accusoft has helped organizations add essential features like barcode recognition, file conversion, document assembly, and image compression to their applications through an innovative line of SDKs and APIs. Our document lifecycle technologies are backed by multiple patents and have been incorporated successfully into a wide range of applications. Our dedicated engineers provide ongoing support and work closely with customers to implement their specific use cases, ensuring that their software platform is delivering the best possible experience.

To learn more about integrating third-party software with Accusoft SDKs and APIs, talk to one of our solutions experts today.

 

Curious as to how to use PrizmDoc with Node.JS and HTML? You’ve found the right video! Watch as a Technical Support Rep takes you through the PrizmDoc Node.JS and HTML GitHub sample.

For additional information, please visit PrizmDoc!  To learn more about Accusoft, please visit www.Accusoft.com.