Technical FAQs

When it comes to downloading or viewing documents over the internet, PDFs have long served as a de facto standard for most organizations. Since PDF is not a proprietary file format, there’s rarely any risk that someone will be unable to open one. However, just because PDFs have become so commonplace doesn’t mean that they all share the same characteristics. For anyone who has ever wondered why some PDFs seem to take so much longer to load than others, the answer often has less to do with connection and processing speeds than it does with the way the PDF’s content is organized.

More specifically, it’s a matter of whether or not the document is a linearized PDF.

What Is a Linearized PDF?

Sometimes called “fast web view,” linearization is a special way of saving a PDF file that organizes its internal components to make them easier to read when the file is streamed over a network connection. While a standard, non-linearized PDF may scatter the information associated with each page across the entire file, a linearized PDF uses an object tree format to consolidate page elements on an ordered, page-by-page basis. When a reader opens a linearized PDF, then, all of the information needed to render the first page is readily available, allowing it to load the page quickly without having to search the entire document for a specific object like an embedded font.
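As a rough illustration of how a tool might recognize such a file, the sketch below looks for the /Linearized entry that the PDF specification places in a parameter dictionary within the first 1024 bytes of a linearized file. The function name and the sample byte strings are made up for this example, and a simple pattern match is only a heuristic, not a full validity check.

```python
import re

# The PDF specification requires the linearization parameter dictionary
# to sit entirely within the first 1024 bytes of the file.
HEADER_WINDOW = 1024
LINEARIZED_KEY = re.compile(rb"/Linearized\s+\d")


def looks_linearized(data: bytes) -> bool:
    """Return True if the leading bytes contain a /Linearized entry."""
    return LINEARIZED_KEY.search(data[:HEADER_WINDOW]) is not None


# Hand-written sample headers, not taken from real documents.
sample_linearized = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Linearized 1 /L 5000 /H [ 500 120 ] /O 3 /E 2048 /N 10 /T 4800 >>\n"
    b"endobj\n"
)
sample_plain = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_linearized(sample_linearized))  # True
print(looks_linearized(sample_plain))       # False
```

A real file would be checked by passing in its first kilobyte, for example the result of reading 1024 bytes from a file opened in binary mode.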

Originally introduced with the PDF 1.2 standard in 1996, linearized PDFs were critical to the format’s early internet success. In order to view a non-linearized PDF, the entire document needs to be downloaded or read via HTTP request-response transactions. Given the bandwidth limitations of early internet connections (often still between 28.8 and 33.6 kbps in 1996), this created a serious bottleneck when it came to document viewing. While it was possible to view a document without downloading it, the multiple HTTP requests needed to do so could easily be disrupted if the connection was lost, something that was all too common in the days before reliable broadband connections were introduced.

Non-Linearized vs Linearized PDFs

To visualize the difference between a non-linearized PDF and a linearized PDF, imagine two separate people sitting down to file their business taxes. One person has all of their receipts, invoices, and financial documents scattered across their office, with some stacked in unordered piles, others crammed into unlabeled folders, and even more stuffed into assorted drawers and file cabinets. Finding and organizing all of this documentation would take almost as much time as actually filing the taxes themselves! The second person, however, has all of the records they need stored in a neatly labeled file cabinet, allowing them to retrieve everything quickly and easily.

The first person’s office is like a non-linearized PDF, while the second person’s file cabinet is like a linearized one, in which a reader can easily find the information it needs to render the file. Even better, since each page is organized in the same way, jumping to a different page in a multi-page PDF doesn’t require the reader to reload the entire file. It can simply read the page being viewed and get everything necessary to display it correctly.

Why Linearized PDFs Are Still Valuable

In a world dominated by high speed internet connections, it’s fair to wonder whether or not PDF linearization is still necessary. For small PDFs that are only a few pages, linearization may not be essential, but when it comes to larger documents, linearization can still deliver substantial performance and user experience benefits.

Consider, for instance, a document that consists of several hundred, or even several thousand, pages. Loading that entire document and keeping it cached may be possible, but it’s an inefficient use of processing and bandwidth resources. With a linearized PDF, a reader typically encounters a linearization directory and hint tables at the top of the document, which provide it with instructions on where to locate any necessary resources within the file. After loading the hint tables and the first page, the reader stops the download process rather than opening the entire file. When the user navigates to another page, the reader can quickly reference the hint tables and jump to that page.
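To make the idea of locating resources by offset more concrete, here is a minimal sketch that reads the integer entries of a linearization parameter dictionary and turns two of them into HTTP Range header values. The key names (/L, /H, /O, /E, /N, /T) come from the PDF specification; the function names, the regex-based parsing, and the sample header are assumptions for illustration, and a real reader would use a proper PDF parser.

```python
import re
from typing import Dict, List, Union

Params = Dict[str, Union[int, List[int]]]


def parse_linearization_params(header: bytes) -> Params:
    """Pull the integer entries out of a linearization parameter dictionary."""
    params: Params = {}
    # /L file length, /O first page object number, /E end of first page,
    # /N page count, /T offset of the main cross-reference table.
    for key in ("L", "O", "E", "N", "T"):
        match = re.search(rb"/" + key.encode() + rb"\s+(\d+)", header)
        if match:
            params[key] = int(match.group(1))
    # /H holds the offset and length of the primary hint stream.
    hint = re.search(rb"/H\s*\[([\d\s]+)\]", header)
    if hint:
        params["H"] = [int(n) for n in hint.group(1).split()]
    return params


def byte_range(offset: int, length: int) -> str:
    """Format an HTTP Range header value; HTTP byte ranges are inclusive."""
    return f"bytes={offset}-{offset + length - 1}"


# Hand-written sample header, not taken from a real document.
header = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Linearized 1 /L 5000 /H [ 500 120 ] /O 3 /E 2048 /N 10 /T 4800 >>\n"
    b"endobj\n"
)
params = parse_linearization_params(header)
first_page_range = byte_range(0, params["E"])
hint_offset, hint_length = params["H"][:2]
hint_range = byte_range(hint_offset, hint_length)

print(first_page_range)  # bytes=0-2047
print(hint_range)        # bytes=500-619
```

In this made-up header, a reader would need only the first 2048 bytes of a 5000-byte file to show page one.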

This ensures that the reader is only ever loading the pages that actually need to be displayed, which helps to conserve memory, processing resources, and bandwidth. For mobile devices with limited file and cache storage, linearized PDFs are much easier to manage than their non-linearized counterparts. They also provide some protection against network interruptions, which could make it difficult to download and view an entire document.

How to Linearize PDFs

Although the linearization process is well laid out in the current PDF standards documentation, many PDFs are created using software that doesn’t automatically linearize the content. More importantly, some linearized PDFs are “broken” by a process called incremental saving, which saves minor updates at the end of the file, rather than changing existing structure. Over time, too much incremental saving can undermine the effectiveness of a linearized PDF.
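One way to spot a file whose linearization has been broken by appended updates is to compare the /L entry in the linearization parameter dictionary, which the PDF specification defines as the exact length of the file, with the actual size. The sketch below assumes that approach; the helper names and the synthetic sample data are invented for this example.

```python
import re
from typing import Optional


def declared_length(data: bytes) -> Optional[int]:
    """Read the /L (file length) entry from the first 1024 bytes, if any."""
    match = re.search(rb"/L\s+(\d+)", data[:1024])
    return int(match.group(1)) if match else None


def linearization_intact(data: bytes) -> bool:
    """True when the file is marked linearized and /L equals the real size."""
    if re.search(rb"/Linearized\s+\d", data[:1024]) is None:
        return False
    return declared_length(data) == len(data)


def build_sample(body: bytes) -> bytes:
    """Build a tiny stand-in file whose /L entry matches its own length."""
    head = "%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L {:010d} >>\nendobj\n"
    total = len(head.format(0).encode()) + len(body)
    return head.format(total).encode() + body


intact = build_sample(b"(page data)\n%%EOF\n")
# Simulate an incremental save: new objects appended after the original end.
updated = intact + b"9 0 obj\n(edited)\nendobj\n%%EOF\n"

print(linearization_intact(intact))   # True
print(linearization_intact(updated))  # False
```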

The best way to resolve such problems and linearize the PDF is to save a new, linearized version of the file using PDF editing and conversion tools.
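As one possible way to script that step, the sketch below shells out to qpdf, an open-source command-line tool that this article does not mention, using its --linearize option. The wrapper function names and file names are placeholders, and qpdf must be installed separately for the second function to work.

```python
import shutil
import subprocess
from typing import List


def linearize_command(src: str, dst: str) -> List[str]:
    """Build the qpdf invocation that writes a linearized copy of src to dst."""
    return ["qpdf", "--linearize", src, dst]


def linearize(src: str, dst: str) -> None:
    """Run qpdf; raises if qpdf is missing or exits with a non-zero status."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf is not installed or not on PATH")
    # Note: qpdf uses exit status 3 for "succeeded with warnings",
    # which check=True also reports as an error.
    subprocess.run(linearize_command(src, dst), check=True)


print(linearize_command("report.pdf", "report-linearized.pdf"))
```

Writing to a new output file, rather than overwriting the input, keeps the original available if the result needs to be compared or discarded.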

Take Control of PDFs with PrizmDoc

Accusoft’s PrizmDoc provides a broad range of document functionality that allows applications to more effectively create, convert, and compress PDF files.

For a closer look at PrizmDoc and to see its powerful document processing capabilities in action, download a free trial today.

October 11, 2023 – Tampa, FL – Accusoft is pleased to announce the newest additions to PrizmDoc’s industry-leading document processing capabilities: video playback and an advanced optical character recognition (OCR) API integration. These new additions allow PrizmDoc to provide even more support to developers looking to add essential features to their applications.

PrizmDoc’s new video playback feature makes it easy for clients to natively embed videos into their software without having to rely on external hosting platforms or third-party plug-ins. The feature not only enhances security, but also delivers a seamless user experience that today’s customers expect from their applications. 

With PrizmDoc’s new advanced OCR API, web developers can now access Accusoft’s optical character recognition technology, which was previously only accessible via an SDK. The features included in the new OCR API enable full-page and zonal recognition for document and forms processing, as well as support for location and confidence information for each character. With a simple API call, PrizmDoc can extract searchable text from any supported raster file. The new OCR API add-on option for PrizmDoc also offers support for 60+ languages plus an option for Asian languages.

“Today’s applications need more than the ability to view and manage documents,” says Jack Berlin, CEO of Accusoft. “By enabling video playback and allowing developers to tap into our proven OCR technology with a simple API call, we’re making it easier for PrizmDoc customers to deliver an all-in-one solution for their customers that provides a better overall user experience.”

To learn more about PrizmDoc or to download a free trial and experience the new video playback and OCR API features first-hand, visit our website.

About Accusoft

Founded in 1991, Accusoft is a software development company specializing in document processing, conversion, and automation solutions. From out-of-the-box and configurable applications to APIs built for developers, Accusoft software enables users to solve their most complex workflow challenges and gain insights from content in any format, on any device. Backed by 40 patents, the company’s flagship products, including OnTask, PrizmDoc, and ImageGear, are designed to improve productivity, provide actionable data, and deliver results that matter. The Accusoft team is dedicated to continuous innovation through customer-centric product development, new version release, and a passion for understanding industry trends that drive consumer demand. Visit us at www.accusoft.com.

Question

In PrizmDoc, why do I fail to load/convert Excel documents with the error “Exception from HRESULT: 0x800AC472”?

Answer

The error message Exception from HRESULT: 0x800AC472 is usually associated with a failure involving an Excel document, found in the MsOfficeConverter.log. Below are some known triggers of it:

If the user is logged in as "SYSTEM", "LocalSystem", or any other non-user-account variant, this will cause PrizmDoc to fail when using MSO services. This is expected behavior when working with Microsoft Office documents in PrizmDoc. Please see step 6 of the Windows Installation documentation regarding this:

http://help.accusoft.com/PrizmDoc/latest/HTML/webframe.html#windows-installation.html

"Specify the login account (account name and password) that PrizmDoc Server will run under. If you are using the Microsoft Office (MSO) Conversion add-on, please make sure that the "login account" is a real user account with Administrator rights. Running PrizmDoc under the LocalSystem user or another Microsoft Windows integrated service account is not supported for this option."

It’s also crucial that the copy of Microsoft Office on the system has been activated. A not-licensed, not-activated, expired, or trial license will all cause Microsoft Office to not work with PrizmDoc.

More information: https://help.accusoft.com/PrizmDoc/latest/HTML/windows-requirements.html

"The installed copy of Microsoft Office must be activated in order for PrizmDoc’s Microsoft Office Conversion Service to work properly. Not licensed, not activated, an expired or trial version of Microsoft Office will not work with PrizmDoc."

Your default printer must be the Microsoft XPS Document Writer when working with Excel documents in PrizmDoc. Specifying another printer could possibly lead to this exception.

More information: http://help.accusoft.com/PrizmDoc/latest/HTML/natively-render-mso-documents.html

"The Microsoft Office Conversion Service requires the Microsoft XPS Document Writer printer driver to be installed for the best conversion performance and rendering fidelity of MS Excel documents"

Ensure the Print Spooler service is started and the Microsoft XPS Document Writer is the default printer.

There is a known issue with version 13.3 of PrizmDoc where completely blank Excel files are not loadable in the Viewer. They will fail to load and throw the aforementioned HRESULT exception. This has been fixed in PrizmDoc version 13.6.

In short, please set up the PrizmDoc service correctly to run with a real user account, ensure the copy of Microsoft Office has been activated, and make sure the default printer is set to "Microsoft XPS Document Writer", then restart the service. This should fix this particular issue in most cases.
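Two of these checks lend themselves to a quick script: confirming that the HRESULT actually appears in the log, and flagging a service login that is a built-in Windows identity rather than a real user account. The sketch below is only an illustration. The function names are invented, the sample log text is placeholder content rather than real PrizmDoc output, and the account list covers the standard built-in Windows service identities without claiming to be exhaustive.

```python
from typing import List

HRESULT = "0x800AC472"

# Built-in Windows service identities (not real user accounts).
BUILT_IN_ACCOUNTS = {
    "system",
    "localsystem",
    "localservice",
    "networkservice",
    "nt authority\\system",
    "nt authority\\localservice",
    "nt authority\\networkservice",
}


def hresult_lines(log_text: str) -> List[str]:
    """Return every log line that mentions the HRESULT, ignoring case."""
    needle = HRESULT.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


def is_built_in_account(account: str) -> bool:
    """True if the service login name is a built-in Windows identity."""
    return account.strip().lower() in BUILT_IN_ACCOUNTS


# Placeholder text only; this is not the real log format.
sample_log = (
    "first placeholder line\n"
    "placeholder: Exception from HRESULT: 0x800AC472\n"
    "last placeholder line\n"
)

print(len(hresult_lines(sample_log)))             # 1
print(is_built_in_account("LocalSystem"))         # True
print(is_built_in_account("CORP\\prizmdoc-svc"))  # False
```

In practice the log text would come from reading MsOfficeConverter.log, and the account name from the service's configured login.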


For more reading on considerations that Microsoft recommends when running their client-side MSO applications on the server, see this article:

Considerations for server-side Automation of Office

Question

If you have a copy of ImagXpress, there are cases where calling certain functions will trigger the following message:

"This function is available in another edition of ImagXpress v13.00 control"

What could cause this error?

Answer

This error can occur if ImagXpress Standard Edition is licensed on a system, but you’re trying to call operations that are only available in ImagXpress Professional Edition. So, you will need to ensure you’re using the proper license.

This documentation page specifies the functions that are supported for each edition.

This can also occur if you own both Barcode Xpress and ImagXpress Professional. Barcode Xpress includes ImagXpress Standard Edition, so if you install the ImagXpress Professional license first, and then install Barcode Xpress, the included Standard license will overwrite the Professional license. The resolution in this case is to re-install the ImagXpress license to overwrite Standard with Professional.

 

TAMPA, FL, UNITED STATES, August 19, 2021 — The Tampa Bay Software CEO Council, founded by Tampa Bay Tech, selected Tampa Bay Area non-profit Computer Mentors as the recipient of the group’s annual fundraising efforts. This month, they presented Computer Mentors founder and Executive Director Ralph Smith with a check for more than $10,000.

“Every year we look for a local charity to connect with and the work Computer Mentors is doing to promote tech with area kids completely aligns with our mission,” said Jack Berlin, CEO of Accusoft. “The work they’re doing to empower kids to pursue careers in tech is instrumental to the future of Tampa Bay as a growing tech hub.”

Computer Mentors works to build opportunity through expertise for the underserved youth of the community. By establishing and building upon a fundamental skillset covering programming, entrepreneurism, public speaking, and more, Computer Mentors gives its students the tools and talent they need to become savvy, self-starting achievers in a tech-centric world.

The Software CEO Council comprises the area’s premier businesses, executives, and entrepreneurs of Tampa Bay’s technology community. Its mission is to create the largest communal ecosystem for tech startups in the state of Florida and put Tampa Bay on the map as a beacon for innovation and success, to foster talent and fuel growth. Council companies include A-LIGN, Accusoft, AgileThought, Bond-Pro, CrossBorder Solutions, Digital Hands, Geographic Solutions, Haneke Design, MercuryWorks, Sourcetoad, Spirion and SunView Software.

https://www.tampasoftwareceos.com/

“Tampa Bay Tech’s Software CEO Council represents several of our area’s most innovative, growing companies, and we are honored to be the recipient of their generous gift to our kids,” said Smith. “Donations like this help fund much-needed programs to help level the playing field for our kids and develop the next generation of talent right here in Tampa Bay.”

About Tampa Bay Tech:

Tampa Bay Tech is a 501(c)(6) non-profit technology council that has been engaging and uniting the local technology community for 20 years. Through its membership and partnerships, its mission is to build a radically connected, flourishing tech hub where opportunity is abundant for all. With over 125 companies representing thousands of tech employees – as well as thousands of students within the area’s colleges and universities – Tampa Bay Tech provides programming and initiatives to connect the community, provide development opportunities, and support Tampa Bay’s growing workforce.

Jill St Thomas
Tampa Bay Tech
jill@tampabay.tech

About Accusoft: 

Founded in 1991, Accusoft is a software development company specializing in content processing, conversion, and automation solutions. From out-of-the-box and configurable applications to APIs built for developers, Accusoft software enables users to solve their most complex workflow challenges and gain insights from content in any format, on any device. Backed by 40 patents, the company’s flagship products, including OnTask, PrizmDoc™ Viewer, and ImageGear, are designed to improve productivity, provide actionable data, and deliver results that matter. The Accusoft team is dedicated to continuous innovation through customer-centric product development, new version release, and a passion for understanding industry trends that drive consumer demand. Visit us at www.accusoft.com.

Why Your Application Needs a Built-in PDF Reader

Managing and viewing documents is critical to providing a quality user experience in today’s applications. Without some way of controlling the presentation of digital files like PDFs, organizations put themselves in a situation where they must rely on external solutions that may not be responsive to their needs. PDF integration into their applications helps developers to maintain control over their documents while providing a more consistent viewing experience for users.

What Are Your PDF Reader Options?

Sharing and viewing PDFs online has become much easier with the development of HTML5 viewing technology and PDF.js-based software. For many years, the only way to view a PDF was to download a file and open it using a dedicated PDF reader application. Although many of these readers could be added to a web browser using a plug-in, this wasn’t always a reliable solution and inconsistent support for these extensions often created security risks.

After Mozilla introduced the PDF.js open-source library in 2011, integrated PDF viewing quickly became an essential feature for web browsers. Most users now simply take PDF viewing for granted, trusting that their browser will be able to open and read any file. For some organizations, relying on a browser PDF reader is a perfectly reasonable solution, especially if they don’t have any concerns over controlling the document viewing experience.

But for many developers building web applications, these browsers and external PDF readers put them at the mercy of third-party providers. Changes or security problems with these solutions can leave development teams scrambling to implement workarounds that could have been avoided if they had their own dedicated viewing solution. That’s why applications increasingly feature a built-in PDF reader that allows them to better manage and present important digital documents.

Why Your Application Needs a Built-in PDF Reader

The core problem with relying on an external viewing solution comes down to control. In order to view a PDF in a dedicated reader, the file needs to be downloaded. Once that document is removed from a secure application, it could easily be distributed or altered without any authorization or oversight. This often results in serious version confusion that leaves everyone wondering which version of a PDF is the most up-to-date. By keeping documents within a controlled application, developers can ensure that the files viewed there are current.

Relying on external PDF viewers can also create an inconsistent user experience. Since not all viewers render documents in the same way, it’s impossible to control what someone will see when they open a given PDF. In some cases, that could result in the wrong fonts being displayed or some image layers failing to render properly. But it may also prevent someone from viewing a file at all. For example, browser-based viewers that use the base PDF.js library without making any improvements to it often struggle to render lengthy or complex files.

When applications incorporate a built-in PDF reader, developers can ensure that every document viewed within that solution will look the same on every device (and that it will open in the first place!). This level of control is incredibly important for organizations looking to build a frictionless and compelling user experience.

Integrating a PDF Reader

By incorporating a PDF reader into their web-based applications, developers are able to both retain full control over the viewing experience and keep files within a protected environment. When users are interacting with the application, all PDF viewing can be handled by the built-in viewer rather than handed off to external software. This makes it easier to manage access effectively and limits the number of downloads. 

Since every user will be viewing documents through the same built-in PDF reader, developers can also craft a consistent experience across multiple platforms. With more and more people accessing their applications with mobile devices, it’s important for development teams to offer responsive viewing solutions that can accommodate various screen sizes and interfaces.

In order to maintain complete control over files and deliver better performance, a built-in PDF reader should be able to operate as an entirely client-side solution. Whether it’s running within an on-premises technology stack or as part of an application’s cloud deployment, a PDF viewer without any complicated dependencies never has to worry about connecting to a third-party service to facilitate viewing. 

But why stop at PDF viewing?

PDF Editing

Users often need to collaborate on PDF documents as well as view them, and giving them the ability to edit those documents presents a challenge for developers. A recent survey of developers suggests a disconnect between the PDF editing features available in most applications and what developers actually need to build out and enhance their own applications. So what’s the solution?

Third-party Integrated PDF Viewing and Editing

A PDF solution provider has already worked out the challenges associated with viewing and editing PDF documents within an application. They’ve also devoted their resources to improving their document capabilities and expanding features to offer greater flexibility.

A good third-party provider also offers extensive support during and after the implementation process. If the developer needs to add a new PDF-related capability to their application or if they encounter a problem, they can quickly resolve the issue by working with their provider rather than wasting valuable resources trying to identify and fix the problem themselves. That combination of expertise and service means that developers can spend more time focusing on their application’s unique features rather than continuously wrestling with PDF-related challenges.

Enhance Your Application with PDF Integrations from Accusoft

With more than three decades of experience managing documents and images, Accusoft has been building innovative PDF solutions since the format was first introduced. Whether you need to add flexible front-end viewing and editing features to your application or are looking to add powerful programmatic PDF capabilities into the back end of your software, we provide a wide range of PDF solutions that address multiple development needs.

To learn more about how Accusoft can solve your PDF document management challenges, talk to one of our PDF specialists today and find the integration that works best for your software project.

 

Native Excel Support

Despite the explosive growth of big data and sophisticated analytics platforms, a 2019 study by Deloitte found that 67 percent of business leaders are not quite comfortable using them to inform decision making. For many organizations, spreadsheets remain the preferred tool for managing data and evaluating trends. Developers looking to build the next generation of business applications can accommodate those tendencies by integrating native spreadsheet support for Microsoft Excel workbooks.

Excel Worksheets vs Excel Workbooks

Although the terms are sometimes used interchangeably or lumped together as “spreadsheets,” there is a key distinction between an Excel worksheet and an Excel workbook. A worksheet is a single spreadsheet, while a workbook is a file that can contain multiple worksheets, each on its own tab.

The difference may not be very important when viewing or sharing XLSX files natively in Microsoft Excel, but it can create serious challenges when rendering those files in another application. Without some way of accurately rendering dynamic spreadsheet data, viewers are often forced to resort to a static print preview image. This process makes the file viewable, but also leaves it “flattened” because all interactive elements are removed from the spreadsheet cells.

If the workbook contains worksheets with linked data (that is, cell data from one sheet is affected by cell data from another sheet), it’s critical that a viewing solution preserves the dynamic aspects of the file. The advantage of a spreadsheet is that it can serve as a working document. Without the ability to interact with it, users might as well simply copy and paste the data into a text document.
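The toy model below illustrates the point about linked data. It is not how Excel or any viewer stores workbooks; the Workbook class, sheet names, and cell references are invented for this sketch, with a Python callable standing in for a formula. A cell on one sheet sums two cells on another, and a flattened copy of the workbook goes stale as soon as the source data changes.

```python
from typing import Callable, Dict, Union

# A cell is either a plain number or a formula (modeled as a callable).
Cell = Union[float, Callable[..., float]]


class Workbook:
    def __init__(self, sheets: Dict[str, Dict[str, Cell]]) -> None:
        self.sheets = sheets

    def value(self, sheet: str, ref: str) -> float:
        """Evaluate a cell, running its formula if it has one."""
        cell = self.sheets[sheet][ref]
        return cell(self) if callable(cell) else cell

    def flattened(self) -> "Workbook":
        """Copy the workbook with every formula replaced by its current value."""
        return Workbook(
            {
                name: {ref: self.value(name, ref) for ref in cells}
                for name, cells in self.sheets.items()
            }
        )


book = Workbook(
    {
        "Sales": {"A1": 100.0, "A2": 250.0},
        # Summary!B1 depends on two cells from the Sales sheet.
        "Summary": {
            "B1": lambda wb: wb.value("Sales", "A1") + wb.value("Sales", "A2")
        },
    }
)
snapshot = book.flattened()

# Change the same source cell in both copies.
book.sheets["Sales"]["A1"] = 400.0
snapshot.sheets["Sales"]["A1"] = 400.0

print(book.value("Summary", "B1"))      # 650.0
print(snapshot.value("Summary", "B1"))  # 350.0
```

The live workbook recalculates the summary, while the flattened copy keeps reporting the old total.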

Managing Excel Workbooks with PrizmDoc Cells

PrizmDoc Cells provides several options for managing Excel workbooks, making it easy to transition back and forth between XLSX format and web browser viewing. Once a proxy route is set up within the application to send API calls to the PrizmDoc Cells server, three different commands can be used to manage Excel workbooks.

Upload Workbook

This API call adds a new XLSX file for viewing and editing. When a document is uploaded to the system, the server assigns a unique workbook ID to it so it can be found and rendered in the application’s viewer in the future. After uploading a workbook, a new session can be created using the workbook ID for viewing and editing purposes. 

Download Workbook

When PrizmDoc Cells displays a spreadsheet, it renders the XLSX file itself, but it doesn’t make any alterations to that file. As each session makes edits to the workbook, those changes are associated with the document ID rather than the original XLSX file, which preserves the integrity of the original spreadsheet. At some point, however, those edits may need to be saved into a new Excel workbook. 

The download API call converts the current session document so it can be downloaded as an XLSX file. File availability can be set during the download process to control who will have access to the new workbook.

Delete Workbook

Old versions of workbooks often need to be deleted for security reasons, usually because they contain confidential data. Since the original XLSX file remains safely within application storage, there often isn’t much sense in retaining workbook IDs that aren’t being used. The delete API call removes a workbook ID from the server. Once removed in this way, the workbook cannot be viewed, edited, or downloaded by PrizmDoc Cells.
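The lifecycle described in these three commands can be pictured with a small in-memory model. This is emphatically not the PrizmDoc Cells API: the WorkbookStore class and its method names are hypothetical, and the real product exposes these operations as HTTP calls whose routes and payloads are defined in its own documentation. The sketch only shows the bookkeeping idea of IDs, edits tracked apart from the original file, and deletion.

```python
import uuid
from typing import Dict


class WorkbookStore:
    """Hypothetical stand-in for a server that tracks workbooks by ID."""

    def __init__(self) -> None:
        self._sources: Dict[str, bytes] = {}
        self._edits: Dict[str, Dict[str, str]] = {}

    def upload(self, xlsx_bytes: bytes) -> str:
        """Register a file and return the unique ID assigned to it."""
        workbook_id = uuid.uuid4().hex
        self._sources[workbook_id] = xlsx_bytes
        self._edits[workbook_id] = {}
        return workbook_id

    def edit(self, workbook_id: str, cell: str, value: str) -> None:
        """Record an edit against the ID; the uploaded bytes stay untouched."""
        self._edits[workbook_id][cell] = value

    def download(self, workbook_id: str) -> Dict[str, object]:
        """Return the original bytes together with the edits made so far."""
        return {
            "source": self._sources[workbook_id],
            "edits": dict(self._edits[workbook_id]),
        }

    def delete(self, workbook_id: str) -> None:
        """Forget the ID; later calls with it raise KeyError."""
        del self._sources[workbook_id]
        del self._edits[workbook_id]


store = WorkbookStore()
workbook_id = store.upload(b"original xlsx bytes")
store.edit(workbook_id, "A1", "42")
result = store.download(workbook_id)
print(result["edits"])  # {'A1': '42'}
store.delete(workbook_id)
```

After the delete call, any attempt to edit or download with the same ID fails, which mirrors the behavior described above.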

Preserving Workbook Functionality

Since PrizmDoc Cells natively renders information contained in an XLSX file, it retains the dynamic elements that make spreadsheet workbooks so useful to organizations. Not only does it preserve proprietary business logic and formulas, but it also maintains the integrity of this information across multiple worksheets. Cell content can still be searched to quickly locate important text or data throughout the workbook.

For situations where proprietary formulas need to be protected, PrizmDoc Cells allows users to upload XLSX workbooks as values-only files, with all spreadsheet formulas removed. Also, any cells locked in an uploaded XLSX file will remain locked in PrizmDoc Cells to preserve workbook security.

True Spreadsheet Workbook Support for Your Applications

Many organizations continue to depend upon spreadsheet workbooks to manage their business. By providing feature-rich workbook support within their applications, developers can help them retain control over their proprietary spreadsheet formulas without sacrificing the functionality they expect from Excel. 

PrizmDoc Cells makes it easier than ever to share spreadsheet workbooks without having to rely upon Microsoft Excel dependencies. Shared XLSX files can remain safely within a secure application environment to prevent unauthorized downloads or troublesome version confusion. Get a first-hand look at how PrizmDoc Cells can enhance your application in our extensive online demo.