Technical FAQs

Question

I am combining multiple PDF documents together, and I need to create a new bookmark collection, placed at the beginning of the new document. Each bookmark should go to a specific page or section of the new document.
Example structure:

  • Section 1
    • Document 1
  • Section 2
    • Document 2

How might I do this using ImageGear .NET?

Answer

The approach is to add section divider pages to the result document. For example, if you merge two documents, you might have two sections, each containing a single document, like so…

  • Section 1
    • Document 1
  • Section 2
    • Document 2

…The first page is the first header page, followed by the pages of Document 1, then a second header page, then the pages of Document 2. So the first header page is at index 0, the first page of Document 1 is at index 1, the second header page is at index 1 + firstDocumentPageCount, and the first page of Document 2 is at index 2 + firstDocumentPageCount.
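The same index arithmetic extends to any number of documents. The following is a minimal sketch, not part of ImageGear: the SectionLayout class and DividerIndices method are hypothetical names, and the only assumption is that each document is preceded by exactly one divider page, as in the layout described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an ImageGear type): computes where each divider
// page lands when every source document is preceded by one divider page.
public static class SectionLayout
{
    // pageCounts[i] is the page count of source document i.
    // Returns the zero-based index of each divider page in the merged
    // document; document i starts on the page right after its divider.
    public static int[] DividerIndices(IList<int> pageCounts)
    {
        var dividers = new int[pageCounts.Count];
        int next = 0; // index the next divider page will occupy
        for (int i = 0; i < pageCounts.Count; i++)
        {
            dividers[i] = next;
            next += 1 + pageCounts[i]; // one divider page plus the document's pages
        }
        return dividers;
    }
}
```

For two documents with 3 and 5 pages, DividerIndices returns 0 and 4, so Document 1 starts at index 1 and Document 2 starts at index 5. That agrees with 1 + firstDocumentPageCount and 2 + firstDocumentPageCount.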

The following code demonstrates adding some blank pages to igResultDocument, inserting pages from other ImGearPDFDocuments, and modifying the bookmark tree such that it matches the outline above, with "Section X" pointing to the corresponding divider page and "Document X" pointing to the appropriate starting page number…

// Create new document, add pages
ImGearPDFDocument igResultDocument = new ImGearPDFDocument();
igResultDocument.CreateNewPage((int)ImGearPDFPageNumber.BEFORE_FIRST_PAGE, new ImGearPDFFixedRect(0, 0, 300, 300));
igResultDocument.InsertPages((int)ImGearPDFPageNumber.LAST_PAGE, igFirstDocument, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearPDFInsertFlags.DEFAULT);
igResultDocument.CreateNewPage(igFirstDocument.Pages.Count, new ImGearPDFFixedRect(0, 0, 300, 300));
igResultDocument.InsertPages((int)ImGearPDFPageNumber.LAST_PAGE, igSecondDocument, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearPDFInsertFlags.DEFAULT);

// Add first Section
ImGearPDFBookmark resultBookmarkTree = igResultDocument.GetBookmark();
resultBookmarkTree.AddNewChild("Section 1");
var child = resultBookmarkTree.GetLastChild();
int targetPageNumber = 0;
setNewDestination(igResultDocument, targetPageNumber, child);

// Add first Document
child.AddNewChild("Document 1");
child = child.GetLastChild();
targetPageNumber = 1;
setNewDestination(igResultDocument, targetPageNumber, child);

// Add second Section
resultBookmarkTree.AddNewChild("Section 2");
child = resultBookmarkTree.GetLastChild();
targetPageNumber = 1 + igFirstDocument.Pages.Count;
setNewDestination(igResultDocument, targetPageNumber, child);

// Add second Document
child.AddNewChild("Document 2");
child = child.GetLastChild();
targetPageNumber = 2 + igFirstDocument.Pages.Count;
setNewDestination(igResultDocument, targetPageNumber, child);

// Save
using (FileStream stream = File.OpenWrite(@"C:\path\here\test.pdf"))
{
    igResultDocument.Save(stream, ImGearSavingFormats.PDF, 0, 0, igResultDocument.Pages.Count, ImGearSavingModes.OVERWRITE);
}

...

private ImGearPDFDestination setNewDestination(ImGearPDFDocument igPdfDocument, int targetPageNumber, ImGearPDFBookmark targetNode)
{
    ImGearPDFAction action = targetNode.GetAction();
    if (action == null)
    {
        action = new ImGearPDFAction(
            igPdfDocument,
            new ImGearPDFDestination(
                igPdfDocument,
                igPdfDocument.Pages[targetPageNumber] as ImGearPDFPage,
                new ImGearPDFAtom("XYZ"),
                new ImGearPDFFixedRect(), 0, targetPageNumber));
        targetNode.SetAction(action);
    }
    return action.GetDestination();
}

(The setNewDestination method is a custom method that abstracts the details of adding the new destination.)

Essentially, the GetBookmark() method returns an instance representing the root of the bookmark tree, whose children are subtrees themselves. Thus, we can add a new child to an empty tree, then get that child with GetLastChild(). Then, we can set the action for that node to a new "GoTo" action that navigates to the specified destination. When saved to the file system, this should produce a PDF with the bookmark structure below…

Bookmarks example

Note that you may need to use the native Save method (NOT SaveDocument) described in the product documentation in order to save a PDF file with the bookmark tree included. You can also read more about Actions in the PDF Specification.

Question

I am trying to deploy my ImageGear Pro ActiveX project and am receiving an error stating

The module igPDF18a.ocx failed to load

when registering the igPDF18a.ocx component. Why is this occurring, and how can I register the component correctly?

Answer

To register your igPDF18a.ocx component, run the following command:

regsvr32 igPDF18a.ocx

If you receive an error stating that the component failed to load, then that likely means that regsvr32 is not finding the necessary dependencies for the PDF component.

The first thing you will want to check is that you have the Microsoft Visual C++ 10.0 CRT (x86) installed on the machine. You can download this from Microsoft’s site here:

https://www.microsoft.com/en-us/download/details.aspx?id=5555

The next thing you will want to check for is the DL100*.dll files. These files should be included in the deployment package generated by the deployment packaging wizard if you included the PDF component when generating the dependencies. These files must be in the same folder as the igPDF18a.ocx component in order to register it.

With those dependencies, you should be able to register the PDF component with regsvr32 without issue.
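Before running regsvr32, it can help to confirm that the files are where they need to be. The following is a minimal sketch; PdfComponentCheck and HasPdfDependencies are hypothetical names, not part of ImageGear. It only checks that igPDF18a.ocx and at least one DL100*.dll file sit in the same folder. It does not verify that the Visual C++ runtime is installed or that every required DLL is present.

```csharp
using System;
using System.IO;

// Hypothetical pre-check (not an ImageGear utility).
public static class PdfComponentCheck
{
    // Returns true when the folder contains igPDF18a.ocx and at least one
    // file matching DL100*.dll alongside it.
    public static bool HasPdfDependencies(string folder)
    {
        if (!Directory.Exists(folder))
        {
            return false;
        }
        bool hasOcx = File.Exists(Path.Combine(folder, "igPDF18a.ocx"));
        bool hasDl100 = Directory.GetFiles(folder, "DL100*.dll").Length > 0;
        return hasOcx && hasDl100;
    }
}
```

If HasPdfDependencies returns false for the deployment folder, regenerate the deployment package with the PDF component included, or copy the missing files next to the OCX before registering.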

As speed and efficiency have become increasingly vital for business success, it’s hardly a surprise that organizations across many industries have turned to white labeling as a way of retaining their competitive edge. While white labeling can be found in every industry, it’s proved to be incredibly important in the technology sector, where many independent software vendors rely upon white label software to build better applications and solutions.

How Does White Labeling Work?

In many instances, organizations need to launch a product quickly and may not have expertise in some aspect of their business. A software developer that specializes in FinTech solutions for processing loan applications, for example, may have the machine learning tools to sift through documents quickly but lack the viewing and editing features that would allow users to collaborate securely and effectively.

White labeling is a process where one company purchases a product from another company and then rebrands it for its own use. For physical products, this usually means repackaging and reselling something, but with digital products, the rebranding typically involves customizing the user interface (UI) to incorporate it into an existing application.

In the previous example, the FinTech developer might turn to a product like PrizmDoc Viewer to integrate secure viewing capabilities into their platform. Using PrizmDoc Viewer’s white label software features, the company could rebrand the UI with its own logo and terminology. The average user, then, would never know that some aspects of their FinTech solution incorporate products made by another software manufacturer.

4 Benefits of Using White Label Software

Companies of all sizes turn to white label solutions when building their applications. Here are some of the reasons why they choose this option instead of building everything they need from scratch:

1. Rapid Deployment

One of the obvious advantages of adopting a white label software solution is the speed of deployment. Building new features within an application takes both time and developer resources. If everything needs to be coded and deployed from scratch, production timelines can quickly extend indefinitely. As deadlines are pushed back, developers may even be pulled away from working on more innovative software features to build basic functionality into their applications. This vicious cycle makes it difficult to bring a product to market in time to capitalize on opportunities.

With easy-to-integrate white label software, companies can rapidly integrate the functionality they need into their applications and dedicate more resources to the novel features that will set them apart in a crowded marketplace. The ability to deploy core capabilities quickly means that teams can get to a minimum viable product faster and bring their software to market. Since white labeling allows them to rebrand and customize their integration to match the rest of the application interface, end users still enjoy a seamless experience.

2. Proven Functionality

When production deadlines are tight, it usually doesn’t make sense to have developers spending their time building a solution that already exists as a ready-made integration. Although open-source tools can be quite attractive, they don’t always provide the robust features expected of modern applications. In many cases, development teams have to spend valuable time building upon open-source solutions just to get them to work properly within an application framework.

White label software provides proven functionality right out of the box, allowing developers to quickly integrate the features they need and get back to working on other priorities. It also offers a more specialized approach to application needs. Rather than trying to build something new or adapt a similar solution, developers can select the exact features they need, implement them, and know that they’ll work as promised. Because it is a supported product, white label software also provides more peace of mind when it comes to updates and patching vulnerabilities.

3. Budget Friendly

Developing new software features is an expensive undertaking. It requires companies to hire developers with the right expertise, dedicate hardware and processing capabilities, and run multiple rounds of testing just to get to a viable state, but the costs don’t stop there. Once new functionality is built, it will require ongoing maintenance and support to ensure that it continues to work as intended and stays secure against potential threats. Those additional costs can quickly become burdensome for a software company that simply wants to incorporate common features into its application.

With white label software, companies can have all the benefits of third party support without compromising their user experience. White label API solutions integrate seamlessly into an application and the company can turn to the vendor for support when something goes wrong or when new features are needed. Eliminating ongoing maintenance from the budget means that teams can spend more of their resources on delivering a better overall product to their customers. 

4. Flexible Scalability

Software applications often change significantly throughout their lifecycle. What begins as a small program with only a few features can scale very quickly into an enterprise-grade workhorse that must meet the needs of multiple departments. Having a solution in place that can grow and change along with business needs is vital for organizations looking to retain their flexibility.

Customizable white label software makes it easy for companies to grow their applications along with their business. They can begin with a modest feature set and then implement additional tools as user needs change over time. This versatility also frees up developers to build innovative solutions that may require features that are not being used in an existing application, but could easily be enabled when the time comes. Having flexible, scalable white-label technology built into a platform early on opens up a wide range of possibilities for future development.

The Behind-the-Scenes Ingredient to Your Application’s Success

As a software manufacturer specializing in API technologies for document processing, conversion, and automation, Accusoft has spent many years building solutions that work “under the hood” to enhance our customers’ applications. That’s why the PrizmDoc Suite of products incorporates white label software features to help them blend seamlessly into your existing platform. Whether you’re looking to add new capabilities or need to incorporate functionality quickly to get your products to market faster, our flexible integrations can help solve your document management challenges on your terms. Talk to our team today to find out which solution is right for you.

Few organizations will view the final weeks of 2020 as a bittersweet moment. In addition to the staggering human toll inflicted by the COVID-19 pandemic, entire industries saw longstanding business models upended, forcing companies to completely rethink their relationships with employees, vendors, and customers. The financial services industry was no exception, and 2020 saw FinTech becoming more important than ever as firms rushed to embrace digital transformation in response to the ongoing crisis. Many of these trends appear poised to continue well into 2021 and beyond. 

FinTech Defined

FinTech is short for “financial technology,” but the term itself is applied quite broadly throughout the financial services industry. It can be used to refer to a new generation of non-traditional startup companies focused on building digital tools that allow people to manage their finances in new ways that disrupt established industry practices. The term is also sometimes used to describe the technology itself, however, especially since established financial organizations are investing heavily in innovative applications and services of their own.

FinTech Trends for 2021

Although 2020 is sure to be remembered as a year of unprecedented disruption, 2021 might well come to be known as a year of remarkable adaptation and transformation. Having developed innovative digital strategies to navigate a more volatile economic landscape, organizations must now take up the challenge of putting those plans into practice.

FinTech developers need to keep an eye on these trends as they build new applications and services in order to provide the functionality and performance demanded by the financial industry. Many established firms will be taking a long look at their infrastructure and technology solutions to assess whether or not their current systems are up to the challenge of digital transformation. If their existing platforms fall short, they will need to either seek out new FinTech products with more robust feature sets or explore options for integrating new capabilities into their legacy software.

Top 5 FinTech Trends to Watch in 2021

1. Customer-Centric Applications

The proliferation of FinTech solutions has brought customers to the forefront of every financial organization’s thoughts. Where the financial industry once designed processes and applications to suit their own needs, today they must focus on delivering a high-quality customer experience if they want to remain competitive in a crowded marketplace. The process often begins with reducing friction wherever possible to help end-users get the products and services they need faster. With customers increasingly interacting with the financial industry across multiple channels, FinTech developers must build solutions that strengthen those connections and expand their potential.

Eliminating manual processes, cutting down on external software dependencies, and automating routine tasks will continue to be a major point of emphasis for FinTech applications. Customers no longer have the patience to repeatedly fill out lengthy forms or go through the frustrating process of downloading, printing, signing, and scanning documents. By building document viewing, file conversion, and data capture capabilities into their applications, FinTech developers can provide firms with a unified digital solution that addresses multiple needs and streamlines their customer experience.

2. Digital-First Collaboration

According to an IDG study on the enduring business impacts of the COVID-19 pandemic, about 40% of employees are expected to be working remotely on a semi-permanent basis as of January 2021. That means financial organizations will continue to need digital tools in place to provide secure access to files and facilitate collaboration. Physical documents must first be converted into a variety of digital formats with high levels of accuracy and then made available to remote users without compromising data integrity or creating confusion over version history.

Without a dedicated solution on hand for viewing, editing, and managing documents, users are forced to resort to a variety of ad hoc workarounds and third party software solutions that can quickly compromise data security and increase the likelihood of errors. By integrating those features into their FinTech applications, developers can help firms keep all of their documents and files safely within a secure infrastructure while still making them available through easy-to-use web-based API tools.

3. Big Data Management

Financial organizations continue to collect huge amounts of data in the course of their business. Some of this data is unstructured and must be processed using powerful analytics tools to identify important trends and potential risks that can help firms make better strategic decisions. But they also gather a great deal of structured data, typically from forms like loan applications, tax documents, and bank statements. Managing all of this information more efficiently will be an important goal for 2021 because having good data insights is essential for identifying opportunities, optimizing products and services, and automating essential services.

FinTech developers can help improve data processing by building applications capable of extracting information quickly and accurately. Financial data algorithms are quite good at identifying different types of data and sorting it into the proper place for analysis, but they’re often slowed down by documents that are damaged or difficult to read. Thanks to software integrations that provide robust image cleanup, document alignment, and form recognition tools, FinTech applications can ensure that firms are starting with the cleanest possible source data when extracting information for processing.

4. Pandemic Proofing

Although there are several promising COVID-19 vaccines on the horizon, challenges with supply and distribution will keep most companies operating under the same social distancing and remote workplace guidelines they put in place in 2020 for much of the year. Even if restrictions are lifted earlier than expected, the risk-averse financial industry will continue to think about how to avoid similar disruptions by implementing paperless processes and electronic data capture options. Just as retailers and manufacturers are rethinking their supply chain infrastructure, financial services companies must reassess their FinTech applications in light of recent challenges.

Developers can help the financial industry better “pandemic proof” their processes by integrating better document viewing, file conversion, and data capture tools into their software solutions. Not only can they automate traditionally time-consuming (and error-prone) manual data entry tasks, but they can also build in additional functionality to auto-generate data for new contracts and allow people to sign documents digitally to eliminate the need for face-to-face meetings. 

5. Banking Partnerships

Banks and other traditional financial institutions are increasingly partnering with FinTech startups to reach new customers and engage with existing clients over new channels. As Deloitte noted in a recent study, the pandemic has removed many of the obstacles to digital transformation in the financial industry and forced many established firms to pour tremendous resources into their tools and infrastructure. But as banks engage with innovative startups, they will need to find ways to integrate operations and data quickly to remain competitive and roll out new services successfully.

That integration process will be easier if they have flexible software solutions in place that can navigate multiple file types, perform cleanup and conversion, and extract essential data quickly and accurately. Whether they’re building that functionality into entirely new applications or integrating features into existing legacy systems, FinTech developers will play a key role in helping financial organizations accelerate their merger and partnership timetables so they can begin reaping the benefits more quickly. 

Solving Your FinTech Challenges with Accusoft

Accusoft’s collection of RESTful APIs and SDKs provides FinTech developers with the tools they need to build comprehensive content processing, conversion, and automation solutions into software applications. Whether you’re using PrizmDoc Suite to view, edit, and convert documents directly inside your financial applications, capturing valuable financial data from various form types with FormSuite for Structured Forms, or embedding powerful image cleanup, OCR, and annotation tools into your application with ImageGear, our family of software integrations allows you to add the functionality your FinTech solutions need to meet the challenges of 2021 and beyond.

To learn more about how our software tools can enhance your FinTech applications, talk to one of our integration experts today.

Although often considered a bit old fashioned, the insurance industry has made great strides in recent years to adapt to the changing needs of its customers. The latest generation of insurance customers expects faster service, better support, and more options from providers. Given these pressures, it’s no surprise that InsurTech developers have found ample opportunities to deliver solutions that help insurance firms better manage their workflows and create better customer experiences.

Despite the successes of this digital transformation, however, there are still a number of challenges that InsurTech developers face when building new applications. Investing heavily in creating powerful AI and big data tools might help those platforms stand out from the crowd, but they won’t find much success with firms if they don’t also provide the core functionality organizations need to service their customers. 

That’s why many InsurTech developers are turning to versatile SDK and API integrations to expand their feature sets without compromising their development timelines.

4 Major Challenges of InsurTech Applications

1. Security and Privacy

As the insurance industry continues to shift toward digital processes and platforms, it’s become more important than ever for InsurTech applications to keep sensitive data secure. While most organizations do invest in cybersecurity protections, they often don’t realize how their own practices could potentially pose a risk to customer information. This is especially true of insurers that rely on third-party programs for various tasks like document viewing and editing. Take, for instance, the case of Folksam Group, which inadvertently shared client data from as many as one million customers with Google, Facebook, LinkedIn, Microsoft, and Adobe in late 2020. 

2. File Management

Today’s insurers are receiving all kinds of documents, files, and images from their customers, which creates something of a document dilemma. A single auto accident claim, for instance, might have valuable information spread across multiple PDFs, Word documents, spreadsheet files, scanned images of hand-written forms, and image files. In order to process claims quickly and effectively, firms need InsurTech solutions that provide an all-in-one solution that can handle a broad array of file formats. Without these file management tools, insurers will be forced to use multiple programs to meet their needs, which creates inefficient dependencies and increases security risks.

3. Data Collection

Insurance companies gather quite a bit of information from form applications, both in physical and digital formats. Unfortunately, transferring that information from a form document into an InsurTech system is often a laborious manual process. Not only is manual data collection time consuming, it also increases the likelihood of human error. Even when firms do implement an InsurTech solution with forms processing capabilities, however, they often lack the capability to read certain types of form fields, especially those completed by hand. The ability to adapt to new form templates is also critical for organizations that want to invest in automation. 

4. Remote Collaboration

The COVID-19 pandemic may have forced insurance offices to rapidly embrace a remote work strategy, but many firms had already been investing in some form of hybrid work model for years. Nationwide was able to transition 98 percent of its workforce to remote status precisely because the company already had the technology solutions in place to allow insurance agents to work from home. Without some way of facilitating remote collaboration directly through InsurTech applications, organizations end up relying on email, which poses serious security concerns. Furthermore, with multiple copies of a document being distributed and downloaded, it quickly becomes difficult to know which version incorporates the most up-to-date changes.

SDK and API InsurTech Solutions

Building new functionality into an application always involves a tradeoff. When developers choose to code something from scratch, that means pulling team members away from another project or extending the product’s release timeline. In a fast-moving industry where InsurTech developers are racing competitors to be the first to market, it doesn’t make sense to design and build every aspect of an application in-house. 

Rather than pulling valuable development resources away from their innovative InsurTech features, developers can solve common insurance challenges much faster with SDK toolkits and API integrations. 

Secure File Viewing

The easiest way for InsurTech solutions to keep documents secure is to integrate HTML5 viewing capabilities directly into the application. Rather than being forced to download or open a file for viewing in a third-party application, employees can view multiple document formats natively. This is critical because it means no data will be shared with third-party programs.  Since the files remain safely within the secure InsurTech environment, firms can also control the level of access to any document, which prevents unauthorized individuals from downloading or viewing the contents. Thanks to API-based integrations like Accusoft’s PrizmDoc Viewer, InsurTech developers can help their applications safely view more than 100 unique file types without any third-party dependencies.

Data Capture

By integrating forms processing capabilities into their applications, InsurTech developers can provide their clients with powerful tools that allow them to gather essential data quickly and accurately. As the essential connective tissue between customers and insurance databases, form field recognition integrations use OCR technology to intelligently identify form data and extract it for processing. They can also be set up to recognize a wide range of insurance forms, so that incoming documents are identified and scanned quickly and processing workflows stay streamlined. Accusoft’s FormSuite for Structured Forms even goes a step further by incorporating powerful image cleanup functionality to ensure that data will be extracted as accurately as possible.

File Conversion

In order to meet the file management challenges of today’s insurance providers, InsurTech developers need document and image processing integrations that can read and write multiple file formats. Information spread across multiple documents, emails, or even texts can be processed using OCR technology, and then consolidated and converted into a variety of formats for easy reference and collaboration. Rather than juggling several files with different dependencies, an SDK integration like Accusoft’s ImageGear can easily output processed files in PDF, RTF, XML, or DOCX format for viewing and editing within a single application.

Editing and Annotation

Providing secure document viewing capabilities solves only one half of the insurance collaboration challenge. InsurTech applications also need to provide both internal and external stakeholders with the ability to edit and mark up documents throughout the application and claims process. Content processing integrations can allow authorized users to make changes to documents completely within their InsurTech solution and review markups and comments from other collaborators.

Since all editing occurs within the application itself, there’s no need to worry about anyone downloading a document to make changes locally and creating confusion over which version is the most up-to-date. Redactions may also be necessary to hide private or confidential information from unauthorized viewers. As an added benefit, PrizmDoc Viewer’s editing features allow users to make a variety of markups and redactions while preserving the integrity of the original file.

Accelerate Your InsurTech Application Development with Accusoft

Accusoft’s collection of powerful SDK toolkits and API integrations provides innovative InsurTech developers with the resources they need to solve core insurance industry challenges. By implementing proven functionality into their applications, project managers can streamline the development process and dedicate more resources to the innovative features that will set their platform apart from the competition.

Whether you’re looking to incorporate versatile document viewing and editing or need a more accurate forms processing solution, Accusoft’s family of InsurTech SDKs and APIs can help your development team get to market faster. Learn more about what our products can do for your application in our InsurTech fact sheet.

 

Question

I am trying to perform OCR on a PDF created from a scanned document. I need to rasterize the PDF page before importing the page into the recognition engine. When rasterizing the PDF page I want to set the bit depth of the generated page to be equal to the bit depth of the embedded image so I may use better compression methods for 1-bit and 8-bit images.

ImGearPDFPage.DIB.BitDepth will always return 24 for the bit depth of a PDF. Is there a way to detect the bit depth based on the PDF’s embedded content?

Answer

To do this:

  1. Use the ImGearPDFPage.GetContent() function to get the elements stored in the PDF page.
  2. Then loop through these elements and check if they are of the type ImGearPDEImage.
  3. Convert the image to an ImGearPage and find its bit depth.
  4. Use the highest bit depth detected from the images as the bit depth when rasterizing the page.

The code below demonstrates how to detect the bit depth of each page in a PDF document, perform OCR, and save the output with compression.

private static void Recognize(ImGearRecognition engine, string sourceFile, ImGearPDFDocument doc)
{
    using (ImGearPDFDocument outDoc = new ImGearPDFDocument())
    {
        // Import pages
        foreach (ImGearPDFPage pdfPage in doc.Pages)
        {
            int highestBitDepth = 0;
            ImGearPDEContent pdeContent = pdfPage.GetContent();
            int contentLength = pdeContent.ElementCount;
            for (int i = 0; i < contentLength; i++)
            {
                ImGearPDEElement el = pdeContent.GetElement(i);
                if (el is ImGearPDEImage)
                {
                    // Create an ImGearPage from the embedded image and find its bit depth
                    int bitDepth = (el as ImGearPDEImage).ToImGearPage().DIB.BitDepth;
                    if (bitDepth > highestBitDepth)
                    {
                        highestBitDepth = bitDepth;
                    }
                }
            }
            if (highestBitDepth == 0)
            {
                // If no images were found on the page, or the images are embedded deeper
                // in containers, fall back to a default bit depth of 24 to be safe
                highestBitDepth = 24;
            }
            ImGearRasterPage rasterPage = pdfPage.Rasterize(highestBitDepth, 200, 200);
            using (ImGearRecPage recogPage = engine.ImportPage(rasterPage))
            {
                recogPage.Image.Preprocess();
                recogPage.Recognize();
                ImGearRecPDFOutputOptions options = new ImGearRecPDFOutputOptions()
                {
                    VisibleImage = true,
                    VisibleText = false,
                    OptimizeForPdfa = true,
                    ImageCompression = ImGearCompressions.AUTO,
                    UseUnicodeText = false
                };
                recogPage.CreatePDFPage(outDoc, options);
            }
        }
        outDoc.SaveCompressed(sourceFile + ".result.pdf");
    }
}

For the compression type, I would recommend setting it to AUTO. AUTO will set the compression type depending on the image’s bit depth. The compression types that AUTO uses for each bit depth are: 

  • 1 Bit Per Pixel – ImGearCompressions.CCITT_G4
  • 8 Bits Per Pixel – ImGearCompressions.DEFLATE
  • 24 Bits Per Pixel – ImGearCompressions.JPEG

Disclaimer: This may not work for all PDF documents because of how some PDFs are structured. If you’re unfamiliar with how PDF content is structured, we have an explanation in our documentation. The implementation above only checks the first layer of the page content, so it will not detect images that are embedded inside containers.

However, this should work for documents created by scanners, as the scanned image should be embedded in the first PDF layer. If you have more complex documents, you could write a recursive function that goes through the layers of the PDF to find the images.
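The recursive approach mentioned above can be sketched independently of the PDF API. The types below (Element, ImageElement, ContainerElement) are hypothetical stand-ins, not ImageGear classes. Which ImageGear element types hold nested content, and how to read it, is left as an assumption to check against the product documentation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for PDF page content; these are NOT ImageGear types.
public abstract class Element { }

public sealed class ImageElement : Element
{
    public int BitDepth { get; private set; }
    public ImageElement(int bitDepth) { BitDepth = bitDepth; }
}

public sealed class ContainerElement : Element
{
    public List<Element> Children { get; private set; }
    public ContainerElement() { Children = new List<Element>(); }
}

public static class BitDepthSearch
{
    // Depth-first walk over the content tree. Returns the highest image
    // bit depth found at any nesting level, or 0 when there are no images.
    public static int HighestBitDepth(IEnumerable<Element> elements)
    {
        int highest = 0;
        foreach (Element el in elements)
        {
            var image = el as ImageElement;
            var container = el as ContainerElement;
            if (image != null)
            {
                highest = Math.Max(highest, image.BitDepth);
            }
            else if (container != null)
            {
                // Recurse into nested content
                highest = Math.Max(highest, HighestBitDepth(container.Children));
            }
        }
        return highest;
    }

    // Applies the same fallback as the sample above: 24 when nothing is found.
    public static int RasterBitDepth(IEnumerable<Element> elements)
    {
        int highest = HighestBitDepth(elements);
        return highest == 0 ? 24 : highest;
    }
}
```

With a 1-bit image at the top level and an 8-bit image nested inside a container, RasterBitDepth returns 8, whereas a first-layer-only check would report 1. For content with no images, it returns 24.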

The above code will set the bit depth to 24 if it wasn’t able to detect any images in the first layer, just to be on the safe side.