Technical FAQs

Question

What file types are supported by Accusoft PDF Viewer?

Answer

The viewer currently supports only PDF files based on the PDF 32000 specification. If you need wider-ranging document support, our PrizmDoc Server platform can help!

Question

Where can I find the documentation for Accusoft PDF Viewer?

Answer

Our product documentation is located here.

convert pdf

PDFs are everywhere. Vice calls them “the world’s most important file format,” and that’s not far off the mark. The sheer number of documents converted to, from, and often back to PDFs is astounding. The hard truth? They’re also frustrating to work with. Start a Google search with the word “convert” and three of the top five results involve PDFs. 

The Portable Document Format lives up to its name by making it easy for users to attach and send documents across their organizations, but PDFs often run into problems when it comes to conversion, collaboration, and communication. Many tools offer piecemeal PDF functionality yet lack a complete set of critical capabilities, which forces software engineers to use multiple software solutions for seemingly simple tasks.

ImageGear offers a different take on the standard software development kit (SDK) designed to help developers maximize their PDF potential. Here’s how it works. 


The Value of PDF Conversion

While PDF conversion is one of the top sought-after functionalities, there’s another area that’s often overlooked: modifying the characteristics of PDFs on-screen. With companies now handling PDFs from multiple sources that may include everything from computer-generated form data to handwritten information and images, it’s no surprise that staff encounter a wide variety of viewing issues.

ImageGear PDF helps solve these problems by allowing users to call the shots on PDF content at scale with features such as:

  • Conversion
  • Metadata Management
  • Content and Font Editing
  • Text Extraction
  • PDF Watermarking
  • Container, Dictionary, and Layer Creation
  • 3D Asset Modification

ImageGear PDF also helps improve document processing with document cleanup and advanced optical character recognition (OCR). With the ability to encrypt and decrypt entire images (or part of an image), automatic ImageClean correction of white text blocks, borders, and inverted images, plus intelligent re-sizing, any PDF can be cleaned and made more readable for the user. 

OCR support for almost any document type is also a benefit. This includes documents produced on typewriters, dot-matrix printers, ink-jet printers, and laser printers, as well as photocopied, scanned, and faxed documents. ImageGear PDF helps users control and customize multiple PDF variables, making it a fully functional PDF conversion solution for your application.


PDF Pain Points

One of the biggest PDF frustrations? The inability to break apart and combine PDF documents. Let’s imagine you have a massive legal PDF or in-depth medical file. In these circumstances, professionals only need a portion of the PDF, but without the right tools they’re stuck sending entire files when all they need is a single page. In other cases, employees might have a host of related PDFs that are part of the same project, but can’t be easily combined to save space and time.

ImageGear PDF has you covered with the ability to easily delete or insert PDF pages, render pages in a single PDF, split a PDF, merge two or more PDFs into a single file, or even merge specific pages from two or more PDFs into a single PDF. This not only makes a massive difference in time spent working with PDF documents, it helps reduce unnecessary storage and transmission of multiple files. 


Convert PDF: Multiple File Formats for Conversion

Conversion is critical for PDF success. Instead of creating complexity by forcing end-users to stick with original file formats, implementing an SDK with cutting-edge conversion empowers corporate consistency and saves on storage space. ImageGear PDF supports a host of common file formats for conversion including Microsoft Office, JPEG 2000, CAD, and SVG.

Of course, no feature-forward PDF framework is complete without robust annotation, redaction, and commenting capabilities. These features make it easy for other users to see exactly what’s been changed, when, and why, and they provide a critical, auditable paper trail to meet evolving compliance and regulatory standards.


PDF Functionality for Your Application

Best of all, ImageGear isn’t designed to replace your current software, but to integrate alongside existing workflows. Rather than adding another application to already-overloaded IT arsenals, straightforward SDK integration means everything happens within your own application, making it easy for everyone to find exactly what they’re looking for within familiar territory. Need help jumpstarting your SDK deployment? Check out our full list of ImageGear .NET samples for ASP.NET, CAD, OCR support, and more.

PDFs remain eternally popular and continually frustrating. Solve for document viewing, split and merge, and conversion issues and streamline employee efforts with ImageGear.

document conversion

Not all file formats are created equal. Some — like the .docx files produced by the ever-popular Microsoft Word — are ideal for creating and editing text-based documents, while others offer the high resolution necessary for medical images or the security required for legal case files.

Challenges emerge, however, when businesses need the same information, but require a different file format. Recreating the document or image from scratch is a waste of time and resources, while leveraging free online programs to make the switch introduces potential security risks. As noted by 9to5Mac, 23 file conversion apps for iOS were recently found to completely lack encryption, putting both information and organizations at risk. Companies need to simplify the switch with robust document conversion solutions capable of delivering both speed and security.


Scale of the Switch

A quick Google search for the phrase “convert to PDF” turns up more than 3 billion results. It makes sense. PDF documents can be easily password-protected and made read-only, making them ideal for companies that need to share data but don’t want it modified.

As noted above, Office files such as .docx remain common for business use along with other Office staples such as .xls and .ppt, but businesses are regularly tasked with converting other file types — often sent by customers or suppliers — into Microsoft-friendly formats.

The result is a landscape full of “free” tools that are long on document conversion promises but short on details about what’s supported, how conversion takes place, and who has access to your data. Given the scale of document conversion requests, the use of free tools can bridge functional gaps, even as they create more distance between documents and key defensive measures. 

Application switching is also a challenge. Since most free tools convert only a subset of file types, users may need to navigate multiple apps and conversion steps for a single file. As noted by Forbes, this continual app switching can waste up to 32 days’ worth of productivity per year.


Speaking the Same Language

Accusoft’s ImageGear SDK solves the conversion challenge by putting more than 100 file types under one digital roof. Some of the most popular conversion processes include:

  • Microsoft Office: ImageGear offers support for Word, Excel, PowerPoint, JPG, and more with enhanced rendering for near-native Office support.
  • CAD: Convert AutoCAD files such as DWG, DXF, or DGN to PDF, JPEG, and SVG. CAD conversion supports both 2D and 3D images along with changes in light source, layers, and perspective.
  • Adobe/PDF: As noted above, “convert to PDF” is one of the web’s most popular searches. Easily convert to and from EPS, PDF, or PDF/A with ImageGear’s comprehensive PDF API.
  • Raster Images: Edit, compress, and annotate dozens of raster files including TIFF, JPEG, PNG, PSD, RAW, and PDF.
  • Medical Images: Part of the ImageGear Collection, ImageGear Medical preserves medical image consistency and quality with conversion to and from DICOM, JPEG 2000, and other popular file types. ImageGear Medical also includes full DICOM metadata support.
  • Vector Images: Dozens of vector images including SVG, EPS, PDF, and DXF can be easily converted with ImageGear.

Find the full list of supported file types here.


Security by Design

Data security matters. From legal firms to financial institutions, the reputational risks and regulatory penalties facing companies that don’t secure data by default are on the rise. The ability to quickly and seamlessly convert files from editable to read-only formats both enhances document security and improves overall defense. 

The easiest way to achieve this goal? Integrated, in-app file conversion. 

By removing the external risk of third-party apps and leveraging advanced SDKs that integrate into your own secure software, organizations can both protect the process of document conversion and deploy the annotations, permissions, and redactions necessary to keep documents safe. Simplify the switch. Deliver in-app, secure document conversion on-demand with ImageGear.

 

After years of discussion and debate over the state of digital transformation in the legal field, 2020 delivered something of an ultimatum to an industry that has proven historically resistant to drastic change. The COVID-19 pandemic profoundly altered the way many law firms do business, forcing them to seek out a variety of LegalTech solutions to survive in a new environment. Many of these changes are likely to remain firmly entrenched in the coming years, so it’s worth taking a look back at the factors driving them.

COVID-19 and Change in the Legal Industry

From an outsider’s perspective, the legal industry might have appeared to be uniquely well-suited to adapt to the pandemic. Lawyers are high-skill workers with an extensive range of technology solutions at their fingertips to facilitate remote work. It’s easy to imagine a scenario in which many aspects of the legal process, from client intake to discovery to filing documents with the court, are handled virtually, without anyone needing to set foot outside their home office.

The reality, unfortunately, isn’t so simple. While it’s true that there are several innovative tools available that could support remote work, the legal industry has long struggled to adopt them at scale. Part of that has to do with the culture of law firms themselves, which tend to be driven by a traditional business model that hasn’t changed much since the 20th century. 

Although the legal industry has benefited from technology throughout its history, the use of that technology has typically fallen not to the lawyers themselves, but to their support staff. From printing out reams and reams of documents to manually tracking time in minute-based increments, many lawyers cling to outdated and inefficient practices out of habit and aversion to change.

Although the Great Recession caused some disruption to the legal industry, the impact was not significant or lasting enough to make firms fundamentally rethink their billing and technology usage. That has changed in 2020. As the industry struggles to adapt to the realities of the pandemic, firms have been forced to engage in what Jennifer Leonard, Chief Innovation Officer for University of Pennsylvania’s Carey Law School, describes as “forced experimentation.” This includes implementing technologies already quite common in other industries, such as video conferencing tools and cloud-based collaboration software, as well as taking a more customer-centric approach to delivering legal services.

Key LegalTech Trends in 2020

The rapid transition to the remote workplace has forced legal firms to implement several years’ worth of technological change into the span of a few short months. Here are a few key LegalTech trends and needs that defined the industry in 2020.

Secure Online Communication

Successful transition to a remote work environment requires the right software tools to facilitate secure communication and collaboration. Lawyers not only need to be able to stay in direct contact with clients and colleagues, but also with the court system itself. With many judicial offices shuttered during the early months of the pandemic, courts have greatly expanded their use of e-filing, e-service, and online dispute resolution software. Various video conferencing platforms have also made it possible to conduct court hearings remotely. In a historic move, even the US Supreme Court chose to hear arguments over telephone.

With so many lawyers working remotely, however, security has become more important than ever. That’s because home networks and personal devices can present a variety of security risks. Sharing documents over unencrypted email rather than through more secure LegalTech applications could potentially compromise secure client information or legal strategies. That has driven firms to implement digital solutions that they might have been hesitant to adopt as recently as a year ago.

Online Legal Research

The research and discovery process has gradually been moving online for quite some time. According to research by the American Bar Association (ABA), nearly 70% of lawyers begin their legal research with a general search engine or paid online resource. All of that online research means that lawyers need to be able to securely access and convert multiple different file types. While many legal documents can be found in various online databases, they often exist in poorly scanned formats that are difficult to read or otherwise manipulate. In order to manage these documents effectively, firms need LegalTech applications with imaging and conversion tools that can perform image cleanup and then convert files into formats that are easier to work with.

Virtual Document Review

Whether they’re negotiating contracts or reviewing information as part of discovery, lawyers need to be able to annotate and redact documents without creating confusion over which edits are the most up-to-date. Version control has long been a challenge for the industry, whether it was multiple people working from different printed copies of a document or everyone having their own copy downloaded to a separate device. It’s no surprise, then, that LegalTech startups specializing in contract review software have had no difficulty finding investors during the pandemic. To meet the growing needs of remote legal firms, these platforms will need to deliver powerful editing and access control features that allow users to collaborate more efficiently.

Innovative Billing Strategies

Although law firms have historically weathered economic downturns better than the rest of the economy, the unique nature of the COVID-19 pandemic hit the industry hard in the first half of 2020. According to data gathered by Clio, billing and case volumes plunged in March and April before starting a slow recovery in May. That recovery has been uneven, however, punctuated by a few sharp declines even as overall caseloads return to baseline levels. Firms frequently responded by laying off staff, with 20% of firms having done so or expecting to as recently as July.

The pandemic has forced many firms to implement timekeeping and billing software to help improve efficiency and deliver more value-based services to their clients. Traditional billable hour approaches tended to discourage efficiency, so shifting to a more flexible and transparent system driven by digital tools can help provide firms with the flexibility they need to meet client needs under adverse conditions. Automating billing also allows legal teams to focus more on acquiring new clients and retaining existing clients.

More Changes Coming in 2021

Several legal industry trends from 2020 are expected to continue, or even accelerate, in 2021. Here are just a few areas that will likely remain key priorities for LegalTech developers seeking to meet the industry’s needs.

  • Improving the Client Experience: With so much of the attorney-client relationship going remote, legal firms will need to continue investing in tools that allow them to communicate and interact with their customers more easily.
  • More Cloud Adoption: Legal firms have been slow to adopt cloud-based LegalTech applications, but the pandemic has demonstrated the value of being able to access essential data and tools from anywhere at any time.
  • Organizational Innovation: As LegalTech becomes more essential, law firms will likely continue to rethink their organizational structure by adding non-legal staff to drive digital transformation.

Unlock Your LegalTech Potential with Accusoft

Developing robust LegalTech platforms that help firms overcome the challenges of the remote workplace is a major undertaking. Accusoft’s collection of content processing and conversion solutions allows development teams to easily integrate the collaboration and information-sharing tools lawyers require into their applications. Whether you’re incorporating our REST APIs or powerful SDKs, we provide the functionality your software needs so your team can focus on the innovative features that will set you apart in the crowded LegalTech market in 2021 and beyond.

To learn more about how our content solutions can enhance your legal applications, talk to one of our integration experts today.

In today’s world, brands must focus on their consumers’ experience in order to retain business. While some companies hire market researchers to go out and collect data, Passenger recognized the need for a simpler, more cost-efficient way to gain insight.

The company created Fuel Cycle, an online community platform to help the world’s leading brands engage with current and future customers and stakeholders. The platform allows users to comment on content to give the company better insight on consumers’ needs and desires.

Between 2013 and 2014, Fuel Cycle was in transition. The platform needed a document solution that would enable users to upload and share files within their private communities. They knew their customers would benefit from viewing the documents directly in the platform, since each private community had between five and 50,000 members that needed to be able to give feedback on a variety of different files.

Fuel Cycle Challenges

The Chief Product Officer, VP of Engineering, and Lead Architect set out to find an overarching solution that would meet the needs of their vast community base.

After doing some research into several solutions, the company decided on Accusoft’s PrizmDoc cloud-hosted file viewer due to its file versatility and overall scalability.

“We looked at some self-hosted Java libraries, and also hosting through Google Docs. Nothing was quite right. We went with PrizmDoc’s cloud solution. That way, we don’t have to worry about software updates,” confirmed Kevin Owens, Chief Product Officer.

Accusoft’s Solutions

Within the Fuel Cycle community, brands receive feedback on specific content. For example, brand moderators will upload files in the form of PDFs or Word documents. Community members will comment on these files to share their opinions on the product, advertisement, or promotion. “We use PrizmDoc to actually display all the documents in the system. It works on mobile as well,” says Owens.

Fuel Cycle’s biggest priorities are global scalability and added customer support. With a wide range of clients, a variety of files are shared within the community. PrizmDoc helps the platform reach these clients by providing a document viewer with a wide range of file format functionality.

The Community Grows With PrizmDoc

In the past year, Fuel Cycle used PrizmDoc to enhance their document sharing capabilities, which boosted their customer base. The company relies on PrizmDoc to provide Fortune 500 clients like Mastercard, Hertz, and AIG with the solutions that enable the community to flourish.

Now that Fuel Cycle is using PrizmDoc, the company has grown exponentially, since the document viewer was the solution to many customer requests. The company finds peace of mind in the lack of worry and development time that came with their investment in PrizmDoc. Currently, the company is operating at a Business Elite annual level of 6,000 document uploads per month.

To learn more, download the full case study.

Roderick McMullen, Accusoft Software Engineer III

Earlier this year, I was tasked with preparing a JIRA epic and issues to improve continuous integration for ImageGear under our existing build infrastructure. Only after creating the epic’s story and several dozen issues did I notice a mistake in the boilerplate description used to create each issue. I spent the next hour using the JIRA web interface to add the missing text to each story.

Per JIRA documentation, bulk change of JIRA story descriptions is not available. However, JIRA’s REST API can be leveraged to script description updates to one or more issues, provided that authentication with the JIRA server succeeds. The approach is straightforward: for each JIRA issue specified, retrieve its current description, apply a Python regular expression search and replace, and update the issue’s description on the server.

Following the JIRA REST API tutorials, I prepared a Python 2.7 script that accepts as command line options:

  • login name and password.
  • a comma-separated list of JIRA issue keys.
  • a regular expression pattern to match.
  • replacement text.

After parsing the command line arguments with the Python argparse module, the script uses Python urllib2 to obtain the JIRA session cookie with an HTTP POST request, containing a username and password, packed into a JSON message, to endpoint rest/auth/1/session. The request should contain a header to identify “content-type” as “application/json”. When successful, the HTTP POST response contains the JIRA session cookie, packed in a JSON message, that must accompany subsequent server requests. Unpack the JSON response into a Python dict using the Python json.loads() function and retrieve the dict path “session/value” to identify the JIRA session cookie.

def reauthenticate(username, password, server='https://localhost:8090/jira'):
   url = u'{0}/rest/auth/1/session'.format(server)
   data = {'username':username, 'password': password}
   r = urllib2.Request(url,json.dumps(data))
   r.add_header('Content-Type', 'application/json')
   f = urllib2.urlopen(r)
   try:
       return json.loads((f.read()))['session']['value']
   finally:
       f.close()
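
The function above targets Python 2.7, where urllib2 is available. For readers on Python 3, the same login request could be sketched with urllib.request. This is only an illustrative sketch and not part of the original script: the helper names build_login_request and extract_session_cookie are hypothetical, the server URL and credentials are placeholders, and the sample response body is made up to mirror the “session/value” path described above.

```python
import json
import urllib.request

def build_login_request(username, password, server='https://localhost:8090/jira'):
    """Build (but do not send) the POST request for rest/auth/1/session."""
    url = '{0}/rest/auth/1/session'.format(server)
    body = json.dumps({'username': username, 'password': password})
    # Python 3 requires the request body to be bytes.
    return urllib.request.Request(
        url,
        data=body.encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )

def extract_session_cookie(response_body):
    """Return the value at dict path session/value of the JSON response."""
    return json.loads(response_body)['session']['value']

# Hypothetical response body, used here only to exercise the parser.
sample = '{"session": {"name": "JSESSIONID", "value": "DEADBEEF"}}'
request = build_login_request('jdoe', 'secret')
cookie = extract_session_cookie(sample)
```

Sending the request would then be a matter of passing it to urllib.request.urlopen() and handing the response body to extract_session_cookie().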

To get an issue’s current description, send an HTTP GET request to endpoint rest/api/2/issue/{issuekey}, where {issuekey} identifies the JIRA issue to retrieve. When successful, the HTTP GET response contains the JIRA issue’s property values. Unpack the JSON response into a Python dict using the json.loads() function and retrieve the dict path “fields/description” to identify the issue description.

def get_jira_issue(jsessionid, issuekey, server='https://localhost:8090/jira'):
   url = u'{0}/rest/api/2/issue/{1}'.format(server, urllib2.quote(issuekey))
   opener = urllib2.build_opener()
   opener.addheaders.append((u'Cookie', u'JSESSIONID={0}'.format(jsessionid)))
   f = opener.open(url)
   try:
       return json.loads((f.read()))
   finally:
       f.close()

To set an issue’s details, send an HTTP PUT request to endpoint rest/api/2/issue/{issuekey}, where {issuekey} identifies the JIRA issue to edit. The request data is a JSON message that indicates new property values. Create a Python dict with dict path “fields/description” value equal to the new description. Pack the dict into a JSON message using the json.dumps() function.

def set_jira_issue_description(jsessionid, issuekey, description,
                              server='https://localhost:8090/jira'):
   url = u'{0}/rest/api/2/issue/{1}'.format(server, urllib2.quote(issuekey))
   data = {u'fields':{u'description':description}}
   r = urllib2.Request(url,json.dumps(data))
   r.add_header('Content-Type', 'application/json')
   r.add_header('Cookie', 'JSESSIONID={0}'.format(jsessionid))
   r.get_method = lambda:'PUT'
   f = urllib2.urlopen(r)
   try:
       return
   finally:
       f.close()
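
The get_method override above is a common urllib2 workaround, since urllib2.Request has no argument for naming the HTTP verb. On Python 3.3 and later, urllib.request.Request accepts a method argument, so a port of this function would not need the lambda. The following is a hedged sketch only: build_description_update is a hypothetical helper that constructs the request without sending it, and the session ID and issue key shown are dummy values.

```python
import json
import urllib.parse
import urllib.request

def build_description_update(jsessionid, issuekey, description,
                             server='https://localhost:8090/jira'):
    """Build (but do not send) the PUT request that edits an issue description."""
    url = '{0}/rest/api/2/issue/{1}'.format(server, urllib.parse.quote(issuekey))
    payload = {'fields': {'description': description}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode('utf-8'),  # body must be bytes on Python 3
        headers={
            'Content-Type': 'application/json',
            'Cookie': 'JSESSIONID={0}'.format(jsessionid),
        },
        method='PUT',  # takes the place of the get_method override
    )

put_request = build_description_update('DEADBEEF', 'ABC-001', 'New text')
```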

I favor the Python re module re.sub() function for text substitutions. For simplicity, the script accepts a single find-replace pair. Prior to each issue update, a difference report is generated with the Python difflib module and printed to stdout. An interactive version of this script could block at this stage, prompting the operator to accept or skip the update after review.

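def preview_update(issuekey, before, after):
   # The inputs come from splitlines() and carry no trailing newlines, so
   # lineterm is empty and the diff lines are joined with '\n' for printing.
   d = difflib.unified_diff(before.splitlines(), after.splitlines(),
       u'{0} Description (original)'.format(issuekey),
       u'{0} Description (modified)'.format(issuekey), n=2, lineterm=u'')
   sys.stdout.write(u'\n'.join(list(d))+u'\n')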
 
def preview_and_accept_update(issuekey, before, after):
   preview_update(issuekey, before, after)
   return True
 
def preview_and_reject_update(issuekey, before, after):
   preview_update(issuekey, before, after)
   return False
 
def bulk_update_description(jsessionid, issuekeys, regex, repl,
                           server='https://localhost:8090/jira',
                           confirmUpdateProc=preview_and_accept_update):
   for issuekey in issuekeys:
       issuekey = issuekey.strip()
       before = get_jira_issue(jsessionid,issuekey,server)['fields']['description']
       after = regex.sub(repl, before)
       if confirmUpdateProc(issuekey, before, after):
           set_jira_issue_description(jsessionid,issuekey,after,server)
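
The interactive variant mentioned above could be sketched as another callback with the same (issuekey, before, after) signature. This is an assumption about how such a prompt might work, not code from the original script: the name preview_and_prompt_update and its ask parameter are hypothetical, and the sketch is written for Python 3 (on Python 2.7, raw_input would take the place of input).

```python
import difflib
import re
import sys

def preview_and_prompt_update(issuekey, before, after, ask=input):
    """Print a unified diff, then ask the operator whether to apply it."""
    if before == after:
        return False  # nothing matched, so there is nothing to update
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        '{0} Description (original)'.format(issuekey),
        '{0} Description (modified)'.format(issuekey),
        n=2, lineterm='')
    sys.stdout.write('\n'.join(diff) + '\n')
    answer = ask('Update {0}? [y/N] '.format(issuekey))
    return answer.strip().lower() in ('y', 'yes')

# Example with a stubbed prompt, so the sketch runs without a terminal.
before = 'Write a blog post.\nPublish it.'
after = re.sub(r'a blog post', 'an article', before)
accepted = preview_and_prompt_update('ABC-001', before, after, ask=lambda _: 'y')
skipped = preview_and_prompt_update('ABC-001', before, after, ask=lambda _: 'n')
```

Passing such a function as confirmUpdateProc to bulk_update_description would pause before each update and skip any issue the operator declines.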

The final Python 2.7 script is included below.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
COPYRIGHT   = 'Copyright Accusoft Corporation'
PROG        = 'jira-find-replace.py'
VERSION     = '1.0.0'
DESCRIPTION = """For each JIRA issue specified, retrieve its description,
apply a regular expression substitution, then update its description on the 
server.

Example: Replace occurrences of "a blog post" with "an article" in JIRA issue ABC-001:
    python jira-find-replace.py -n https://jira.organization.com -i "ABC-001" -f "a blog post" -r "an article" -u jdoe
    python jira-find-replace.py -n https://jira.organization.com -i "ABC-001" -f "a blog post" -r "an article" -u jdoe -p password
    python jira-find-replace.py -n https://jira.organization.com -i "ABC-001" -f "a blog post" -r "an article" -j DEADBEEFDEADBEEFDEADBEEFDEADBEEF --preview-only

"""

import argparse
import codecs
import sys
import urllib2
import json
import getpass
import re
import difflib

DEFAULT_JIRA_SERVER = u'https://localhost:8090/jira'

def reauthenticate(username, password, server=DEFAULT_JIRA_SERVER):
    """ Create a new session and return the session cookie.
    
    Args:
        username (str): The JIRA username for authentication.
        password (str): The JIRA password for authentication.
        server (str, optional): The URL to the JIRA server. Default value
            is DEFAULT_JIRA_SERVER.
    
    Returns:
        Str: Returns the JSESSIONID cookie retrieved from the server response.
    
    Raises:
        HTTPError. Raises HTTPError for authentication errors.
    
    """
    url = u'{0}/rest/auth/1/session'.format(server)
    data = {'username':username, 'password': password}
    r = urllib2.Request(url,json.dumps(data))
    r.add_header('Content-Type', 'application/json')
    f = urllib2.urlopen(r)
    try:
        return json.loads((f.read()))['session']['value']
    finally:
        f.close()

def get_jira_issue(jsessionid, issuekey, server=DEFAULT_JIRA_SERVER):
    """Get details for a JIRA issue.
        
    Args:
        jsessionid (str): The JIRA session ID cookie assigned after
            authenticating with the JIRA server.
        issuekey (str): The JIRA issue to return. e.g. XYZ-200.
        server (str, optional): The URL to the JIRA server. Default value
            is DEFAULT_JIRA_SERVER.
    
    Returns:
        Dict: JIRA issuekey details.
    
    Raises:
        HTTPError. Raises HTTPError for communication errors.
    
    """
    url = u'{0}/rest/api/2/issue/{1}'.format(server, urllib2.quote(issuekey))
    opener = urllib2.build_opener()
    opener.addheaders.append((u'Cookie', u'JSESSIONID={0}'.format(jsessionid)))
    f = opener.open(url)
    try:
        return json.loads((f.read()))
    finally:
        f.close()

def set_jira_issue_description(jsessionid, issuekey, description, 
                               server=DEFAULT_JIRA_SERVER):
    """Set description for a JIRA issue.
        
    Args:
        jsessionid (str): The JIRA session ID cookie assigned after logging 
            into the JIRA server.
        issuekey (str): The JIRA issue to update. e.g. XYZ-200.
        description (str): The new description.
        server (str, optional): The URL to the JIRA server. Default value
            is DEFAULT_JIRA_SERVER.
    
    Returns:
        None
    
    Raises:
        HTTPError. Raises HTTPError for communication errors.
    
    """
    url = u'{0}/rest/api/2/issue/{1}'.format(server, urllib2.quote(issuekey))
    data = {u'fields':{u'description':description}}
    r = urllib2.Request(url,json.dumps(data))
    r.add_header('Content-Type', 'application/json')
    r.add_header('Cookie', 'JSESSIONID={0}'.format(jsessionid))
    r.get_method = lambda:'PUT' 
    f = urllib2.urlopen(r)
    try:
        return
    finally:
        f.close()

def preview_update(issuekey, before, after):
    """ Prints a diff report for the description change.
    
    Args:
        issuekey (str): The JIRA issue to return. e.g. XYZ-200.
        before (str): The issue description before substitutions are applied.
        after (str): The issue description after substitutions are applied.
    
    Returns:
        None
    
    """
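    # The inputs come from splitlines() and carry no trailing newlines, so
    # lineterm is empty and the diff lines are joined with '\n' for printing.
    d = difflib.unified_diff(before.splitlines(), after.splitlines(),
        u'{0} Description (original)'.format(issuekey),
        u'{0} Description (modified)'.format(issuekey), n=2, lineterm=u'')
    sys.stdout.write(u'\n'.join(list(d))+u'\n')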

def preview_and_accept_update(issuekey, before, after):
    """ Prints a diff report and accepts the update.
    
    Args:
        issuekey (str): The JIRA issue to update. e.g. XYZ-200.
        before (str): The issue description before substitutions are applied.
        after (str): The issue description after substitutions are applied.
    
    Returns:
        bool: Always True, so the update is applied.
    
    """
    preview_update(issuekey, before, after)
    return True

def preview_and_reject_update(issuekey, before, after):
    """ Prints a diff report and rejects the update.
    
    Args:
        issuekey (str): The JIRA issue to update. e.g. XYZ-200.
        before (str): The issue description before substitutions are applied.
        after (str): The issue description after substitutions are applied.
    
    Returns:
        bool: Always False, so the update is skipped.
    
    """
    preview_update(issuekey, before, after)
    return False

def bulk_update_description(jsessionid, issuekeys, regex, repl, 
                            server=DEFAULT_JIRA_SERVER, 
                            confirmUpdateProc=preview_and_accept_update):
    """ For each JIRA issue specified, retrieve its description, apply a
    regular expression substitution, then update its description on the server.
    
    Args:
        jsessionid (str): The JIRA session ID cookie assigned after logging 
            into the JIRA server.
        issuekeys (list): List of JIRA issues to modify. e.g. [XYZ-200, XYZ-201].
        regex (re.regex): A precompiled regular expression object to perform
            substitutions.
        repl (str): The replacement text.
        server (str, optional): The URL to the JIRA server. Default value
            is DEFAULT_JIRA_SERVER.
        confirmUpdateProc (function, optional): Callback function that
            returns True to accept update; returns False to skip update.
            Default value is preview_and_accept_update.
    
    Returns:
        None
    
    Raises:
        HTTPError. Raises HTTPError for communication errors.
    
    """
    for issuekey in issuekeys:
        issuekey = issuekey.strip()
        before = get_jira_issue(jsessionid,issuekey,server)['fields']['description']
        after = regex.sub(repl, before)
        if confirmUpdateProc(issuekey, before, after):
            set_jira_issue_description(jsessionid,issuekey,after,server)

if __name__ == u'__main__':

    # Parse arguments.
    parser = argparse.ArgumentParser(description=DESCRIPTION, prog=PROG, conflict_handler=u'resolve', formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument(u'--version', action='version', version=u'%(PROG)s %(VERSION)s, %(COPYRIGHT)s'%{u'PROG':PROG, u'VERSION':VERSION, u'COPYRIGHT':COPYRIGHT})
    parser.add_argument(u'-n', u'--server', dest='jiraBaseUrl', default=DEFAULT_JIRA_SERVER, help=u'The JIRA server base URL.\nDefault value is {0}.'.format(DEFAULT_JIRA_SERVER))
    parser.add_argument(u'-j', u'--jsessionid', dest='jiraSessionId', default=None, help=u'The JIRA session ID cookie assigned after a successful login.')
    parser.add_argument(u'-u', u'--user', dest='jiraUser', default=None, help=u'The JIRA login username.\nDefault is None.')
    parser.add_argument(u'-p', u'--password', dest='jiraPassword', default=None, help=u'The JIRA login password.\nDefault is None.')
    parser.add_argument(u'-i', u'--issuekeys', dest='jiraIssueKeys', default=None, help=u'A comma-separated list of JIRA issues to modify. e.g. "ABC-001, ABC-002"', required=True)
    parser.add_argument(u'-f', u'--find', dest='find', default='', help='A regular expression pattern to match.', required=True)
    parser.add_argument(u'-r', u'--replace', dest='replace', default='', help='The substitution text to replace regular expression matches.', required=True)
    parser.add_argument(u'-d', u'--preview-only', action='store_true', dest='previewOnly', default=False, help='Preview modifications, but do not commit.\nDefault is False.')
    args = parser.parse_args()
    
    if args.jiraSessionId is None:
        if args.jiraUser is not None:
            if args.jiraPassword is None:
                args.jiraPassword = getpass.getpass('password: ')
            args.jiraSessionId = reauthenticate(username=args.jiraUser,
                                                password=args.jiraPassword,
                                                server=args.jiraBaseUrl)
    
    # In preview-only mode the callback prints the diff and rejects the update.
    if args.previewOnly:
        confirm = preview_and_reject_update
    else:
        confirm = preview_and_accept_update
    
    bulk_update_description(jsessionid=args.jiraSessionId,
                            issuekeys=args.jiraIssueKeys.split(','),
                            regex=re.compile(args.find),
                            repl=args.replace,
                            server=args.jiraBaseUrl,
                            confirmUpdateProc=confirm)
    
    sys.exit(0)
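
The preview step can be tried without a JIRA server. The following is a minimal sketch, not part of the original script: `render_preview` is a hypothetical helper that mirrors `preview_update` but returns the diff as a string, and the issue key and description are made-up sample data.

```python
import difflib
import re


def render_preview(issuekey, before, after):
    """Return a unified diff of an issue description as a single string."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        '{0} Description (original)'.format(issuekey),
        '{0} Description (modified)'.format(issuekey),
        n=2, lineterm='')
    return '\n'.join(diff)


# Made-up sample data standing in for a description fetched from JIRA.
before = 'Build 1.2.3 fails on startup.\nSee log for details.'
after = re.compile(r'\d+\.\d+\.\d+').sub('1.2.4', before)

print(render_preview('XYZ-200', before, after))
```

Running the substitution locally like this is a cheap way to check a pattern before pointing the script at real issues with `--preview-only`.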

Roderick McMullen is a Software Engineer III in the SDK division. He joined the company in 2004 as a Software Engineer in Support. Roderick graduated with a Bachelor of Science in Computer Engineering from the University of Florida.

 

image processing SDK

Image processing is now a priority across industry lines. From legal firms to financial institutions to health organizations, the ability to capture, convert, and combine documents on demand often makes the difference between hitting project deadlines and falling behind.

As image formats outpace the ability of legacy solutions to manage and manipulate them, however, a new challenge emerges. Companies need conversion, document management, and image cleanup software capable of handling multiple file types, but are they better served building their own systems or buying software solutions to help them bridge the gap? Let’s go head-to-head and see which potential processing option comes out on top. 


Round One: Targeting Consistency

Ask companies why they prefer to build their own software solutions and the answer is invariably the same: control. The work of creating new functionality from scratch is often paired with the notion of end-to-end control; since in-house developers built the image processing program, they’re equipped to handle any emerging security or performance challenges.

The problem? In a world where robust digital solutions are the expectation rather than the exception, speed and consistency are the image-processing benchmarks. Staff need to know that when they go looking for image conversion and document management options, they’ll always find exactly what they’re looking for — and it will always perform as expected. 

In-house options that require regular maintenance and security updates can’t match this level of accessibility; ensuring optimal performance demands regular downtime both to implement planned updates and to deal with potential problems as they occur. Fully supported, purpose-built processing solutions, meanwhile, deliver consistent results and common functionality on demand.

 


Round Two: Talking Conversion

The biggest benefit of image processing software? Conversion: the ability to take in documents and easily modify their format, adjust properties, or make essential changes. Here, building your own image processing engine comes with the benefit of specificity. If you’re dealing primarily with PDF files, create a small-scale PDF library capable of handling PDFs and turn it loose across internal networks.

Here’s where things get tricky. While introducing a new, purpose-built application solves one problem, it also creates another: app overload. As noted by recent workplace research, almost 70 percent of workers already lose up to 60 minutes per day navigating between different software solutions. Adding a new in-house tool lets them avoid searching online for a functional best-fit but also adds another app to their list and increases their total time wasted. On the developer side, building comes with the ongoing time and resource commitments necessary to create and support multiple imaging libraries — and keep up with the ongoing evolution of new image file formats.

Image processing software development kits (SDKs), meanwhile, come with conversion abilities across a host of file types. Even better? These tools integrate with existing solutions, meaning your team gets the advantage of easy image conversion without the added complication of constantly switching apps.

 


Round Three: Taking the Shortcut

There’s an understandable pride that comes with building apps from the ground up. In many respects, buying a software engine seems like taking a shortcut. But here’s the thing: shortcuts are faster. Even if you were designing an app from scratch, your developers would search popular code repositories to avoid repeating work someone else has already done. After all, if a great image processing tool already exists, why build another? 

Image processing SDKs simply scale up the scope of common code usage to streamline your document management, conversion, and image cleanup processes. As noted by DZone, there’s also a case here for compatibility; by laying customizable software engines on top of existing applications, you ensure that desktop, mobile, and even remote users all have access to the same functionality.

Building your own image processing program is entirely possible if you like heavy lifting, enjoy total control, and hate taking shortcuts. However, buying a full-featured engine capable of handling multiple file types across any enterprise endpoints is the ideal approach if you’re looking for ease of integration, consistent compliance outcomes, and company-wide compatibility. Learn more about ImageGear and all of its capabilities here.

Gerry Hernandez, Accusoft Senior Software Engineer

This is a continuation of our series of blog posts that share our experience with functional test automation in a real-world microservice product base. In part two, we will share our philosophical approach to SURGE: Simulate User Requirements Good-Enough. Be sure to read part one before getting started.

 

The SURGE Methodology

Much thought went into deliberating why we think we ran into the problems discussed in the first part of this series. Immediately, we stopped ourselves and realized that we needed to stop relying on theory and jump straight into practice. After all, on paper, Cucumber sounded like a silver bullet until we tested it with our products. So here’s where we landed.

 

Prototype Everything

Every single design decision in the SURGE methodology, and in turn, our Node implementation of our framework, was prototyped and tested in real-world scenarios with real-world code. We know that not all code is perfect; technical debt exists everywhere. SURGE works well with theoretically optimal (i.e. fictional) codebases, but it also has zero problems with dirty applications that were a result of not enough coffee on a Monday morning. This is the reality we face as software engineers and QA analysts alike, so we feel the methodology should be centered around imperfect situations.

With this philosophy, we found both Node and Python to be very suitable languages, as each one is a borderline RAD tool, if you compare it to broader languages such as C++ and Java. But to be extremely clear, SURGE is just a set of patterns and practices; any set of technologies may be suitable for implementation. The cloud services team ended up picking Node because it was quick, easy, and fun.

 

Behavior is Contextual

Humans are good at communicating because we’re social beings. We can give each other simple instructions and follow the spirit of the words, as opposed to the literal meaning. Computers are utter morons when it comes to natural language, so let’s not try to make them something they’re not. Well, at least not for our functional test suites!

So we thought about this. The reason a person can understand the two different example Gherkin features given in part one is that they understand that each one has its own context, its own meaning, and its own vocabulary. This is very important when you have a product with a wide range of capabilities, from reading barcodes to document storage and workflows. For instance, the word “scanning” has two completely different meanings when discussing a barcode versus, for example, a sheet of paper. We want to maintain this philosophy, and in turn, we urge our developers to write natural scenarios that make sense to a human, as opposed to making sense to a Gherkin parsing engine.

What we end up with is Gherkin being coupled directly to step functions, as opposed to magically matched by a parsing engine. This means that Gherkin statements can be repeated independently in separate areas of functionality of the test suite without ever colliding. We believe this to be the most critical difference between traditional BDD and SURGE.
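
To make the idea of coupling statements to step functions concrete, here is a minimal sketch in Python. It is not the Accusoft framework (which is implemented in Node); the `Feature` class, the feature names, and the step text are all hypothetical. It only shows how the same Gherkin statement can live in two features without colliding when each feature owns its own steps.

```python
class Feature:
    """Holds the step functions that belong to one feature and nothing else."""

    def __init__(self, name):
        self.name = name
        self._steps = {}

    def step(self, text):
        """Register a function for an exact Gherkin statement in this feature."""
        def register(func):
            self._steps[text] = func
            return func
        return register

    def run(self, statements):
        """Run the statements in order, sharing one context dict."""
        context = {}
        for text in statements:
            self._steps[text](context)
        return context


barcodes = Feature('Barcode reading')
documents = Feature('Document capture')


@barcodes.step('When I scan the image')
def scan_barcode(context):
    context['result'] = 'decoded barcode value'


@documents.step('When I scan the image')
def scan_page(context):
    context['result'] = 'captured page image'


print(barcodes.run(['When I scan the image'])['result'])
print(documents.run(['When I scan the image'])['result'])
```

Because the lookup is scoped to a feature, there is no global pattern matching step that has to decide which "scan" was meant.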

 

Tests Are Inherently Stateful

When making a peanut butter and jelly sandwich, you would put the knife in the jelly after removing the lid. This implies that you are already aware that the lid has been removed. Not only that, but you must also be aware that you specifically removed the lid to the jar of jelly, rather than the peanut butter. Otherwise, you may end up with glass shards on your PB&J, which is not desirable. These same implications are shared with functional tests.

Functional tests, whether they’re for acceptance or regression, follow a set of ordered steps. Each step either mutates the state of the test or verifies the state against an expectation. Through experimentation, we discovered that traditional BDD test execution makes it very hard to comprehend state, since every step is global and may be invoked in any arbitrary order, from any arbitrary scenario, from any arbitrary feature. This is what leads to the cyclomatic complexity issue described earlier in part one.

With this in mind, we wanted SURGE to promote a very simple, lightweight way of maintaining state within a functional test context that is isolated from shared code. That is, no shared code should ever depend on or mutate state directly.
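
One way to picture this separation, as a rough sketch with hypothetical names rather than our actual framework code: the step functions own a per-test context, while the shared helpers receive everything they need as arguments and never touch that context.

```python
def upload_document(server, name):
    """Shared, stateless helper: everything it needs arrives as arguments."""
    return {'server': server, 'document_id': 'doc-' + name}


def count_pages(document):
    """Shared, stateless helper standing in for a real product client call."""
    return 3 if document['document_id'] == 'doc-report' else 0


# Step functions own the state. Each one mutates or verifies the context.
def given_an_uploaded_report(context):
    context['document'] = upload_document(context['server'], 'report')


def then_the_report_has_three_pages(context):
    assert count_pages(context['document']) == 3


# The server address is a placeholder; nothing is actually contacted.
context = {'server': 'http://localhost:8080'}
for step in (given_an_uploaded_report, then_the_report_has_three_pages):
    step(context)
print(context['document']['document_id'])
```

The helpers could be lifted out of the test suite unchanged, since none of them knows a context exists.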

 

Reuse Only the Code That Matters

We find no value in reusing Gherkin statements, and therefore, we find no value in reusing step definitions. I understand how that might sound counter-productive, but suspend your disbelief for a moment.

One anti-pattern we immediately noticed while prototyping with the traditional BDD frameworks was that our code started to reflect the limitations of the frameworks, as opposed to reflecting good, sound software engineering best practices. It doesn’t make sense to treat a codebase of functional tests any differently than you would a production codebase. Clean-code practices that promote maintainability and general quality have been established and proven for decades; why not use them?

So our best practice is to write a series of client libraries for our own products. These client libraries are stateless and are reused throughout the entire test suite. If two independent features need to perform some common actions, they each would implement a step function that uses the shared library code.

The beauty of this pattern is that if we were to completely delete all of the testing-specific code (i.e. the feature files and step definitions), we would still have a fully functioning codebase that is properly factored and follows all our standards. This is good, and this is simple.

The elephant in the room is that combined with coupling Gherkin feature files to step definitions, factoring all shared code into stateless libraries means that a step function must be mapped to it for each feature, so there is some code repetition. While this is true, and many theorists would say not to repeat yourself, we feel that it’s intentional and meaningful repetition. Ignore the fact that the same text exists in multiple parts of the code for a moment and realize that the location of each step definition function is the unique part. Again, since each feature is considered its own context, just like how a human considers a new conversation a separate context from another, it can be said that each step definition is unique due to where it resides, not necessarily the text that defines it.

But we did say that we want to be practical. There are situations where reusing step functions makes a lot of sense, so we do allow for programmatic inclusions of steps from any file. It must be done deliberately; we intentionally do not want a framework to do it automatically. This is most definitely the exception to the rule, but it is certainly reasonable, so we allow it.
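
A deliberate inclusion might look something like the following sketch. The dictionaries and step names are hypothetical and stand in for whatever mechanism a real implementation uses; the point is only that the reuse is an explicit line of code, not something a framework infers.

```python
# Each feature keeps its own mapping of Gherkin text to step function.
upload_steps = {}
barcode_steps = {}


def given_i_am_signed_in(context):
    context['user'] = 'test-user'


def when_i_read_the_barcode(context):
    context['value'] = 'barcode read by ' + context['user']


upload_steps['Given I am signed in'] = given_i_am_signed_in
barcode_steps['When I read the barcode'] = when_i_read_the_barcode

# Deliberate inclusion: the barcode feature opts in to one step by name.
# Nothing is matched automatically across features.
barcode_steps['Given I am signed in'] = upload_steps['Given I am signed in']

context = {}
for text in ('Given I am signed in', 'When I read the barcode'):
    barcode_steps[text](context)
print(context['value'])
```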

Above all, stop worrying about repeating minor boilerplate code. It simply does not matter. Move on and be productive. Get your work done and be happy.

 

To Be Continued…

Coming up next, we’ll talk about how we actually implemented SURGE as a framework, as well as our observed results. Spoiler alert: we became outrageously productive.

Until then, if this stuff is exciting to you, or even if you think we’re completely wrong and know you can kick it to the next level, we’d love to hear from you.

Happy coding! 🙂


Gerry Hernandez began his career as a researcher in various fields of digital image processing and computer vision, working on projects with NIST, NASA JPL, NSF, and Moffitt Cancer Center. He grew to love enterprise software engineering at JP Morgan, leading to his current technical interests in continuous integration and deployment, software quality automation, large scale refactoring, and tooling. He has oddball hobbies, such as his fully autonomous home theater system, and even Christmas lights powered by microservices.

Electronic spreadsheets have been a mainstay of business operations since their introduction four decades ago, but the way organizations use them has changed significantly during that time. Today, the financial industry needs FinTech accounting software that facilitates online spreadsheet collaboration without creating unnecessary risk or disrupting workflows. 

Spreadsheets in the Tax and Accounting Industry

Although many tax and accounting firms use dedicated software solutions to manage complex financial workflows, they still rely on conventional spreadsheets for a variety of tasks. In fact, a recent study by Deloitte found that 62% of companies are still relying heavily upon spreadsheets for business insights. The data used to inform risk analysis, growth projections, and financial modeling is often collected and sorted in individual spreadsheet files by individual employees. In many instances, that data will eventually be transferred into a more sophisticated accounting platform, either through manual entry or an API integration.

Spreadsheets also play a critical role when it comes to presenting complex financial data. Whether it’s for an internal presentation to key stakeholders within the organization or a customer-facing report designed to relay important information about their business, tax and accounting firms routinely need to create, edit, view, and share spreadsheets. 

Although Google Sheets has gained quite a bit of traction over the last few years, Microsoft Excel remains the preferred spreadsheet solution for most financial industry professionals. Practically every CRM and CMS platform allows users to easily export data into Excel’s XLSX file format for convenient viewing, making it the de facto standard for most companies. Online spreadsheet collaboration is also easier than ever before thanks to public cloud tools like Office 365.

5 Major Spreadsheet Collaboration Challenges

Unfortunately, all of that ubiquity and convenience comes with a few drawbacks. There are also some inherent shortcomings with Excel spreadsheets that pose significant challenges to tax and accounting firms in particular.

1. Version Control

One of the great benefits of spreadsheets is their ability to track data over time, with new information constantly being fed into the spreadsheet formula to generate different results. Unfortunately, that typically means that the document could potentially be outdated the moment it’s copied, shared, or downloaded because a more current version might exist elsewhere. While cloud-based software like Google Sheets or Office 365 theoretically ensures that everyone is viewing and referencing the same document, if there are too many people making changes, errors can easily escape notice and break entire spreadsheet formulas (or possibly corrupt the file). Even then, people may clone their own version to work on independently, which creates the same version control challenge posed by Excel-dependent files. 

2. Security

Familiarity has a way of breeding complacency. That’s certainly true when it comes to sharing XLSX files. People are accustomed to sending and receiving spreadsheets over email and other messaging platforms. What they may not realize, however, is that 38% of malicious email attachments disguise themselves as Microsoft Office file types. The last thing a tax or accounting firm wants is for an employee to accidentally infect their network with harmful malware by opening what they thought was a spreadsheet. At the same time, even conventional spreadsheet collaboration can pose a serious security risk. Excel files offer limited security controls, and downloaded or shared files could be easily hacked to compromise important financial data. With more people working remotely in response to the COVID-19 pandemic, FinTech accounting software needs to account for the common security risks posed by home offices while still meeting consumer demands for high-speed, low-friction digital solutions in 2020 and beyond.

3. Asset Protection

Spreadsheets often contain more than just important financial data. The spreadsheet formulas buried within the many rows and columns of cells may represent important intellectual property for a tax or accounting firm. Any time a company shares a spreadsheet, it runs the risk of those formulas being stolen and distributed. Even if these proprietary assets remain safely tucked away within the spreadsheet, there’s still the matter of anyone with a copy of the file being able to use it however they want, potentially cutting into the firm’s business.

4. Workflow Efficiency

Managing a large number of independent XLSX files can quickly become burdensome for any organization. Take, for example, a situation where a tax firm’s customers must download a spreadsheet to enter their tax information and then send that file back to the firm so the data can be entered into its FinTech accounting software. Not only does this create numerous opportunities for manual errors, but it also introduces several unnecessary (and potentially risky) steps into the process. What if a file is not attached to an email? Or if someone downloads the spreadsheet, but then misplaces it? How does the tax firm verify that the version sent back to them is the most up-to-date version? This approach to spreadsheet collaboration ends up wasting time and is highly prone to mistakes.

5. Software Dependencies

While Excel may be the most widely used spreadsheet software in the world, that doesn’t mean every organization has access to it. Smaller companies and startups are much more likely to rely upon cloud-based tools like Google Sheets due to their low cost and ease of online spreadsheet collaboration. Although Google’s Chrome browser offers extensions capable of reading, viewing, and editing XLSX files, the conversion process is often imperfect due to differences in feature sets. Transferring data back and forth between Excel and other spreadsheet programs can create formatting problems and potentially break internal formulas. 

The PrizmDoc Cells Solution

One of the best ways for FinTech accounting software developers to address these issues is to simply integrate spreadsheet viewing and editing functionality into their applications. PrizmDoc Cells is a web-based spreadsheet editor that natively supports XLSX files by storing them on a secure server and allowing users to interact with them online through an Excel-like interface. 

Secure Spreadsheet Functionality

PrizmDoc Cells provides essential spreadsheet features within a familiar UI. After opening an XLSX file, users can review and edit cell content within a secure web-based environment. Firms can also restrict features to protect spreadsheets from errors and unauthorized alterations. 

No Microsoft Dependencies

Deployed entirely within a Docker container, PrizmDoc Cells can import, view, edit, and export XLSX files entirely within a firm’s FinTech accounting software or web-based application. No one needs access to a copy of Microsoft Excel to access files.

Manage End-User Access

In addition to hosting their source files securely within a proprietary server or private cloud environment, organizations can control what end-users can access within the spreadsheet. Proprietary data and spreadsheet formulas can be safely hidden from view to protect valuable IP.

Maintain Version Control

As an entirely web-based viewer, PrizmDoc Cells eliminates the need to email, copy, or download spreadsheets, ensuring that the file being viewed is always the most up-to-date version. Editing access can also be adjusted to ensure that only authorized users are able to make changes.

White Label Customization

Developers can easily remove all branding to seamlessly integrate PrizmDoc Cells with their applications and FinTech accounting software.

Say Goodbye to the Old Way of Spreadsheet Collaboration

Today’s tax and accounting firms need to work more efficiently than ever before to keep up with the demands of their clients. They can’t afford to keep relying upon outdated approaches to spreadsheet collaboration. The pressure is on for FinTech developers to build applications capable of accommodating their security, workflow, and version control requirements when it comes to spreadsheets. 

With PrizmDoc Cells, developers can build FinTech accounting software solutions that allow for true online spreadsheet collaboration without compromising the security or control organizations expect from their applications. Experience the functionality of PrizmDoc Cells firsthand by trying a demo today. To get a closer look at how PrizmDoc Cells will operate in your own development environment, sign up for a free trial.

It’s a business battlefield out there. Not one of munitions and machines, but time and resources. Companies are struggling to provide end-users and consumers with the content they need, when they need it, without breaking the bank. Document management now helps companies make progress without losing productivity.

As noted by the SocioHerald, document management solutions are “booming worldwide” and on track for significant growth over the next five years, but as data volumes increase and connectivity allows simple sharing of more complex and media-rich content, large documents pose a new challenge. How do organizations deliver high-volume content quickly and accurately to drive on-demand end-user interaction?

Accusoft’s PrizmDoc Viewer can help deliver peace of mind — and win the large document loading war — with dual-pronged delivery of document pre-conversion and server-side search.

The Need for Speed

As noted by Forbes, one second is now the “magic number” when it comes to loading webpages — any slower and potential consumers begin to abandon ship. Welcome to the future.

Employees are now used to this kind of rapid retrieval when they search for data online, so they bring these same expectations into the office when it comes to document loading and access times. What does this mean in practice? Both user satisfaction and overall productivity suffer when documents don’t load fast enough.

So how do companies get to the finish line faster? Start with document pre-conversion. PrizmDoc Viewer contains a pre-conversion API that allows companies to create viewing packages for large documents using POST requests and JSON formatted source objects. Combined with the PAS layer of PrizmDoc server, this pre-conversion feature allows massive documents — such as Tolstoy’s 1493-page epic War and Peace — to load in just 0.69 seconds.
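
As a rough illustration of what a POST request with a JSON source object can look like, the sketch below builds such a request in Python without sending it. The endpoint path, field names, host, and document identifiers are assumptions made for illustration only; consult the PrizmDoc Viewer documentation for the actual request schema.

```python
import json
import urllib.request


def build_preconversion_request(base_url, file_name, document_id):
    """Build (but do not send) a POST request with a JSON source object.

    The endpoint path and the field names are assumptions for illustration;
    check the PrizmDoc Viewer documentation for the actual schema.
    """
    body = {
        'input': {
            'source': {
                'type': 'document',
                'fileName': file_name,
                'documentId': document_id,
            }
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip('/') + '/v2/viewingPackageCreators',
        data=json.dumps(body).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST')


# Placeholder host and identifiers; nothing is sent over the network here.
request = build_preconversion_request(
    'http://localhost:3000', 'war-and-peace.pdf', 'war-and-peace')
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any large document is actually queued for conversion.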

The caveat? Pre-conversion isn’t enough in isolation. To ensure users find what they’re looking for, and fast, organizations also need the benefit of server-side search.

Search and Rescue

Eighty percent of Americans now experience some type of “tech frustration” every day. Spotty connections and smartphone failures top the list, but documents also make the cut. Client-side searches within large documents can put a strain on a browser-based document viewer’s memory. The best case scenario? Massive load times that frustrate staff efforts. Worst case? Complete viewer crashing as the browser overloads.

There’s a better way. With PrizmDoc Viewer’s server-side search feature, you can offload search work to the server, significantly reducing the strain on client-side viewer code. Using PrizmDoc Viewer’s configuration options, developers can also create custom server-side search parameters to reduce the strain on memory-capped browsers or more easily access text-heavy documents. Put simply? Server-side search can help rescue document retrieval speeds and reduce user frustration.

Document Detente

Slow-loading, large documents can ramp up hostilities between staff trying to get their work done and the tech initiatives that supposedly boost productivity. Fortunately, there are ways to reduce loading times and achieve document detente with PrizmDoc Viewer. Accusoft’s pre-conversion APIs and customizable server-side search parameters make this tech treaty even easier to achieve with straightforward in-app integration, providing complete functionality under the banner of in-house applications.

Ready to ramp up productivity and win the war on large document loading? See server-side speed in action with the server-side search demo or enlist the in-app advantage with a free trial today!