Technical FAQs

Question

We are adding files to the viewing session with HttpWebRequests. We noticed with larger files the response is:

(413) Request Entity Too Large. -  at System.Net.HttpWebRequest.GetResponse(). 

What could be the cause?   

Answer

A 413 Request Entity Too Large error occurs when a client's request exceeds the maximum size the web server is configured to process. If your web server enforces an HTTP request size limit, clients whose requests exceed that limit will receive a 413 response. A typical trigger is a client uploading a large file to the server (e.g., a large media file).

Depending on which web server you use, implement the necessary changes described below to configure your web server's maximum HTTP request size allowance. Below are some suggestions for popular web servers:

For Nginx users:

The directive that determines the allowable HTTP request size is client_max_body_size. This directive can be defined in your nginx.conf file (typically located at /etc/nginx/nginx.conf) at the http, server, or location level.
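For example, a minimal sketch (the 20 MB value is an arbitrary illustration; choose a limit that suits your uploads):

http {
    # Allow request bodies up to 20 MB (the built-in default limit is 1 MB).
    client_max_body_size 20M;
}

The directive can also be set in a specific server or location block if only certain endpoints need to accept large uploads.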

For Apache users:

The directive is LimitRequestBody, which can be defined in your httpd.conf file or in an .htaccess file.
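For example (again a sketch; the value is in bytes, so 20971520 is roughly 20 MB):

# Limit request bodies to about 20 MB; a value of 0 means unlimited.
LimitRequestBody 20971520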

For IIS users:

  1. Open Internet Information Services (IIS) Manager.
  2. In the Connections pane, go to the connection, site, application, or directory for which you want to modify your request filtering settings.
  3. In the Home pane, double-click Request Filtering.
  4. Click Edit Feature Settings in the Actions pane.
  5. Under Request Limits, set Maximum allowed content length (in bytes) to the desired value, then click OK (or set it declaratively, as shown below).
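Alternatively, the same limit can be set in your application's web.config; a sketch with an illustrative 20 MB limit:

<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <!-- Maximum allowed content length, in bytes (about 20 MB here). -->
        <requestLimits maxAllowedContentLength="20971520" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>

Note that ASP.NET applications may additionally enforce their own maxRequestLength setting (in kilobytes) under system.web/httpRuntime, which must be raised as well.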

In document viewing and processing solutions, change is inevitable and often necessary to keep pace with evolving technologies and market demands. As such, we are thrilled to announce the rebranding of VirtualViewer® to PrizmDoc® for Java. This transformation accentuates Accusoft’s unwavering commitment to providing cutting-edge, secure document-viewing solutions to our valued customers and partners.

Why the Change? 

Several key factors drove the decision to rebrand VirtualViewer® as PrizmDoc® for Java, aligning with our mission to deliver innovative and streamlined solutions to the market.

Rebranding VirtualViewer® to PrizmDoc® for Java solidifies Accusoft’s dedication to offering state-of-the-art document-viewing solutions. By folding VirtualViewer® into the PrizmDoc® brand, we can present our current and potential clients with a more cohesive and comprehensive product lineup. This streamlined experience makes it easier for customers to navigate our offerings and find the perfect solution to meet their document-viewing and processing needs.

Renaming VirtualViewer® as PrizmDoc® for Java clarifies its positioning within our product portfolio. Prospects can now readily identify PrizmDoc® for Java as Accusoft’s Java-based option for document viewing and processing. This clear delineation enhances brand recognition and facilitates informed decision-making for potential customers.

What Does PrizmDoc® for Java Have to Offer?

Renaming VirtualViewer® as PrizmDoc® for Java reaffirms Accusoft’s commitment to offering our customers options for document viewing and processing that meet their unique needs. PrizmDoc® for Java boasts robust features designed to empower users with unparalleled document-viewing capabilities. From rendering high-fidelity documents with lightning speed to enabling seamless collaboration and annotation, PrizmDoc® for Java is engineered to optimize productivity and efficiency across various industries and use cases. You’ll also still get the same robust document support, easy-to-use format in any environment, and quick installation/integration that you’re used to with VirtualViewer®.

Moving Forward

PrizmDoc® for Java represents a new name and a bold step forward in our ongoing mission to redefine the document-viewing landscape. This rebranding supports Accusoft’s commitment to offer innovative, secure document-viewing solutions. VirtualViewer’s® transition to PrizmDoc® for Java signifies more than just a name change—it exemplifies our commitment to excellence and dedication to providing superior document-viewing solutions.

To learn more, visit the PrizmDoc® for Java product page.

 

Written by: Cody Owens

  • How quickly can your team take a code base, package it, test it, and put it into the hands of your customers?
  • Can you push a button to automagically make it happen?
  • And, once your customers have your product, can you be confident that it will work for them out of the box?

We at the Accusoft PrizmDoc group asked ourselves those questions in 2016 and discovered a wide array of opportunities to improve how we deliver our product.

Here we share how we reduced our feedback cycle from three months to three days, enabling rapid delivery of beta PrizmDoc builds and confident, seamless delivery of release builds.


What is Continuous Delivery?

Continuous Delivery, the movement toward rapid working releases, focuses our efforts on knowing as quickly as possible when to release a change to customers. Whereas Continuous Integration focuses on taking code and packaging it, Continuous Delivery goes a step further by identifying what to do with that package before release.


A common assumption is that when code works in one environment, it should work in others. But through the lens of Continuous Delivery, we have to assume our product is guilty until proven innocent. And how do we prove its innocence? Automated testing in production-like environments. In this way, we can be confident that at release time our product won’t explode on takeoff.


Moving from testing on a small, dedicated environment to many production-like environments can be complex. But implementing a Continuous Delivery release workflow is well worth the effort. The product will be deployable throughout its lifecycle. Everyone – not just the development team – can get automated feedback on production readiness at any time. Any version of the product can deploy to any environment on demand. And in our case, beta builds can release to customers for early feedback on bug fixes and new features. Together, we realized that these benefits far outweighed the cost of making the change; the alternative was putting up with release pain year after year.

 


Evaluating Our Starting Point

Like most modern software teams, we believe in the value of test-driven development. We already had many thousands of unit, contract, and integration tests verifying the ability of our products to solve business needs. So, we could be confident that the product could run on some specific environment with some specific configuration. But there were a few key problems we had to address:

  • Spawning many production-like environments was uneconomical
  • We could not automatically test the GUI of our product
  • There were no explicit, automated performance tests against business-valued behaviors

Testing On Realistic Environments

We tackled the expense of production-like environments first. At the time, we were using Amazon Web Services EC2 instances for build agents that could test and package code. On each code change, new instances launched to run tests. While these instances were fast, reliable and cloud-based, they were uneconomical. And because spending gobs of money is effortless when spawning instances for testing or development, access was guarded. Reevaluating our needs, we realized that the scalability and flexibility of the cloud weren’t necessary for testing purposes. We knew we needed to shut off the cloud-hosted cash vacuum – but what was our alternative?

Hybrid cloud is becoming attractive as a best-of-both-worlds solution to cloud-hosting needs. Perhaps a more accurate term is “local cloud hosting” – on-prem value but with most of the features offered by the “real” cloud. To this end, we turned to OpenStack as our EC2 replacement for development builds. With OpenStack, we can still spin up instances, store VM images and snapshots, create load balancers and more without the cost associated with the cloud. A single investment in the local hardware was comparable in cost to one additional year of cloud usage. If it didn’t turn out so well, we could just switch back a year later.

After flipping the switch, we transferred our build agents to OpenStack’s hybrid cloud. Before, some tests took many hours and could only run once per day or even once per week. But with the reduction in testing costs, we now run the dailies at every commit and the weeklies every day. This difference in feedback time is monumental; developers can be confident that their new code won’t fail automated tests a week later after the user story has closed.

As we increased our hybrid cloud test agent workload, we ran into a unique problem. As opposed to running instances in the “real” cloud, we now have to deal with hardware limitations. We have a specific number of physical CPUs available. We have a specific amount of memory to use. This forced us to rethink what tests we ran and how we ran them.



Failing Fast, Failing Cheap

To optimize our resource usage, we need bad commits or configuration changes to fail fast and early. When one stage fails, the next stage(s) shouldn’t run because that build isn’t releasable. We needed a way to schedule, chain and gate test suites.


Enter Jenkins. Jenkins is a flexible build system that enables a simple pipeline-as-code setup for all sorts of purposes. In our case, we opt to use it as the platform that pulls the built product installer, installs it and runs batteries of progressively stringent tests against it. A stage can run tests against multiple nodes. We created production-like nodes that launch from our hybrid cloud and use the built-in gating functionality in Jenkins. Subsequent test stages don’t run following a test failure. Since pipelines are version controlled, we always know exactly what changes affect a given run.
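As an illustration only (not our actual pipeline; the stage names and shell commands are invented placeholders), a declarative Jenkinsfile with this kind of stage gating might look like the following sketch:

pipeline {
    agent any
    stages {
        stage('Install Product') {
            // Pull the built installer and install it on the test node.
            steps { sh './install-product.sh' }
        }
        stage('Smoke Tests') {
            // Cheap, fast tests run first so bad builds fail early.
            steps { sh './run-smoke-tests.sh' }
        }
        stage('Full Test Battery') {
            // Declarative pipelines stop at the first failed stage, so this
            // expensive suite never runs against an unreleasable build.
            steps { sh './run-full-tests.sh' }
        }
    }
}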



Testing Like A User

By this point, our tests can run inexpensively and easily across production-like environments. This enabled us to rethink what tests we were running and build upon our coverage. At release time, we spent a sprint across two teams just to test deploying the product and pushing buttons to verify the GUI worked. The plain English test instructions were subject to interpretation by the tester, leading to nondeterministic results from release to release. This cumbersome effort was necessary to test the GUI sitting on top of our core API product.

While this manual testing process uncovered bugs nearly every release, it was untenable in terms of ROI per man-hour. The late feedback cycle made product GUI changes stressful. A developer might not know that the GUI component they just added is unusable on an Android device running Firefox until the release testing phase three months later. Finding bugs at release time is dangerous, as not all bugs are always resolved before the release deadline. Regressions and bugs might make their way into the product if they’re not severe, or they might postpone delivery of the product altogether.

Automating these types of manual tests improves morale, reduces feedback time and asserts that the GUI either passes or fails tests in a deterministic way. Furthermore, it opens a route to Behavior-Driven Development (BDD), whose language centers on business-valued behaviors on the front end of the product. For instance, we use the Gherkin domain-specific language to author tests in plain English that are parseable by a testing code base into real executed test code. Non-technical members of the team can author plain English “Given [state of the product], When I [do a thing], Then [a result occurs]” feature descriptions that map 1:1 to test code.
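For instance, a hypothetical Gherkin feature (the scenario and steps are invented for illustration):

Feature: Document viewing

  Scenario: Rotating a page
    Given a document is open in the viewer
    When I click the rotate button
    Then the current page is rotated 90 degrees clockwise

Each step maps to a step definition in the testing code base, so the plain English line and the executed test code stay in lockstep.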


Today, all major browsers have automation REST APIs to enable driving them in a native non-JavaScript way without a real user. To eliminate the hassle of changing test code between browsers or authoring reliable tools to talk to those automation APIs, we use Selenium WebDriver. WebDriver is available in many popular languages including Java, Python, Ruby, C#, JavaScript, Perl and PHP.
From BDD test code, we execute end-user tests with WebDriver to verify real usage of the product. Because the WebDriver APIs enable “real” user events and not JavaScript event simulations, we can be confident that mouse, touch and keyboard actions actually do what we expect across a range of platforms. On test failures, we take a screenshot and save network traffic logs from the browser to trace the failure back to a front end or microservice source. Some test authors even automatically save a video of the last X seconds leading up to the failure to investigate unexpected, hard-to-reproduce behavior.
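A minimal sketch of this style of test in Python with Selenium WebDriver (the URL and element ID are placeholders, and the failure handling is simplified):

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # any WebDriver-backed browser works here
try:
    driver.get("https://your-app.example.com/viewer")  # placeholder URL
    # This fires a real click event on the element, not a JavaScript simulation.
    driver.find_element(By.ID, "rotate-button").click()
except Exception:
    # On failure, capture evidence to trace the problem back to its source.
    driver.save_screenshot("failure.png")
    raise
finally:
    driver.quit()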


Altogether, these new front-end tests enable us to supplant the rote work of fiddling with the product across different browsers and devices for each release. They give us rapid feedback for every commit that the product has not broken for a front-end user. Before, we couldn’t know until release testing. Development confidence goes way up and agility improves as we can guarantee that we won’t have to interrupt the next sprint to fix an issue introduced by new code.


The Value Of Manual Tests

This is not to say that manual testing should be supplanted by automated testing. Exploratory testing is necessary to cover complicated scenarios, unusual user behaviors and platforms that automated tests don’t cover. Not everything is worth the time investment of automating. Bugs found during exploratory tests can be fixed and later covered by automated tests.

Your product’s test coverage should look like a pyramid where unit test coverage is thorough, integration tests are somewhere in the middle, and product-level end user tests are broad but not deep.


As expensive as manual testing can be, authoring and maintaining end-user tests can be expensive too if done poorly. Changes to the front end of the product can break all the GUI tests, though using the Page Object design pattern (sketched below) can mitigate this. Browser updates can also break end-user tests. Poor product performance can lead to unexpected behavior, resulting in failed tests. And not all browser platforms support all parts of the WebDriver spec, resulting in edge cases where JavaScript does need to be run on the page on that platform to fill in the gap.
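A minimal sketch of the Page Object pattern in Python (the class and locator are invented for illustration): tests call methods on the page class, so a front-end change means updating one locator rather than every test.

from selenium.webdriver.common.by import By

class ViewerPage(object):
    """Wraps the viewer GUI so tests never touch raw locators directly."""
    ROTATE_BUTTON = (By.ID, "rotate-button")  # assumed element ID

    def __init__(self, driver):
        self.driver = driver

    def rotate_page(self):
        self.driver.find_element(*self.ROTATE_BUTTON).click()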

Keep end-user tests broad and don’t use them as a replacement for in-depth, maintainable integration and unit tests. If a feature is testable at the unit or integration level, test it there!

On the PrizmDoc team, we’ve freed up weeks of regression testing time at release by adding these end-user automation tests. After cursory end-user regression tests, we host a fun exploratory Bug Hunt with prizes and awards.

Who can find the most obscure bug? The worst performance bug? Who can find the most bugs using the product on an iPad? Your team can gear testing efforts towards whatever components are most important to your customers and raise the bar on quality across the board.


Automating Nonfunctional Tests

Performance and security, among other nonfunctional requirements, can be just as important to our customers as the features they’ve requested. Let’s imagine our product is a car. We know that the built car has all the parts required during assembly. We also know that the car can start up, drive, slow down, turn and more.

But we don’t know how fast it can go. Would you buy a car that can only go 20 MPH? What if the car didn’t have door locks? These concerns apply similarly to our software products.

The next step, then, is to automate tests for nonfunctional requirements. Even one bad commit or configuration change can make the product unacceptably slow or vulnerable. So far, we have added automated performance tests using Multi-Mechanize. Many similar tools can accomplish this task so there’s no need to dive into details, but the key point is configurability.
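For a concrete flavor, here is a minimal sketch of a Multi-Mechanize virtual-user script, assuming its conventional Transaction interface and a placeholder endpoint:

import time
import requests

class Transaction(object):
    def run(self):
        start = time.time()
        # Exercise one business-valued behavior (placeholder endpoint).
        response = requests.get("http://test-env.example.com/api/documents/1")
        assert response.status_code == 200
        # Named timers are aggregated by Multi-Mechanize into its reports.
        self.custom_timers = {'load_document': time.time() - start}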

Our customers don’t all use the same hardware, so it doesn’t make sense to test on every possible environment. Instead, we focus on measuring performance over time in a subset of production-like environments. If performance falls below a particular threshold, the test fails. With configurability in mind, if a customer is evaluating whether to use PrizmDoc, we can simply deploy to a similar environment (CPUs, memory, OS type, license, etc.) and gather metrics that allow them to easily plan capacity and costs, which can often seal the deal.

And since performance tests run on every successful change, we can gauge the impact of optimizations. For example, we found that a microservice handled only two concurrent requests at a time. The fix? A one-line change to a configuration parameter. Without regular performance tests, gathering comparative performance and stability would be difficult. With regular performance tests, however, we were confident in the value of the change.



Real Impact

Continuous Delivery has improved every aspect of the PrizmDoc release cycle. Customers praise our rapid turnaround time for hotfixes or beta build requests. We now thoroughly measure the delivery value of each commit. End-user tests verify the GUI and performance tests cover our nonfunctional requirements. The built product automatically deploys to a range of affordable production-like environments. Any member of the product team can get release readiness feedback of the current version at a glance. Instead of a three-month feedback cycle, developers see comprehensive test results against their changes within a day. The difference in morale has been tremendous.

If your organization is not quite there yet, we challenge you to start the Continuous Delivery conversation with your team. Hopefully our experience has shed light on opportunities for your product to make the jump. You might get there faster than you expect.

 

About the author

Cody Owens is a software engineer based in Tampa, Florida and a contributor to continuous deployment efforts on Accusoft’s PrizmDoc team. Prior to his involvement with document management solutions at Accusoft, he worked in the fields of architectural visualization and digital news publishing. Cody is also an AWS Certified Solutions Architect.

Jira REST API

Jira REST APIs are used to interact with the Jira server for several purposes; they provide access to and interaction with features like issues and workflows. In this blog, we share how to query epics and stories and how to access logged work time, providing a way to estimate the time spent on release tasks using the Jira REST APIs.


Using Python Wrapper for the Jira API

The Jira REST APIs can be accessed in several ways. For example, they can be invoked using a POST request with the appropriate parameters, and there are wrappers for specific languages such as R (used in statistical analysis) and Python. For the purposes of this article and its examples, we will use Python. Using the community jira wrapper package, a script might start like the following sketch (the server URL and credentials are placeholders):
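from jira import JIRA

# Connect to the Jira server; the URL and credentials are placeholders.
jira = JIRA(server="https://jira.example.com",
            basic_auth=("username", "password"))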


Release Tasks in SDK

For analysis of release tasks in the SDK group, the Jira stories, bug fixes, incidents, and in general any components or features addressed by a given release are handled primarily through a release epic that should be in place for the release tasks. This is a common best practice to keep things well organized.

A specific query written in Jira Query Language (JQL), incorporated in a Python script, is required to retrieve a release epic’s stories and sub-tasks. The syntax might look like the sketch below; notice that the variable ‘epickey’ contains the epic key for the specific release of interest, and that the query selects only stories or bugs with status Done or Resolved (the “Epic Link” field name assumed here is the one used by classic Jira projects):
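epickey = "REL-123"  # placeholder epic key for the release of interest

# Select only stories or bugs under the release epic with status Done or Resolved.
jql = ('"Epic Link" = {0} AND issuetype in (Story, Bug) '
       'AND status in (Done, Resolved)').format(epickey)
issues = jira.search_issues(jql, maxResults=False)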


Querying Jira with the Python Wrapper to Retrieve Reported Times

Once we have the stories with status Done or Resolved for a given release epic, it is possible to get the reported times with a few lines of Python, sketched below under the assumption that work is logged in Jira’s standard timespent field (which stores seconds). Note that the code prints the report in easy-to-read values for convenience:
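total_seconds = 0
for issue in issues:
    spent = issue.fields.timespent or 0  # seconds; None when no work is logged
    total_seconds += spent
    print("{0}: {1:.1f} hours".format(issue.key, spent / 3600.0))

print("Total reported time: {0:.1f} hours".format(total_seconds / 3600.0))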

 

Using the Jira API is a convenient way to retrieve information that would otherwise be complicated to obtain with pure Jira queries. A language such as Python allows for data formatting and other operations that permit efficient and clear data analysis to keep track of projects.

TAMPA, Fla. – Accusoft, the leader in document and imaging solutions for developers, is proud to announce its beta release testing program, which provides participants with real-time access to its latest product developments.

Customer input is a key factor in Accusoft’s mission to build better software integrations that deliver functionality like OCR, image cleanup, forms processing, file manipulation, and viewing solutions. Thanks to the new beta program, participants will get early access to brand new products and have the opportunity to provide feedback on the latest features for existing products. Developers can also customize what types of betas they would like to opt into so they can focus on products most relevant to their business.

“Our previous betas for PrizmDoc Editor and PrizmDoc Cells were extremely beneficial for everyone involved,” says Mark Hansen, Product Manager. “Our team received rapid feedback that helped make our products better, while participants had the opportunity to shape those products to meet their specific requirements.”

By signing up for the beta program now, you can participate in the active beta for PrizmDoc Forms integration, which will allow you to repurpose your existing PDF forms and easily create, customize, and deploy them as web forms anywhere. You’ll also be the first to know about new product offerings and have the ability to opt into beta releases for Accusoft’s existing products, such as ImageGear, FormSuite for Structured Forms, and PrizmDoc Suite.

To learn more about Accusoft’s exciting new beta program, please visit our website at https://www.accusoft.com/company/customers/beta-release-program.

About Accusoft:

Founded in 1991, Accusoft is a software development company specializing in content processing, conversion, and automation solutions. From out-of-the-box and configurable applications to APIs built for developers, Accusoft software enables users to solve their most complex workflow challenges and gain insights from content in any format, on any device. Backed by 40 patents, the company’s flagship products, including OnTask, PrizmDoc™ Viewer, and ImageGear, are designed to improve productivity, provide actionable data, and deliver results that matter. The Accusoft team is dedicated to continuous innovation through customer-centric product development, new version releases, and a passion for understanding industry trends that drive consumer demand. Visit us at www.accusoft.com.

James Waugh, Accusoft Software Engineer

In many cases at Accusoft, we are tasked with commanding large TeamCity build chains, sometimes interacting with dozens of build configurations that build tests, run them, and build all their dependencies. Being able to visualize these relationships allows my team to discuss them and discover potential issues, so I created a Python 2.7 script to do just that.

The script is invoked by passing the TeamCity ID whose graph is requested. This ID can be found in the settings of the TeamCity configuration in question.

./tcdependencygraph.py My_Project_ID

Here’s what the output looks like:

[Figure: the rendered dependency graph for the requested TeamCity configuration]

The gist of accomplishing this is:

  • Get a project’s dependencies as JSON
  • For each of that project’s dependencies, recursively traverse them and construct a graph
  • Render the graph utilizing the Dot language and Graphviz.

The required libraries are requests, pydotplus, and networkx:

pip install requests pydotplus networkx

For all network requests, such as those to the TeamCity API and Google Charts, the requests module is used. First, we start with the TeamCity API; we are only concerned with one endpoint here:

/guestAuth/app/rest/buildTypes/id:<buildTypeId>/

This will give us the properties of a project. /guestAuth/ is used to simplify authentication. In code, it is utilized like so:

def getProjectJson(projectId):
    url = "{0}/guestAuth/app/rest/buildTypes/id:{1}/".format(server, projectId)
    headers = {"Accept": "application/json"}
    response = requests.get(url, headers=headers)
    status = response.status_code
    if status != 200:
        raise Exception("Project ID '{0}' cannot be read (status: {1})".format(projectId, status))
    return response.json()

For this script, I preferred to receive the data as JSON instead of XML. This is done by setting the “Accept” header to “application/json”. In the returned JSON, the important bits are the snapshot-dependencies and artifact-dependencies properties:

  "snapshot-dependencies": {
    "count": 0
  },
  "artifact-dependencies": {
    "count": 4,
    "artifact-dependency": [
        ...
        "source-buildType": {
           ... 
        }
    ],
  }

Each of these array entries represents a dependency of the project. Inside each element, the “source-buildType” is given.

"source-buildType": {
    "id": "ProjectIDHere",
    "name": "..",
    "projectName": "..",
    "projectId": "..",
    "href": "..",
    "webUrl": ".."
}

This is the holy grail: the link to the next project. Each project is modeled as a node in the graph, and edges are added corresponding to their relationships through the source-buildType. We can again get the dependency’s name and JSON through our getProjectJson function, and recursively repeat this process until a project has no dependencies.

The script utilizes Python’s NetworkX (NX) library to internally create a graph. Nodes and edges are added to the graph based on the type of dependency encountered: black edges for artifact dependencies, and red for snapshot. Then, PyDotPlus is used to obtain the Dot source code. The NX graph is first converted to a PyDot graph via the integration that NX provides, and then to a string.

dotSource = nx.drawing.nx_pydot.to_pydot(graph).to_string()

For large graphs, changing the rank direction to LR (left to right) sometimes results in a better image:

    dotGraph = nx.drawing.nx_pydot.to_pydot(graph)
    dotGraph.set('rankdir', 'LR')
    dotSource = dotGraph.to_string()

The Google Charts API is used to render the graph’s image from this source string. In this request, the Dot source code must be URL-escaped, most notably changing spaces to ‘+’. When generating large graphs, you may sometimes get a 413 (Request Entity Too Large) error. If this happens, the provided Dot source can be rendered using your favorite Graphviz interface.

def renderGraph(chartName, dotSource):
    baseurl = "https://chart.googleapis.com/chart?chl={0}&cht=gv"
    # URL-escape the Dot source (spaces become '+') before embedding it in the URL.
    r = requests.get(baseurl.format(urllib.quote_plus(dotSource)), stream=True)
    status = r.status_code
    if status != 200:
        raise Exception("Could not generate graph (status: {0})".format(status))
    return r.raw

A final note: since the names of projects and their dependencies can be nearly identical except for the last configuration name, they should be cut down to fit better on the nodes of the graph. The method here eliminates all common “::”-separated portions of the full TeamCity project name. For this, Python’s difflib SequenceMatcher and its get_matching_blocks method are used to compare each name to the starting project’s name.

def trimProjectName(topProjectName, destProjectName):
    # Split on "::" and find how many leading elements the two names share.
    topSplit = topProjectName.split(' :: ')
    destSplit = destProjectName.split(' :: ')
    sequenceMatcher = SequenceMatcher(None, topSplit, destSplit)
    match = sequenceMatcher.get_matching_blocks()[0]

    # If there are no common elements, return the full name of the project.
    # Otherwise the common leading elements are cut off.
    if match.size == 0:
        return destProjectName

    return " :: ".join(destSplit[match.b + match.size:]).encode('ascii')

The common elements are removed, and the result is combined with the configuration’s name.

The final script is as follows:

#!/usr/bin/env python
# Script to create a graph of a TeamCity project's dependencies
import sys, urllib, shutil
from difflib import SequenceMatcher
import requests
import networkx as nx

#Configuration
server = "http://your.teamcity.server.com"

def renderGraph(chartName, dotSource):
    baseurl = "https://chart.googleapis.com/chart?chl={0}&cht=gv"
    # URL-escape the Dot source (spaces become '+') before embedding it in the URL.
    r = requests.get(baseurl.format(urllib.quote_plus(dotSource)), stream=True)
    status = r.status_code
    if status != 200:
        raise Exception("Could not generate graph (status: {0})".format(status))
    return r.raw

def generateGraph(projectId):
    G = nx.MultiDiGraph(name=projectId)
    topNode = getProjectJson(projectId)
    generateGraph_impl(G, topNode, topNode)
    return G

def getProjectJson(projectId):
    url = "{0}/guestAuth/app/rest/buildTypes/id:{1}/".format(server, projectId)
    headers = {"Accept": "application/json"}
    response = requests.get(url, headers=headers)
    status = response.status_code
    if status != 200:
        raise Exception("Project ID '{0}' cannot be read (status: {1})".format(projectId, status))
    return response.json()
    
def generateGraph_impl(outputGraph, topNode, node):
    addDependenciesToGraph(outputGraph, topNode, node, 'artifact-dependencies')
    addDependenciesToGraph(outputGraph, topNode, node, 'snapshot-dependencies') 

def addDependenciesToGraph(outputGraph, topNode, node, dependencyName):
    for dep in getDependencies(node, dependencyName):
        addDependencyToGraph(outputGraph, topNode, node, dep, dependencyName)
        generateGraph_impl(outputGraph, topNode, getProjectJson(dep['id']))     

def getDependencies(nodeJson, dependencyName):
    arrayNameMap = {
        'snapshot-dependencies' : 'snapshot-dependency',
        'artifact-dependencies' : 'artifact-dependency'
    }
    dependencyDict = nodeJson[dependencyName]
    result = [ ]
    if int(dependencyDict['count']) != 0:
        for dependency in dependencyDict[arrayNameMap[dependencyName]]:
            result.append(dependency['source-buildType'])
    return result

def addDependencyToGraph(outputGraph, topNode, source, dependency, dependencyName):
    edgeColors = {
        'artifact-dependencies': 'black',
        'snapshot-dependencies': 'red'
    }
    nodeName, depNodeName = getNodeNames(topNode, source, dependency)

    # Check if this edge has the same color. e.g, snapshot or artifact dependency.
    # We will add another edge if a different dependency exists, or if there is no edge
    edgeColor = edgeColors[dependencyName]
    if not outputGraph.has_edge(nodeName, depNodeName):
        outputGraph.add_node(nodeName)
        outputGraph.add_edge(nodeName, depNodeName, color=edgeColor)
    elif all(attrs['color'] != edgeColor for attrs in outputGraph[nodeName][depNodeName].values()):
        outputGraph.add_edge(nodeName, depNodeName, color=edgeColor)
        
def getNodeNames(topNode, srcNode, depNode):
    topProjectName = topNode['projectName']
    srcProjectName = srcNode['projectName']
    depProjectName = depNode['projectName']
    srcName = srcNode['name']
    depName = depNode['name']

    # If the project names start the same, cut off the common substring to be 
    # easier to read on the graph. This is combined with the name in brackets.
    srcResult  = trimProjectName(topProjectName, srcProjectName).replace(" :: ", ".")
    depResult  = trimProjectName(topProjectName, depProjectName).replace(" :: ", ".")
    srcResult += ' [' + srcName + ']'
    depResult += ' [' + depName + ']'

    return srcResult.strip(), depResult.strip()

def trimProjectName(topProjectName, destProjectName):
    # Split on "::" and find how many leading elements the two names share.
    topSplit = topProjectName.split(' :: ')
    destSplit = destProjectName.split(' :: ')
    sequenceMatcher = SequenceMatcher(None, topSplit, destSplit)
    match = sequenceMatcher.get_matching_blocks()[0]

    # If there are no common elements, return the full name of the project.
    # Otherwise the common leading elements are cut off.
    if match.size == 0:
        return destProjectName

    return " :: ".join(destSplit[match.b + match.size:]).encode('ascii')

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print "Usage: tcdependencygraph.py <TeamCity_Project_ID>"
        sys.exit(1)
    id = sys.argv[1]
    
    # Generate Dot source
    graph = generateGraph(id)
    dotSource = nx.drawing.nx_pydot.to_pydot(graph).to_string()

    # Write Dot source
    with open(id + ".dot", 'wb') as f:
        f.write(dotSource)

    # Render and write PNG
    with open(id + ".png", 'wb') as f:
        pngImage = renderGraph(id, dotSource)
        shutil.copyfileobj(pngImage, f)

James Waugh is a Software Engineer in the SDK division at Accusoft. He joined the company in 2016 as a Software Engineer in Support. James now contributes to the native products team and has a BS in Computer Engineering.

Today’s applications need tremendous versatility when it comes to document management. Developers are expected to deliver tools that can handle multiple file types and have the ability to share them securely with internal users and people outside the organization. As more companies transition to remote-first work environments, online (and secure) collaboration tools are becoming a must-have feature. One of the major challenges facing developers is how to adapt existing document technologies and practices to an increasingly interconnected environment without creating additional risks.

Rendering and Conversion Challenges of Microsoft Office

Microsoft Office (MSO) files have long presented problems for organizations looking for greater flexibility when it comes to viewing and marking up documents. This stems in part from the widespread reliance on the Office software itself, which held a staggering 87.5 percent share of the productivity software market according to a 2019 Gartner estimate. Companies of all sizes across multiple industries rely on programs like Word, Excel, and PowerPoint, but there are many instances where they would like to be able to share those documents without also surrendering control of the source files.

The challenge here is twofold. On the one hand, if an organization shares an MSO file with a client or vendor, there’s no guarantee that the recipient will be able to view it properly. They may not have access to Office, in which case they can’t open the file at all, or they may be using an outdated version of the software. While they may still be able to open and view the file, it may not display as originally intended if it uses features not included in previous editions of Office.

On the other hand, however, sharing files outside a secure application environment always creates additional risk. Microsoft Office documents are notoriously attractive targets for hackers seeking to embed malicious code into files, and older, unpatched versions of the software contain numerous known vulnerabilities. Sharing MSO files with an outside party could quickly result in the file being exposed to a compromised machine or network. There’s also a question of version control and privacy, as a downloaded file could easily be copied, edited, or even distributed without authorization.

Unfortunately, it has proved quite difficult to natively render MSO documents in another application. Anyone who has had the misfortune of trying to view or edit a DOCX file in Google Docs will understand the challenges involved. While it’s certainly possible to render MSO files in a different application, the end result is often a little off the mark. Fonts may be rendered incorrectly, formatting could be slightly (or drastically) off, and entire document elements (such as tables, text fields, or images) could be lost if the application doesn’t know how to render them properly.

Rendering MSO Files Natively with PrizmDoc Viewer

As a fully-featured HTML5 viewing integration, Accusoft’s PrizmDoc Viewer can be deployed as an MSO file viewer that renders MSO files like any other document type. However, this doesn’t provide a true native viewing experience, which many businesses require for various compliance reasons. Fortunately, the PrizmDoc Server’s Content Conversion Service (CCS) allows applications to natively render MSO documents with a simple API call.
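For example, a conversion request to the CCS can look like the following sketch (the fileId is a placeholder for a previously uploaded work file; verify the exact schema against your PrizmDoc Server version’s documentation):

POST /v2/contentConverters HTTP/1.1
Content-Type: application/json

{
  "input": {
    "sources": [ { "fileId": "yourWorkFileId" } ],
    "dest": { "format": "pdf" }
  }
}

The call starts an asynchronous conversion process whose status and output can then be polled from the same endpoint.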

The MSO rendering feature allows PrizmDoc to communicate directly with an installed version of Microsoft Office, which ensures that every element of the file is rendered accurately within the HTML5 viewer. For example, a DOCX file opened in Microsoft Word should look identical to the same document rendered within an application by PrizmDoc Viewer. Once the document is accurately rendered, it can be shared with other users inside or outside an organization. This allows people to view and even markup MSO files without the original source file ever having to leave the secure application environment. It’s an ideal solution for reducing security risks and eliminating the possibility of version confusion.

Converting Additional MSO File Elements

In many instances, organizations need to share MSO files that have already been marked up or commented upon. This could include Word documents with multiple tracked changes or PowerPoint slides with extensive speaker notes. Those additional markups could be important elements that need to be shared or reviewed, so it’s critical to include them during the conversion and rendering process.

Using the server’s CCS, PrizmDoc Viewer can convert Word documents with accepted or rejected markup changes when converting the file into a different format (such as converting an MSO file to PDF) or rendering it for viewing in the application itself. The same capabilities extend to PowerPoint presentations with speaker notes. When converting these MSO files, the outputted version can consist of slides only or include the speaker notes along with them.

These conversion and rendering capabilities provide developers tremendous flexibility when they’re integrating viewing features into their applications. They can easily deploy them to help their customers collaborate and share MSO files without having to remove them from a secure environment. It’s also a winning feature for end users, who don’t need to worry about downloading files or having access to the latest version of Microsoft Office.

Improve Your Document Capabilities with PrizmDoc Viewer

With its extensive file conversion, redaction, and annotation capabilities, Accusoft’s PrizmDoc Viewer is an essential integration for any document management platform that requires an MSO file viewer. It provides support for dozens of file types to give applications the flexibility needed to meet the demands of today’s complex workflows and improve efficiency. As an HTML5 viewer, it can be integrated into any web-based solution with minimal development effort, which frees up valuable resources developers need to focus on the innovative features that will help set their applications apart in a competitive market.

To learn more about PrizmDoc Viewer’s robust feature set, have a look at our detailed fact sheet. If you’re ready to see what our HTML5 viewer will look like within your application environment, download a free trial and start integrating features right away.


Redaction is a common practice for legal firms, healthcare organizations, financial institutions, and government agencies. Simply put, it’s the process of deleting or masking sensitive information in a document to prevent misuse and protect specific parties. Simple redaction is no longer enough; modern applications need to support multiple redaction reasons.

In modern times, document redaction software has replaced the permanent marker of the past. However, while there are many solutions that allow for the electronic removal of protected or sensitive information from a variety of document types, only a few offer the ability to apply customized redaction reasons.


What are redaction reasons?

Redaction reasons help answer a key question: “Why?” They are custom text that appears over a redaction area to indicate the reason the material was redacted. In government documents, these reasons often take the form of codes which represent specific redaction categories.

Consider these codes from the National Archives, still commonly used for government document redaction:

  • 1.4(a) — This redaction code protects the publication of data relating to military plans, weapons systems, or operations.
  • 1.4(e) — Using this code indicates that the redacted information pertains to scientific, technological, or economic matters relating to national security.

 


A Gap in the Market: Support for Multiple Redaction Reasons

 

At Accusoft’s recent customer advisory board, we learned that our customers found the ability to configure PrizmDoc Viewer to replace sensitive content with a custom redaction reason to be immensely valuable. However, we also learned there was a gap in the market when it came to support for applying multiple redaction reasons to a given piece of redacted content. That is, until now.

We are pleased to announce that with the release of PrizmDoc Viewer v13.13, our client API now provides support for multiple redaction reasons. Users can apply multiple reasons, selected from a customizable list, to any redaction. 

These reasons are shown on top of the black box of redacted content, and can also be burned into a downloadable PDF along with the rest of the redacted content. In addition, application developers can import pre-built sets of redaction reasons into PrizmDoc Viewer from an existing JSON file to streamline custom redaction reason application for their end users. Such a file might look like the following sketch (the field names here are illustrative, not the authoritative schema; consult the PrizmDoc Viewer documentation for the exact format):
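{
  "redactionReasons": [
    { "reason": "1.4(a)", "description": "Military plans, weapons systems, or operations" },
    { "reason": "1.4(e)", "description": "Scientific, technological, or economic matters relating to national security" },
    { "reason": "PII", "description": "Personally identifiable information" }
  ]
}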

In v13.13, multiple redaction reasons can be easily added using four client-side methods:

  • Text Selection Redaction: Simply select the text you want and hover over the context menu to select your preferred redaction reason.


  • Filled Rectangle Redaction: Create a filled rectangle of any size and shape and then select or directly customize your redaction reason.


  • Full Page Redactions: You can also redact full pages and attach a specific redaction reason to indicate the purpose of redaction.


  • Bulk (Sticky) Redactions: Rather than selecting a new redaction reason for each text block, sticky redactions save you time by applying the same redaction reason to multiple blocks of text on the same page.



Give It a Try

Ready to improve your redaction process with multiple redaction reasons that can be easily customized and applied across text, boxes, pages, or in bulk? Try the PrizmDoc Viewer redaction demo, or download a free trial today. Plus, stay tuned for an update to our server-side API for the programmatic application of multiple redaction reasons at scale, coming your way soon.