Redaction in the Cloud: Controlling Disclosure at a Distance

Sixty-nine percent of businesses are now moving critical applications to the cloud. And while shifting software and solutions off-site helps boost agility and streamline operations, almost 60 percent of organizations cite increased security concerns, with 65 percent identifying the movement of sensitive data as their top information security priority.

This changing security landscape has ramped up reliance on encryption and obfuscation to frustrate hackers and secure critical data. But what happens if secure documents are released as part of eDiscovery requests or compliance requirements? What if zero-day attacks circumvent advanced firewalls or detection systems?

To manage the expanding impact of cloud migration, there’s a growing need for cloud-based redaction tools capable of effectively obscuring critical data and reducing total risk.

Keep It Secret; Keep It Safe

What is redaction? The American Bar Association (ABA) defines redacted information as “confidential text and images in a document that have been censored, deleted, or obscured.”
While the concept is simple, effectively applying secure redaction is more complicated. According to the United States District Court, Eastern District of California, insecure redaction methods include:

  • Changing the Font to White or Highlighting in Black: Both of these methods appear to redact information, but they share the same problem. The underlying text is still in the file and can be revealed by selecting it with a mouse cursor.
  • Covering Text with Physical Barriers: While using tape, marker, or paper to cover up critical information and then scanning the documents seems safe, repeated words or phrases in the text may allow viewers to uncover redacted data.
  • Deletion: Of all basic redaction methods, deletion seems the safest, but as the Court guidelines note, word processing programs retain “metadata” that contains version and revision histories and can expose supposedly secure data.
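To see why cosmetic hiding fails, consider a minimal Python sketch. It assumes a made-up HTML fragment in which a Social Security number is styled white on white; the sample strings and the `visible_to_a_parser` helper are illustrative assumptions, not taken from the Court guidelines. Any program that reads the file ignores the styling and recovers the text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects every text node and ignores all styling."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_to_a_parser(html: str) -> str:
    """Return the text a program sees, regardless of colors or highlights."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Hypothetical document: the number is "hidden" with white-on-white styling.
cosmetic = '<p>SSN: <span style="color:#fff;background:#fff">123-45-6789</span></p>'
# Safer alternative: the sensitive characters are replaced before release.
removed = "<p>SSN: <span>[REDACTED]</span></p>"

print(visible_to_a_parser(cosmetic))  # the number is still there
print(visible_to_a_parser(removed))   # only the placeholder remains
```

The same reasoning applies to black highlights in word processors and PDFs: unless the characters themselves are removed from the file, they remain recoverable.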

Accidental Disclosure

What’s the worst that could happen if personal, financial, health, or legal data isn’t properly redacted? In 2014, the New York Times didn’t properly redact a sensitive PDF document, making it possible for interested parties to discover the name of an active NSA agent. Inadequate redaction has also exposed private financial data in federal PACER documents that relied on the blackout technique mentioned above.

More recently, copy-and-paste errors exposed redacted paragraphs across hundreds of documents in a high-profile legal case. According to Vice, lawyers asked about the failure said that redaction problems are a regular occurrence. If blacked-out Word documents aren’t scanned and “flattened” as PDFs, recipients can move the blackout boxes aside to expose sensitive data. What’s more, even the most aggressive use of permanent marker may not be enough to stop advanced optical character recognition.

Ramping Up Redaction

The solution to growing redaction issues in expanding clouds is a cloud-native API that enables businesses to integrate redaction at scale. Accusoft’s redaction API streamlines the security process. Documents are scanned for text fragments matching specific rulesets, and new JSON documents containing redaction markups are created. These markups are then permanently “burned in.”
Developers can build redaction frameworks to suit specific security needs, and obtain the state and result of existing redaction processes on demand.
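The scan, markup, and burn-in flow can be sketched in a few lines of Python. This is a plain-text illustration under stated assumptions, not Accusoft’s API: the `RULES` patterns, the markup fields (`rule`, `start`, `end`), and the `find_markups` and `burn_in` helpers are all hypothetical, and the regular expressions are deliberately simple.

```python
import json
import re

# Hypothetical rule set: names and patterns are illustrative only.
RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def find_markups(text):
    """Scan the text and describe every match as a JSON-friendly markup."""
    markups = []
    for name, pattern in RULES.items():
        for match in pattern.finditer(text):
            markups.append(
                {"rule": name, "start": match.start(), "end": match.end()}
            )
    return sorted(markups, key=lambda m: m["start"])


def burn_in(text, markups, mask="█"):
    """Overwrite each marked span so the original characters are gone."""
    chars = list(text)
    for markup in markups:
        for i in range(markup["start"], markup["end"]):
            chars[i] = mask
    return "".join(chars)


sample = "Contact Jane at jane.doe@example.com or 555-123-4567. SSN: 123-45-6789."
markups = find_markups(sample)
print(json.dumps(markups, indent=2))  # the intermediate markup document

redacted = burn_in(sample, markups)
print(redacted)  # sensitive spans replaced, surrounding text untouched
```

Keeping the markups as a separate JSON step means they can be reviewed or adjusted before anything is destroyed. In a real PDF workflow, burn-in would also need to remove the underlying text and metadata rather than only mask characters.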

The result? Secure redaction of social security numbers, phone numbers, email addresses, dates, and text matching any regular expression, allowing documents to leave the confines of protected networks without compromising data integrity. Redaction in the cloud is now essential to safeguard critical data, but it demands a customizable cloud API capable of secure data burn-in to provide total control of disclosure at a distance.

Learn more about Accusoft’s stand-alone APIs at