Despite stiff resistance and disconnected employees, information governance (IG) programs can be put into practice through automatic tagging, machine learning and advanced search-engine capabilities.

If attorneys seem like the only ones aware of their organization’s information governance program, chances are it is because they were the ones who wrote it.

“A lot of information governance programs are policy-driven,” said Dana Simberkoff, chief compliance and risk officer at AvePoint. “And those kinds of policy-driven programs are typically authored in great part by lawyers.”

But with attorneys at the helm of their organization’s information governance, policies that are legally sound may also be operationally unfeasible. Across almost all industries, Simberkoff has seen how “people often responsible for writing the policies are not necessarily clear how the business is using data for the everyday job.”

The challenge facing attorneys, then, whether in-house or at law firms, is putting these policies into action across a usually disconnected employee base. While such a feat sounds next to impossible, recent technological innovations have made it achievable with minimal human support. Here are the top three innovations that can help tackle this information governance challenge:

1. Automatic Tagging

When documents come into a network or computer system, tagging pertinent metadata early on can ensure file organization and proper document handling. But with the volume of data users have to handle on a daily basis, manual tagging can be a lost cause.

That is where automatic tagging comes into play. Having computer systems classify documents as soon as they arrive not only streamlines organization, but also helps users determine the information’s access rights.

Such tagging can tell “a lot of things about the document,” Simberkoff said. “It can determine the sensitivity, the nature of the document, whether it’s information that can be shared publicly or internally, if it’s confidential … [and] if it’s a record subject to a record management hold.”

In that sense, “tagging [and] identification of what the data is, and what the nature of the data is, is the pivot around which all of these data governance programs can be automated—and they can also tie to technical controls in the containers which the data exist,” she added.
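As a rough illustration of the idea, tagging on arrival can be sketched in a few lines of Python. The rule names, keywords and sharing levels below are hypothetical and chosen only for this example; they are not taken from AvePoint or any other product, and a real program would draw its markers from the organization's own policies.

```python
import re

# Hypothetical keyword rules for this sketch only.
TAG_RULES = {
    "confidential": re.compile(r"\bconfidential\b", re.IGNORECASE),
    "records_hold": re.compile(r"\b(legal|litigation) hold\b", re.IGNORECASE),
    "public": re.compile(r"\bpress release\b", re.IGNORECASE),
}


def tag_document(text):
    """Return the set of tags whose pattern appears in the text."""
    tags = {tag for tag, pattern in TAG_RULES.items() if pattern.search(text)}
    # Assumption: a document with no recognized marker defaults to "internal".
    return tags or {"internal"}


def sharing_level(tags):
    """Map tags to a sharing level, letting restrictive tags win."""
    if tags & {"confidential", "records_hold"}:
        return "restricted"
    if "public" in tags:
        return "public"
    return "internal"


incoming = "CONFIDENTIAL: this file is subject to a litigation hold."
tags = tag_document(incoming)
print(sorted(tags), sharing_level(tags))
```

Once a document carries tags like these, the access decision becomes a lookup on the tags rather than a judgment each employee has to make.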

2. Machine Learning

Automatic tagging, however, is a forward-looking process: it only works on documents entering the organization from that point on, and does little for the unorganized documents already sitting on servers and in repositories. Organizing this backlog can therefore be an entirely new challenge, but one that machine learning technology can help solve.

The foundation of e-discovery’s technology-assisted review (TAR), machine learning “learns to pick up patterns and similar types of content over time,” Simberkoff said. “If you say the following set of documents are sensitive, but you don’t tell the software why it’s sensitive, the software has the capability to go out, scan that, and look for the patterns to apply rules for future information that is exactly what you are talking about.”

However, no technology is perfect, Simberkoff cautioned, adding that one should be careful not to rely exclusively on machine learning. It is very important, she said, to have an “incident response built in the software that allows for audit and review in the result. So if there are inconsistencies, you have an audit trail of what the software is doing.”
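To make the idea concrete, here is a minimal sketch of learning from labeled examples and recording every decision for later review. It uses a tiny naive Bayes classifier written from scratch; the class, the training examples and the audit format are all assumptions made for this illustration, not a description of how any TAR or governance product works.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class TinyNaiveBayes:
    """Word-count naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of words seen
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical examples: a reviewer marks documents without explaining why.
EXAMPLES = [
    ("salary review for employee", "sensitive"),
    ("employee medical leave request", "sensitive"),
    ("cafeteria lunch menu", "routine"),
    ("office parking schedule", "routine"),
]

model = TinyNaiveBayes()
model.train(EXAMPLES)

audit_trail = []  # every automated decision is logged for human review


def classify_with_audit(name, text):
    label = model.predict(text)
    audit_trail.append({"document": name, "label": label})
    return label
```

Calling `classify_with_audit("raise.txt", "employee salary adjustment")` labels the document and leaves an entry in `audit_trail`, so a reviewer can later check what the software decided and correct the inconsistencies.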

3. Advanced Searching

For those reluctant to use machine learning, finding and organizing a backlog of data can still be achieved through more traditional technologies, such as search engines.

This can be useful, Simberkoff noted, when looking for specific information, like Social Security numbers, to meet well-defined compliance needs, such as Health Insurance Portability and Accountability Act (HIPAA) or General Data Protection Regulation (GDPR) requirements. But search engines can also prove effective for more high-level compliance needs, where organization depends on the content and relationship of the documents more than on a single set of numbers or phrases.
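A pattern search of this kind can be sketched as follows. The function name and the sample documents are hypothetical; the pattern matches the common hyphenated Social Security number format and skips number ranges that are never issued (area 000, 666 or 900 to 999, group 00, serial 0000). It would not catch numbers written without hyphens.

```python
import re

# Hyphenated SSN format, excluding ranges that are never issued.
SSN_PATTERN = re.compile(
    r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b"
)


def find_ssns(documents):
    """Return {document name: [masked matches]} for documents with hits."""
    hits = {}
    for name, text in documents.items():
        matches = SSN_PATTERN.findall(text)
        if matches:
            # Mask all but the last four digits so the report itself
            # does not become another copy of the sensitive data.
            hits[name] = ["***-**-" + m[-4:] for m in matches]
    return hits


backlog = {
    "hr_note.txt": "Employee SSN 123-45-6789 is on file.",
    "menu.txt": "Call 555-1234 to order lunch.",
}
print(find_ssns(backlog))
```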

Search firm Concept Searching, for example, has developed “compound term processes” that “can identify and weigh multi-word concepts based on purely statistical analysis,” company president Martin Garland said. “Compound term processing understands the relationships between words, but is independent of vocabulary, grammatical style, and language.”
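Concept Searching's method is proprietary and is not reproduced here. As a generic illustration of what scoring multi-word terms on statistics alone can look like, the sketch below ranks adjacent word pairs by pointwise mutual information, a standard measure of how much more often two words appear together than chance would predict. The function name and the `min_count` threshold are assumptions made for this example.

```python
import math
from collections import Counter


def score_bigrams(texts, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI)."""
    words = Counter()
    pairs = Counter()
    for text in texts:
        tokens = text.lower().split()
        words.update(tokens)
        pairs.update(zip(tokens, tokens[1:]))

    total_words = sum(words.values())
    total_pairs = sum(pairs.values())
    scores = {}
    for (first, second), count in pairs.items():
        if count < min_count:
            continue  # too rare to judge statistically
        p_pair = count / total_pairs
        p_first = words[first] / total_words
        p_second = words[second] / total_words
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


sample = [
    "records management policy",
    "records management hold",
    "the policy on the hold",
]
print(score_bigrams(sample))
```

Because the score depends only on counts, the same code runs unchanged on text in any language that can be split into words, with no dictionary or grammar rules involved.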

First heard through Legaltechnews.com.