You might have heard about “mining for metadata.” What does it mean? How is it done? Should you do it?

When the origins of a critical document are at issue, you know to examine that document’s metadata. “Mining for metadata” is something else. It’s the process of using software to systematically search through the metadata embedded in a collection of documents. Because metadata can sometimes determine or affect the outcome of a case, it’s important to know how and when you may mine an opponent’s metadata, and how you can protect yourself from those who may want to mine yours.

Mining Your Opponent’s Metadata 

Think of how the question may arise. Typically, sophisticated litigants agree in advance on the format of the electronically stored information to be produced, and what metadata will be included in the production. Tech-savvy attorneys usually insist on producing most, if not all, documents as TIFF images of the original files. This prevents opposing counsel from searching through the metadata, except for what is intentionally disclosed in addition to the TIFF files. Sometimes opposing counsel will agree to produce native versions of documents without considering the importance of the embedded metadata. This is more likely to occur with files that are ordinarily produced in native format, such as Excel and PowerPoint, because they are more difficult to produce and review in imaged format. Other times, a judge or arbitrator may order a litigant to produce files in native format because an opposing litigant has successfully argued that the metadata obtainable from the native version may contain discoverable information. In such scenarios, mining for metadata may yield great benefits.

By using the appropriate software, your in-house or outside counsel can review caches of metadata catalogued by various properties, such as author and date, and can retrieve additional metadata that might otherwise remain buried within the documents. This is particularly helpful if any of the documents contain unique or unusual metadata, such as hidden comments or redlines, which can flag potential issues or relevant information.
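To make the idea concrete, here is a minimal sketch of what such cataloguing might look like for Word files produced in native .docx format. It is an illustration only, not a description of any particular review platform. It assumes the files follow the usual Office Open XML layout, in which author and date properties are stored in a part named docProps/core.xml, comments in word/comments.xml, and tracked changes as insertion and deletion elements in word/document.xml. The function names read_docx_metadata and catalog_by_author are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# XML namespaces used by the Office Open XML core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}
# Namespace of the main WordprocessingML document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_docx_metadata(path):
    """Return selected metadata for one .docx file (hypothetical helper)."""
    meta = {"path": path, "author": None, "last_modified_by": None,
            "created": None, "modified": None}
    with zipfile.ZipFile(path) as package:
        names = set(package.namelist())

        # Author and date properties, if the package carries them.
        if "docProps/core.xml" in names:
            root = ET.fromstring(package.read("docProps/core.xml"))
            meta["author"] = root.findtext("dc:creator", namespaces=NS)
            meta["last_modified_by"] = root.findtext("cp:lastModifiedBy", namespaces=NS)
            meta["created"] = root.findtext("dcterms:created", namespaces=NS)
            meta["modified"] = root.findtext("dcterms:modified", namespaces=NS)

        # A comments part suggests the file may contain reviewer comments.
        meta["has_comments"] = "word/comments.xml" in names

        # Tracked insertions or deletions (redlines) in the document body.
        tracked = False
        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            tracked = (body.find(f".//{{{W_NS}}}ins") is not None
                       or body.find(f".//{{{W_NS}}}del") is not None)
        meta["has_tracked_changes"] = tracked
    return meta


def catalog_by_author(paths):
    """Group the metadata of many files by recorded author."""
    catalog = defaultdict(list)
    for path in paths:
        meta = read_docx_metadata(path)
        catalog[meta["author"] or "(unknown)"].append(meta)
    return dict(catalog)
```

Calling catalog_by_author on a list of produced file paths would return the files grouped by recorded author, and the has_comments and has_tracked_changes flags would point a reviewer to the files worth opening first. Commercial review tools cover many more file types and properties than this sketch does.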

Consider the following three scenarios and cases:

  • In a trade secrets case, you need to prove that your former employee possessed or used a confidential document with your proprietary information at the time in dispute. Venable recently obtained a $20-million-plus judgment in such a case by using metadata to prove that an illicit competitor used our client’s proprietary information in preparing a proposal for a lucrative government contract.
  • Your opponent cannot produce relevant documents or the corresponding metadata, and you want to pursue a spoliation claim. An administrative law judge recently awarded a trade secrets plaintiff a default judgment as well as costs and fees worth more than $1.9 million because the metadata proved that the defendant tampered with and destroyed relevant metadata.
  • You suspect that your opponent fraudulently modified a critical document, and you want to expose the misrepresentation. Earlier this year, a whistleblower won nearly $11 million in a Dodd-Frank retaliation claim, in part because he was able to prove, using metadata, that his employer had fraudulently backdated a performance review.

These are just a few examples of how to use metadata against an opponent. Counsel looking to use metadata in a similar fashion should be aware of possible legal and ethical limitations, which differ by state. Some jurisdictions, like New York and Washington, DC, generally do not permit attorneys to review metadata contained in documents from an adverse party until after consulting with opposing counsel to determine whether the metadata includes privileged or confidential information. Other states, like Maryland and Colorado, generally do permit attorneys to review such metadata before ascertaining whether the sender intended to include it (unless the reviewing attorney has actual knowledge that the transmission was inadvertent). Still other jurisdictions, like Pennsylvania and Minnesota, have adopted a case-by-case standard that allows review of the metadata under some, but not all, circumstances. Although the ABA takes the position that an attorney can ethically review metadata embedded in documents received from an adverse party, the attorney is still obligated to notify that adverse party if the attorney knows or reasonably should know that the transmission of the metadata was inadvertent.

Protecting Your Metadata From Being Mined 

Prudent lawyers and litigants protect against mining. Before, during, and after discovery, they take care to avoid disclosing privileged, confidential, or irrelevant metadata to the other side. Some metadata may be protected by attorney-client privilege, or by the work product doctrine; as a result, under certain circumstances disclosing it might be deemed a waiver of either or both. Even if not a waiver, an inadvertent disclosure of metadata may simply alert opposing counsel to important facts or issues. This is why many companies now use “scrubbing” or “purging” software that removes a document’s metadata before sharing it with others.
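As a rough illustration of what scrubbing involves, the sketch below writes a copy of a .docx file in which the author and date properties have been blanked. It is a simplified example under stated assumptions, not a substitute for a commercial scrubbing tool: it assumes the usual Office Open XML layout with properties stored in docProps/core.xml, it does not remove comments, tracked changes, or other embedded metadata, and the function name write_scrubbed_copy is hypothetical. The sketch writes to a new file and leaves the original untouched.

```python
import zipfile

# Replacement core-properties part with no author or date fields.
EMPTY_CORE = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    b'<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/'
    b'package/2006/metadata/core-properties"/>'
)


def write_scrubbed_copy(src_path, dst_path):
    """Copy a .docx to dst_path with blank core properties (hypothetical helper).

    Only docProps/core.xml is replaced; every other part is copied as-is.
    The source file is opened read-only and is never modified.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "docProps/core.xml":
                data = EMPTY_CORE
            dst.writestr(item.filename, data)
```

A call such as write_scrubbed_copy("draft.docx", "draft_clean.docx"), with both file names invented for the example, would produce the copy to share while the original stays as it was.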

But be careful. If used improperly or in the wrong circumstance, metadata “purging” software may do more harm than good. Once a company reasonably anticipates litigation, it must preserve relevant documents and information—including metadata. As a result, “purging” metadata may constitute the destruction of evidence—spoliation—resulting in sanctions, including adverse legal inferences by the court and an award of attorney’s fees and costs.

If you anticipate litigation, at a minimum you should consult counsel before using a commercial product such as DriveScrubber, CCleaner, or Eraser to remove metadata. If you use software that routinely “purges” metadata, such as email client add-ins that automatically remove metadata from attachments just before they are sent, you should turn off or otherwise manage that software to preserve relevant metadata.

Kevin Yost, Project Manager, Practice Technologies, assisted with the preparation of this blog.