Are You Mistakenly Destroying eDiscovery Metadata? Here's the Solution.
Takeaway: Metadata is a valuable digital footprint that tracks the history of a document. But it’s easy to alter it mistakenly. The trick is to understand how it works and find affordable eDiscovery software that can protect it.
When you create a document on your computer, the application you’re using (e.g. Microsoft Word) records a bunch of information about it. This information is called metadata.
Think of metadata as a digital footprint that tracks the history of a document. The content of a document is its ‘data’, but a description of the content is the ‘metadata’ (i.e., data about data). Metadata tells you things like who created the document, when they created it, when the document was last opened, etc. And metadata is everywhere, attached to any digital content. That includes electronic documents (Word documents, spreadsheets, PDFs, etc.), emails, social media posts, data on mobile devices, photographs, videos, web pages, and more.
This metadata can give you valuable bonus information that you won’t find in the file’s contents.
For example, say a partner at a large law firm sends your client a document he’s worked on. Out of curiosity she checks the document’s metadata and finds it was an associate – not the partner – who worked on the document. And yet she was billed at a partner’s rate. The metadata has caught the partner red-handed!
Most of the time, though, you won’t spot metadata unless you know where to look.
We have access to so much metadata, but we usually don’t know where it’s all stored.
- There’s metadata embedded in documents. This can be metadata that the software creates (things like fonts, spacing, color, etc.), and metadata that the user creates (spreadsheet formulas, hyperlinks, and references). When you move files around, this sort of metadata moves along with them.
- There’s also metadata that isn’t embedded in documents. This ‘system’ metadata is stored separately from the file (for example, on a server), and could include things like filenames and locations, creation dates, profile information, user IDs, etc.
The main challenge with metadata is that it gets altered and destroyed easily.
Metadata is valuable but extremely fragile. For example, here are some ways you can destroy or change it:
- Opening a file to review its contents. Here, the ‘last accessed’ metadata field changes.
- Copying files to another computer. There’s a metadata record of when files are created. And creating a copy of a file gives it a new creation time. So, it’s not really the same file anymore. (Yes, this happens even when you do a simple drag-and-drop copy.)
- Moving files around. Just rearranging folders can get you in trouble because ‘cutting and pasting’ a file into a new location will likely change its ‘last accessed’ and ‘last modified’ metadata entries.
- Running a virus scan. Anti-virus software needs to handle a file to scan it. And this changes the ‘last accessed’ metadata field. Remember, this can happen even without you choosing to run the scan, because many computers are set up to scan files automatically.
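To see this fragility for yourself, here is a minimal Python sketch that copies a throwaway file two ways and checks whether the ‘last modified’ time survives. It is an illustration, not a forensic tool: the file names are made up, and it looks only at the modified time because creation time is reported differently across operating systems.

```python
import os
import shutil
import tempfile
import time
from datetime import datetime, timezone


def snapshot_times(path):
    """Return the file's 'last modified' and 'last accessed' times as UTC datetimes."""
    st = os.stat(path)
    return {
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(st.st_atime, tz=timezone.utc),
    }


def demo_copy_effects():
    """Copy one file two ways and report which copy kept the original modified time."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "accident.docx")  # hypothetical file name
        with open(original, "w") as f:
            f.write("placeholder contents")

        # Backdate the original by a week so a fresh timestamp is easy to spot.
        week_ago = time.time() - 7 * 24 * 3600
        os.utime(original, (week_ago, week_ago))

        # shutil.copy copies contents and permissions, but not timestamps.
        plain = shutil.copy(original, os.path.join(workdir, "plain_copy.docx"))
        # shutil.copy2 also tries to carry the timestamps across.
        careful = shutil.copy2(original, os.path.join(workdir, "careful_copy.docx"))

        original_mtime = os.stat(original).st_mtime
        return {
            "plain_copy_kept_mtime": abs(os.stat(plain).st_mtime - original_mtime) < 2,
            "copy2_kept_mtime": abs(os.stat(careful).st_mtime - original_mtime) < 2,
        }


if __name__ == "__main__":
    print(demo_copy_effects())
```

Calling `snapshot_times` on a file before and after you handle it is a quick way to check whether a given step in your workflow touched its timestamps. Keep in mind that even a timestamp-preserving copy is still a new file as far as the destination system is concerned.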
In the middle of a busy week, it’s quite easy to forget all the subtleties that go into preserving metadata.
And it’s even more frustrating when your clients aren’t careful. For example, say a client’s IT team is gathering emails for you. They might go into their Exchange server and export the relevant PST files. But as soon as they do this, they’re changing the ‘created’ and ‘modified’ dates of those PSTs. And if you don’t realize this, you might assume those dates are accurate, when really they’re telling you more about the IT team’s activity than your client’s.
Sure, you can try simple hacks, but it gets a bit risky.
One way of preserving metadata when moving files around is to first put them into a ‘container’ file (like a ZIP file). That way, the ZIP’s metadata gets changed but its contents remain untouched. Still, it’s easy to mess up when you’re not a computer whiz. For example, one eDiscovery expert describes how he tried to burn valuable files onto a read-only CD, hoping to ‘freeze’ the metadata. But he didn’t realize that CDs and hard drives are formatted differently. So his modified/accessed/created dates were merged into one on the CD! We don’t use CDs much anymore, but the same formatting differences still occur between operating systems and different versions of applications.
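As a rough illustration of the container idea, the Python sketch below puts a backdated throwaway file into a ZIP and reads the stored date back out. The file names and the date are made up for the example. It also shows the catch: as written by Python’s standard library, a ZIP entry keeps only the ‘last modified’ time, in local time and at two-second resolution, so the created and accessed dates don’t travel with it.

```python
import os
import tempfile
import time
import zipfile


def zip_preserves_mtime():
    """Zip a backdated file and return the modified time stored inside the archive."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "memo.txt")  # hypothetical file name
        with open(source, "w") as f:
            f.write("placeholder contents")

        # Backdate the file to December 6, 2019, 3:14 PM local time.
        stamp = time.mktime((2019, 12, 6, 15, 14, 0, 0, 0, -1))
        os.utime(source, (stamp, stamp))

        archive = os.path.join(workdir, "collection.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.write(source, arcname="memo.txt")

        # The archive itself is brand new, but the entry inside
        # still carries the file's original modified time.
        with zipfile.ZipFile(archive) as zf:
            return zf.getinfo("memo.txt").date_time


if __name__ == "__main__":
    print(zip_preserves_mtime())  # (year, month, day, hour, minute, second)
```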
Instead of trying to wing it, here’s a more systematic approach you could try.
1. Learn about the metadata fields you most need to protect.
Some fields are used more often than others, so it’s worth paying special attention to them.
- With system metadata, make sure to protect the following fields: Custodian, source device, originating path, filename, ‘last modified’ date and time. Imagine yourself having to give some basic information about a file in your case. Will you be able to say something along the lines of, “I got the accident.docx file from the ‘Email attachments’ folder on Dennis Nedry’s Dell laptop. And the file was last modified on December 6, 2019, at 3:14 PM CST.”?
- For email metadata, these are the fields to protect: Custodian, to, from, CC, BCC, date and time sent, subject, date and time received, attachment names, mail folder path, and message ID.
- For productions, you’ll need to preserve metadata created by the eDiscovery process. These include Bates numbers and ranges, hash values, extracted OCR text, production paths and names, and file family designations.
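For the email fields listed above, here is a small sketch of what capturing them might look like, using Python’s built-in `email` module. The sample message, the addresses, and the custodian name are all invented for illustration, and real collections (PSTs, for instance) need dedicated tooling before you get to individual messages like this.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message, used only to exercise the function below.
SAMPLE_MESSAGE = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Cc: carol@example.com
Subject: Accident report
Date: Fri, 06 Dec 2019 15:14:00 -0600
Message-ID: <abc123@example.com>

Body text.
"""


def extract_email_fields(raw_message, custodian):
    """Pull the commonly requested header fields out of one raw email message."""
    msg = message_from_string(raw_message)
    sent = msg["Date"]
    return {
        "custodian": custodian,  # supplied by the collector; not stored in the message
        "from": msg["From"],
        "to": msg["To"],
        "cc": msg["Cc"],
        "bcc": msg["Bcc"],  # None when the header is absent
        "subject": msg["Subject"],
        "date_sent": parsedate_to_datetime(sent).isoformat() if sent else None,
        "message_id": msg["Message-ID"],
    }


if __name__ == "__main__":
    fields = extract_email_fields(SAMPLE_MESSAGE, custodian="Example Custodian")
    for name, value in fields.items():
        print(f"{name}: {value}")
```

Notice that the custodian has to be recorded by whoever collects the message. Fields like the mail folder path and attachment names also depend on where and how the message was stored, so this sketch leaves them out.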
2. Always keep unaltered copies of the files you work with.
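Try to store untouched originals of your files somewhere safe, and use only working copies. That way, even if important metadata is altered, you have the original versions as a backup. And you can prove the originals are uncorrupted by referencing their hash values: a hash is a digital fingerprint computed from a file’s contents, so if it still matches the value recorded at collection, the contents haven’t changed.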
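Here is a minimal sketch of that hash check in Python. It assumes SHA-256 as the hashing algorithm (your workflow or software may use a different one), and the function names are just for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hash of a file's contents, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_original(original_path, working_path):
    """Return True if the working copy's contents are identical to the original's."""
    return sha256_of_file(original_path) == sha256_of_file(working_path)
```

One caveat: a hash covers the bytes inside the file, which includes embedded metadata but not system metadata like filesystem dates. So a matching hash shows the contents are unchanged, while the timestamps still need to be recorded separately.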
3. Keep files in their native format as far as possible.
It’s hard to preserve metadata when you switch between file formats. For example, when you convert a Word document into a PDF, you might lose the ‘track changes’ notes. And even if they transfer, it’s still different from how the original user saw the document. Remember, with some files it’s almost impossible to change formats without losing metadata (e.g., Excel sheets, video clips, and voice mail).
4. Have a clear plan before you start collecting data.
You’ll want to set up litigation holds and detailed metadata-preservation plans well in advance.
- Use litigation hold notices to address the issue of metadata.
- Specify the metadata you want, how it can be preserved, and what the chain of custody will be. If you’re not explicit about this, opposing counsel could claim spoliation and ask you to produce the files again. This would mean a lot of wasted time and money. Make it a point to discuss metadata preservation and production with opposing counsel well before the meet and confer.
- Outline plans for dealing with privileged information if metadata mistakenly reveals it.
- Don’t forget to discuss the costs of all this, up front.
5. Choose the right eDiscovery software.
It’s tricky to protect metadata unless you’re an eDiscovery veteran. So, what many law firms and companies do is find metadata-friendly eDiscovery software. With the latest generation of eDiscovery applications, you drag-and-drop your files into the software and it takes over metadata protection up to your final production!
Looking for metadata-friendly eDiscovery software? Try GoldFynch.
It’s an easy-to-use eDiscovery service that’s perfect for small- and midsize law firms and companies.
- It costs just $27 a month for a 3 GB case. That’s significantly less than most comparable software. With GoldFynch, you know exactly what you’re paying for – its pricing is simple and readily available on the website.
- It’s easy to budget for. GoldFynch charges only for storage (processing is free). So, choose from a range of plans (3 GB to 150+ GB) and know up front how much you’ll be paying. It takes just a few clicks to move from one plan to another, and billing is prorated – so you’ll pay only for the time you spend on any given plan. With legacy software, pricing is much less predictable.
- It’s simple to use. Many eDiscovery applications take hours to master. GoldFynch takes minutes. It handles a lot of complex processing in the background, but what you see is minimal and intuitive. Just drag-and-drop your files into GoldFynch and you’re good to go. Plus, it’s designed, developed, and run by the same team. So you get prompt and reliable tech support.
- It keeps you flexible. To build a defensible case, you need to be able to add and delete files freely. Many applications charge to process each file you upload, so you’ll be reluctant to let your case organically shrink and grow. And this stifles you. With GoldFynch, you get unlimited processing for free. So, on a 3 GB plan, you could add and delete 5 GB of data at no extra cost – as long as there’s only 3 GB in your case at any point. And if you do cross 3 GB, your plan upgrades automatically and you’ll be charged for only the time spent on each plan. That’s the beauty of prorated pricing.
- Access it from anywhere. And 24/7. All your files are backed up and secure in the Cloud.
Want to learn more about GoldFynch?
For related posts about eDiscovery, check out the following links.
- Why Your eDiscovery Software Should Offer Automatic Case-Upgrades
- The Smart Way to Free Up eDiscovery Storage Space
- Is It Worth Paying for eDiscovery Analytics?
- Small Case Vs Big Case eDiscovery: There’s Such a Difference!
- eDiscovery Pricing Comparison for Smaller, In-House Cases
- How to Use eDiscovery ‘Tag’ Macros For Lightning-Quick Work!
- How to Find Out When a Document or Web Page was Created
- eDiscovery Glossary: Essential terms you will need to know
- What’s the Difference Between PDF, DOCX, TXT and RTF files in eDiscovery?
- 5 Annoying eDiscovery Problems You Can Solve with the Right Software
- Is Social Media the Future of eDiscovery?