Don't Accept 'Bad' eDiscovery Productions [7 Things to Consider]

25 April 2021 by Ross eDiscovery production

Takeaway: When receiving productions, remember to consider the following: (1) Production formats, (2) Filenames, (3) File structure, (4) Non-native files, (5) Emails, (6) Load files, (7) Metadata.

If you’re thinking about requesting an eDiscovery production, there are a few technical things to keep in mind.

With eDiscovery, you’re using software to review data. And there are so many eDiscovery applications out there that, unless you’re clear about the technical specifics, you could end up with unusable, ‘bad’ data. That can be costly to fix (for example, you might need to bring in expensive eDiscovery professionals). So, think through the following when you’re requesting a production.

1. What production format do you want?

You’ll usually have 4 options to choose from:

  1. Native productions: Here, you keep files in their ‘native’ (i.e., original) format. So, you’ll keep Microsoft Word files as files with the ‘.DOCX’ (Microsoft Word) file extension. You’ll keep Excel files as files with the ‘.XLSX’ extension. And so on. This way, you’ll be able to see edits like comments, ‘track changes,’ etc. Plus, you’ll protect important file metadata. (Learn more about native files.)
  2. PDF productions: With native productions, you need specific software to open each file type (e.g., Microsoft Word for Word documents and Notepad for TXT files). With PDF productions, though, all files are converted into the Portable Document Format (PDF), which Adobe created in 1993. So the only software you’ll need is a PDF reader such as Adobe Acrobat Reader.
  3. TIFF productions: The Tagged Image File Format (TIFF) was developed in the mid-1980s by the Aldus Corporation. TIFFs are the original version of the ‘standardized file format’ concept that PDFs now represent. They were revolutionary decades ago, but nowadays, PDFs are generally considered the better option.
  4. A mix of native & PDFs/TIFFs: Most eDiscovery applications let you create a mixed production. So, you can produce some file types as PDFs or TIFFs and others as native files. E.g., produce all Word documents as PDFs, but Excel and TXT files as natives.

If you have a choice, insist on native productions. If not, ask for PDFs, and only as a last resort, TIFFs. (How to decide which production format to use.)
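
The mixed-production example above (Word documents as PDFs, Excel and TXT files as natives) can be written down as a simple rule table. This is only a minimal sketch: the `FORMAT_BY_EXTENSION` table and the `production_format` function are hypothetical names made up for illustration, and the fallback to native simply reflects the general preference for native productions.

```python
from pathlib import Path

# Hypothetical rule table mirroring the example in the text:
# Word documents are produced as PDFs; Excel and TXT files stay native.
FORMAT_BY_EXTENSION = {
    ".doc": "pdf",
    ".docx": "pdf",
    ".xls": "native",
    ".xlsx": "native",
    ".txt": "native",
}


def production_format(filename: str, default: str = "native") -> str:
    """Return the requested production format for a file, based on its extension."""
    extension = Path(filename).suffix.lower()  # '.DOCX' and '.docx' are treated alike
    return FORMAT_BY_EXTENSION.get(extension, default)


if __name__ == "__main__":
    for name in ["Contract.DOCX", "budget.xlsx", "notes.txt"]:
        print(name, "->", production_format(name))
```

Writing the rules out like this makes it easier to spot file types you forgot to mention in your production request.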

2. Will you have a system for naming files?

The most common convention is to add a prefix to each file’s Bates number and use that combination as the filename. The exception, though, is with file families – where you won’t produce/number the child files separately. Instead, you’ll want to name them using their parent’s Bates number, making it easier to keep parent and child files together. Note: File families are groups of associated files. The primary file in this group is called the ‘parent’, and the others, the ‘children’. For example, emails would be parent files and their attachments, the children. (Learn more about file families.)
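
Here’s a rough sketch of what such a naming system could look like. The prefix `ABC`, the seven-digit padding, and the `_001`-style suffix for child files are all assumptions chosen for illustration; the actual scheme is whatever you and opposing counsel agree on.

```python
from pathlib import Path


def bates_name(prefix: str, number: int, extension: str, width: int = 7) -> str:
    """Build a filename from a prefix and a zero-padded Bates number."""
    return f"{prefix}{number:0{width}d}{extension}"


def child_name(parent_name: str, child_index: int, extension: str) -> str:
    """Name a child file after its parent (assumed scheme: parent name + '_NNN')."""
    parent_stem = Path(parent_name).stem  # parent filename without its extension
    return f"{parent_stem}_{child_index:03d}{extension}"


if __name__ == "__main__":
    parent = bates_name("ABC", 42, ".msg")   # an email (the parent)
    child = child_name(parent, 1, ".pdf")    # its first attachment (a child)
    print(parent)
    print(child)
```

Because each child’s name starts with its parent’s name, sorting the folder alphabetically keeps every file family together.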

3. Will you have a system for organizing files and folders?

Each production will have many folders and subfolders, so you’ll want to specify a system for organizing them. For example, if opposing counsel is scanning paper documents, you’ll want these scanned PDFs to be placed together. And you’ll specify this – perhaps asking them to be put in a folder titled ‘NATIVE’ or something similar.

4. Are there non-native files? If so, have they been rasterized?

When converting files out of their native formats into PDFs and TIFFs, you’ll need to make sure they’re rasterized. ‘Raster’ is a format that builds images around pixels, which makes sure they look the same regardless of which computer or software opens them. This is critical in eDiscovery because you don’t want your file contents to get distorted as they’re being shared. Most PDFs and TIFFs are rasterized automatically, but not all of them. For example, PDFs made using Computer-Aided Design (CAD) software – used by designers, architects and engineers – aren’t rasterized.

5. In what format do you want your emails?

Email lies at the heart of most businesses, so it’s worth considering which types of email formats you’re willing to accept.

Ideally, you’ll want emails in a ‘bulk’ format, ready to export or archive.

For example, if the emails are from Outlook, you’ll want them as PSTs (which is how Outlook stores groups of emails) or OSTs (i.e., how Outlook stores groups of emails when you’re offline). Another file format you might accept is MBOX, which is how Gmail archives and exports emails. Whatever the format, make sure that each archive has the emails of only a single mailbox from a single user. That’s because an entire archive gets just one Bates number, so it’s tricky to track emails if you have multiple users/mailboxes mixed together.

Only if these bulk archives aren’t available should you accept individual emails.

Single emails from Outlook are usually in the MSG or EML format. MSGs are created automatically when you drag and drop an email from Outlook into a folder on your computer. EMLs are similar except that you won’t need Outlook to open them, so you can use them in other email applications like Mozilla Thunderbird and the now-discontinued Outlook Express (a separate, free email client that shipped with older versions of Windows). When you’re receiving individual emails (instead of archives), each of them gets its own Bates number.
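
To show why EMLs are easy to work with outside Outlook, here’s a small sketch that reads the header fields of an EML message using only Python’s standard `email` module. The sample message, its addresses, and the `email_header_fields` helper are invented for illustration. MSG files are a different (Outlook-specific) format and aren’t covered by this sketch.

```python
from email import message_from_string
from email.policy import default

# Invented sample message; a real EML file would be read from disk instead.
SAMPLE_EML = """\
From: Sender Name <sender@example.com>
To: recipient@example.com
Cc: legal@example.com
Subject: Q3 contract draft
Date: Mon, 05 Apr 2021 15:33:00 +0000

Draft attached for review.
"""


def email_header_fields(raw_eml: str) -> dict:
    """Parse raw EML text and return the header fields reviewers usually need."""
    message = message_from_string(raw_eml, policy=default)
    wanted = ["From", "To", "Cc", "Bcc", "Subject", "Date"]
    fields = {}
    for name in wanted:
        value = message[name]  # None when the header is absent
        fields[name.lower()] = None if value is None else str(value)
    return fields


if __name__ == "__main__":
    for field, value in email_header_fields(SAMPLE_EML).items():
        print(f"{field}: {value}")
```

Note that a missing header (like ‘bcc’ here) comes back as `None`, which is a quick way to spot emails whose fields weren’t preserved.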

6. What ‘load file’ details do you need?

Next-generation eDiscovery applications don’t need load files, but productions from legacy software will. Here’s what that means:

A load file is a carefully structured text file that helps slot eDiscovery data into behind-the-scenes databases.

Data is usually disorganized and in multiple formats (PDF, TIFF, MS Office, emails, etc.). But computers can’t make sense of this chaotic data. Instead, they need to first put it into specific slots in a rigidly structured, behind-the-scenes database. And just as you can’t put a square peg into a round hole, only corresponding pieces of data can enter each database field. Once in the database, your PDFs, TIFFs, etc., get fully processed and ready to review. And load files help make all this happen. (Learn more about load files.)

Most eDiscovery applications prefer certain load file formats.

For example, GoldFynch takes DAT, CSV, and JSON formats only. And it’ll need key load file fields to be filled in. E.g., the ‘OS’ column in the load file should list the operating system the file is coming from (e.g., Windows 10). And the ‘Custodian’ field should list the owner of the device the file is coming from.
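
Before loading a production, it can help to check that those key fields really are filled in. The sketch below does this for a CSV load file using Python’s standard `csv` module. It is not GoldFynch’s own validation: the `rows_missing_fields` helper, the sample data, and the ‘BegBates’ column name are assumptions for illustration, while ‘Custodian’ and ‘OS’ are the two fields named above.

```python
import csv
import io

# The two fields named in the text; extend this list to match your own request.
REQUIRED_FIELDS = ["Custodian", "OS"]

# Invented sample load file. 'BegBates' is an assumed column name.
SAMPLE_LOAD_FILE = """\
BegBates,Custodian,OS
ABC0000001,J. Smith,Windows 10
ABC0000002,,Windows 10
ABC0000003,J. Smith,
"""


def rows_missing_fields(load_file_text: str, required=REQUIRED_FIELDS) -> list:
    """Return (row_number, field) pairs where a required field is absent or blank."""
    problems = []
    reader = csv.DictReader(io.StringIO(load_file_text))
    for row_number, row in enumerate(reader, start=1):  # data rows, header excluded
        for field in required:
            if not (row.get(field) or "").strip():
                problems.append((row_number, field))
    return problems


if __name__ == "__main__":
    for row_number, field in rows_missing_fields(SAMPLE_LOAD_FILE):
        print(f"Row {row_number}: '{field}' is empty")
```

A check like this takes seconds to run and gives you a concrete list of gaps to send back to the producing party.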

7. What metadata do you need?

Just as load files have database fields, regular files have ‘metadata’ fields.

Metadata is a digital footprint that tracks the history of a document.

When you create a document on your computer, the application you’re using (e.g. Microsoft Word) records a bunch of information about it. Things like who created the document, when they created it, when the document was last opened, etc. This information is called metadata, and you can think of it as a digital footprint that tracks the history of a document.

The right kind of metadata can help you win cases. For example, say, ‘Andrew’ claims that he had access to a computer with company secrets, but didn’t copy and leak the information to a competitor. However, when you look at the computer’s file-system metadata, you find something interesting. Andrew accessed the computer at 15:33 and logged off at 16:05. But at 15:42, a flash drive was plugged in and was unplugged at 15:58. Plus, there’s a log recording that some files were transferred. That’s more than enough circumstantial evidence on which to start building a case. (Note: Metadata will also help clarify the parent-child connections we discussed earlier.)
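
The reasoning in the ‘Andrew’ example boils down to checking whether one time window falls inside another. As a minimal sketch (the `falls_within` helper is a made-up name, and the times are the ones from the example above):

```python
from datetime import time


def falls_within(inner_start: time, inner_end: time,
                 outer_start: time, outer_end: time) -> bool:
    """Return True if the inner time window lies entirely inside the outer one."""
    return outer_start <= inner_start and inner_end <= outer_end


# Times taken from the example in the text.
session_start, session_end = time(15, 33), time(16, 5)       # Andrew logged on / off
drive_plugged, drive_unplugged = time(15, 42), time(15, 58)  # flash drive in / out

if __name__ == "__main__":
    overlap = falls_within(drive_plugged, drive_unplugged, session_start, session_end)
    print("Flash drive used during Andrew's session:", overlap)
```

Real timelines are messier (time zones, clock drift, multiple sessions), but they’re only possible at all if the timestamp metadata is preserved in the production.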

So, to make full use of your eDiscovery tools, you’ll need to insist that key metadata fields are filled in.

For regular files, that might include things like ‘date created’, ‘date last modified’, ‘created by’, etc. And for emails, you’ll want details for fields like ‘sent from’, ‘sent to’, ‘cc’, ‘bcc’, etc.

If you’d like a more technical look at the things we’ve covered in this post, GoldFynch can help.

For a more detailed version of what to ask for when receiving productions, check out GoldFynch’s checklist.

Want to make sure your eDiscovery productions are handled right? Try GoldFynch.

GoldFynch is an eDiscovery service that is perfect for small- and midsize law firms and companies. It’s great with file archives and has other things going for it too.

  • It costs just $25 a month for a 3 GB case. That’s significantly less than most comparable software. With GoldFynch, you know exactly what you’re paying for – its pricing is simple and readily available on the website. (Note: You’ll get a free 512 MB trial case to sample first.)
  • It’s easy to budget for. GoldFynch charges only for storage (processing is free). So, choose from a range of plans (3 GB to 150+ GB) and know up front how much you’ll be paying. It takes just a few clicks to move from one plan to another, and billing is prorated – so you’ll pay only for the time you spend on any given plan. With legacy software, pricing is much less predictable.
  • It’s simple to use. Many eDiscovery applications take hours to master. GoldFynch takes minutes. It handles a lot of complex processing in the background, but what you see is minimal and intuitive. Just drag-and-drop your files into GoldFynch and you’re good to go. Plus, it’s designed, developed, and run by the same team. So you get prompt and reliable tech support.
  • It keeps you flexible. To build a defensible case, you need to be able to add and delete files freely. Many applications charge to process each file you upload, so you’ll be reluctant to let your case organically shrink and grow. And this stifles you. With GoldFynch, you get unlimited processing for free. So, on a 3 GB plan, you could add and delete 5 GB of data at no extra cost – as long as there’s only 3 GB in your case at any point. And if you do cross 3 GB, your plan upgrades automatically and you’ll be charged for only the time spent on each plan. That’s the beauty of prorated pricing.
  • It’s accessible from anywhere, 24/7. All your files are backed up and secure in the cloud.
