What's So Great About Native Files in eDiscovery?

06 December 2020 by UV

Takeaway: Native files offer much more than their eDiscovery alternatives (i.e., TIFFs and PDFs). They’re the rawest and most accurate form of your case’s data, they take up less space (making them easier to host), and it’s simpler to spot if they’ve been tampered with. So, where possible, insist on getting native files instead of TIFFs and PDFs.

What are ‘native’ files? They’re the original versions of all your case files.

Each application you use creates files in a particular ‘format,’ which means it structures the file’s data in a specific way. So, for example, Microsoft Word creates DOCX files (files with a ‘.docx’ extension). That extension is associated with Word, and it tells your computer to use Word to open the file. This is the ‘native’ format of the file, i.e., the format in which it was originally created.

If not ‘native,’ then what? Well, you can choose to convert native files to PDFs or TIFFs.

Sometimes native files aren’t an option. For example, opposing counsel might only have access to non-native files. Or you might not have the parent software of a native file – meaning you can’t open it. Or you have paper documents that need to be converted out of their native ‘paper’ format into an electronic format. In these instances, there’s a workaround. You’ll convert native files into TIFFs or PDFs. And these newly created files are called ‘derived’ files.

1. TIFFs (Tagged Image File Format) are like screenshots of each page of a document.

So, they’re pictures of your data rather than the original data. For example, with an original Word document, your software can search the data for keywords, make edits, add comments, etc. I.e., the Word file is fully functional. But a TIFF is like a photograph of each page. You can read the text and look at the images, but you can’t search the text or edit it the way you would in Word. TIFFs were created in the mid-1980s by the Aldus Corporation as a file format to store scanned images. And the format was revolutionary because TIFFs are standardized files you can open on any computer. Most operating systems come with a TIFF viewer, but the TIFF format hasn’t been updated since 1992.

2. PDFs (Portable Document Format) are a newer alternative to TIFFs.

They were created more recently (by Adobe in 1993), are updated regularly, and give you more options than TIFFs. So, if you have to choose between PDFs or TIFFs in eDiscovery, always go with PDFs. (Learn more about why PDFs are better than TIFFs.)

So, what’s so great about native files? Why are they better than PDFs and much better than TIFFs?

1. Natives come with valuable ‘metadata’

‘Metadata’ is a game-changing feature that eDiscovery can harness. When you create a document on your computer, the software you’re using (let’s take Microsoft Word again) records a bunch of information about it. Things like who created it, when they created it, when it was last opened, etc. This ‘data about data’ (i.e., metadata) is a digital footprint which tracks the history of the document. And it gives you valuable information that you probably didn’t know existed. The problem is that converting a file out of its native form strips this metadata out of the file itself, so it survives only if someone extracts it separately (e.g., into a load file). And that limits you significantly. Learn more about metadata.
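To make this concrete, here’s a minimal Python sketch of how some of that metadata can be read straight out of a native Word file. It assumes the file follows the usual DOCX layout (a ZIP package with its core properties stored in ‘docProps/core.xml’). The function names, the field labels, and the sample author and dates are all made up for illustration, and a real review platform will do far more than this.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core-properties part of a DOCX package.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative labels (our own choice) mapped to the XML elements that hold them.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(docx_source):
    """Return selected metadata from a DOCX path or file-like object.

    Assumes the conventional 'docProps/core.xml' location inside the package.
    Fields that are absent come back as None.
    """
    with zipfile.ZipFile(docx_source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    properties = {}
    for label, tag in FIELDS.items():
        node = root.find(tag, NAMESPACES)
        properties[label] = node.text if node is not None else None
    return properties


def build_sample_package():
    """Build a tiny in-memory stand-in for a DOCX (hypothetical sample data)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties'
        ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
        ' xmlns:dcterms="http://purl.org/dc/terms/">'
        "<dc:creator>Jane Example</dc:creator>"
        "<dcterms:created>2020-12-01T09:30:00Z</dcterms:created>"
        "<dcterms:modified>2020-12-04T16:45:00Z</dcterms:modified>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for label, value in read_core_properties(build_sample_package()).items():
        print(f"{label}: {value}")
```

With a real file, you’d pass its path to read_core_properties instead of the sample package. A TIFF of the same document has no equivalent part to read, which is the whole point.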

2. Day-to-day, everyone works with native files – not PDFs and TIFFs

When your clients work on a document, they’re using its native version. With Word (and other word processing applications), this means they’re doing things like ‘tracking changes’ and adding comments. With email, they’re creating conversation threads, labeling and categorizing their inboxes, creating archives, and so on. So, native files are constantly changing, real-world phenomena. In contrast, PDFs and TIFFs are static. They’re replicas of original files. You can breathe some life into them (e.g., using OCR to convert ‘pictures of text’ into machine-readable, searchable text), but it’s not the same. And so they put you at a disadvantage.

3. Native files give you more conversion options

You can convert native files to TIFFs or PDFs (if, for example, opposing counsel asks for a PDF production). But you can’t convert PDFs and TIFFs back to their native format because the ‘damage’ has already been done.

4. Native files are easier (and cheaper) to host and load

TIFFs take up much more space than native files – which can triple your eDiscovery costs. Even lower-resolution TIFFs take up more space than a native file, without nearly as much original information. TIFF productions try to make up for this by using load files to hold on to certain metadata, but it’s an inefficient process. So, with a TIFF production, you’ll have an ‘image’ file (i.e., the TIFF), the OCR’d file with searchable text, and the load file. Three files to replace a single native? Quite the lose-lose situation. Even with low-cost cloud eDiscovery to cut storage costs, it’s still a waste of space. And remember, larger TIFF productions take longer to load, process, and share.

5. It’s tougher (in the long run) to tamper with a native file

Some people argue that TIFFs and PDFs are safer because they’re a snapshot of the native, so you can’t tamper with them, whereas you can easily edit a native Word document just by opening it in Microsoft Word (which everyone has access to). But the fact is that you can edit a TIFF. After all, TIFFs are just pictures of pages, often in black and white, so you can edit them and change Bates numbers in Photoshop or even PowerPoint. Thankfully, we can use hashing to catch these fakes. That’s where each document is given a hash value (i.e., a string of numbers and letters computed from the file’s contents, which is unique for all practical purposes – e.g., c262935cfc3af3f0a5cc5c5cdf3c0b26) to help identify it. So, when you change a file, its hash value changes. In general, though, faked TIFFs are harder to catch than faked natives.
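Here’s a rough Python sketch of that hashing check, using the standard library’s hashlib module. It assumes a hash was recorded for each file when it was collected or produced. The function names are made up for illustration, and SHA-256 is simply our default here. Pass "md5" instead to get a 32-character value shaped like the example above.

```python
import hashlib
import os
import tempfile


def hash_file(path, algorithm="sha256", chunk_size=65536):
    """Hash a file's contents in chunks so large files don't fill memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path, recorded_hash, algorithm="sha256"):
    """Return True if the file still hashes to the value recorded earlier."""
    return hash_file(path, algorithm) == recorded_hash.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical stand-in for a produced document.
        path = os.path.join(folder, "produced_document.bin")
        with open(path, "wb") as handle:
            handle.write(b"Payment due: 30 days")
        recorded = hash_file(path)  # the value noted at production time

        print(matches_recorded_hash(path, recorded))  # True: file is untouched

        # Someone changes a single character after production.
        with open(path, "wb") as handle:
            handle.write(b"Payment due: 90 days")

        print(matches_recorded_hash(path, recorded))  # False: the hash changed
```

The check only tells you that a file differs from the version that was hashed. It can’t tell you what changed or who changed it, and it’s only useful if the original hash was recorded somewhere you trust.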

So, what’s the takeaway? Always ask for native files. They give you more options and are just as easy to produce as TIFFs and PDFs. And if you have no choice, choose PDFs over TIFFs.

Opposing counsel will often hand over some files as TIFFs (e.g., Word documents and emails) but others as natives (e.g., PowerPoint files and Excel spreadsheets). But it’s often just as easy to hand over everything as natives. (Word documents are part of the Microsoft Office family, so if they can give you PowerPoint files as natives, why not Word files?) TIFFs were fine when we were using paper documents because both gave us about the same amount of information. But natives trump TIFFs and PDFs significantly, so it’s worth pushing for them when possible. Only if that doesn’t work should you accept a PDF production. And TIFFs should be your last resort.

