What's the Difference Between PDF, DOCX, TXT and RTF files in eDiscovery?

16 August 2019 by Anith Mathai eDiscovery documents

Takeaway: Each of the 4 common types of text documents has its pros and cons. Word documents and PDFs are the most useful: Word documents for editing, and PDFs for sharing. Their differences show up in these 3 phases of eDiscovery: (A) when you’re opening files, (B) when you’re searching them, and (C) when you’re producing them.

What are ‘file extensions’?

With electronic discovery (eDiscovery), you’re dealing with different types of electronic data: text documents, emails, spreadsheets, audio files, video files, etc. And each file type needs different software to open it. So, Microsoft Word for text documents, Microsoft Excel for spreadsheets, VLC player for videos, etc. But how does your computer know which software to use for which file? It looks at the file extension, i.e., the series of letters that follows the file name (separated from the name by a dot). So, an audio recording of your interview with a client might be named: interview.mp3. Here, the ‘.mp3’ at the end tells the computer that the file is an audio file, which means it’s opened by audio software like Windows Media Player.
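
To make this concrete, here’s a minimal Python sketch of how software can read a file’s extension and guess the file type from it. It uses only Python’s standard library. The helper name describe_file is made up for illustration, and your operating system keeps its own lookup table, so treat this as an analogy rather than what your computer actually runs.

```python
import mimetypes
from pathlib import Path
from typing import Optional, Tuple


def describe_file(filename: str) -> Tuple[str, Optional[str]]:
    """Return the file extension and the media type guessed from it."""
    # Everything from the last dot onward, e.g. ".mp3"
    extension = Path(filename).suffix.lower()
    # Look the extension up in Python's table of known media types.
    # The file's contents are never read: only the name is inspected.
    media_type, _encoding = mimetypes.guess_type(filename)
    return extension, media_type


print(describe_file("interview.mp3"))
```

For interview.mp3, the extension comes back as ‘.mp3’ along with an audio media type. Rename the file and the guess changes, even though the contents don’t.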

What are the common file extensions for text documents?

Here are the 4 different file types you’ll come across most often:

1. Plain text (TXT) files: The simplest type of document you’ll see

  • They have a ‘.txt’ file extension.
  • They store plain text – just letters, numbers, and symbols. So, no formatting, fonts, images, etc.
  • They’re lightning fast to open, save, and store. Which makes them perfect for jotting down quick notes. Or as ‘read me’ files that come along with new software. And programmers use TXT files to write and store source code.
  • You can open them with any text editing software, on any computer. That’s because no tech company ‘owns’ them. So, you can open them on a Windows computer, with software like Notepad, Wordpad, or Microsoft Word. But you can open the same file on a Mac, too, with TextEdit. In fact, you can open TXT files in web browsers like Chrome and Firefox, and on hardware devices like smartphones and eReaders (like the Amazon Kindle).

2. Rich Text Format (RTF) files: TXT files that you can format

RTFs are a level up from TXT files. And they’re more like regular word processing documents.

  • They have a ‘.rtf’ file extension.
  • You can format your text and insert simple images. So, change the font, increase the letter size, add paragraphs, align/justify your text, add bulleted lists, etc.
  • They’re easy to open, just like TXT files. The RTF format is ‘owned’ by Microsoft, but you can open RTF files on any operating system (like Unix, Macintosh, and Windows).

3. Word Documents (DOCX): The ultimate in word processing

When you think ‘word processing,’ you think DOCX files.

  • They have a ‘.docx’ file extension. Microsoft began developing Word in the 1980s. And back then, its documents had a ‘.doc’ extension. The ‘.docx’ format arrived with Word 2007.
  • They have every conceivable editing feature you’ll need. You can format and align your text, like with RTFs. But you can also insert complex images, tables, charts, headers/footers, hyperlinks, comments, references, watermarks, macros, and a host of other things. You can also check spelling and grammar, and track changes made to documents.
  • But they’re harder to open. Unlike TXTs and RTFs, you can’t open them with just any text editor. They’re a Microsoft format, and Microsoft Word is the most reliable way to open them. Other applications can open DOCXs too (e.g., Mac’s TextEdit), but most don’t preserve complex formatting well.

4. Portable Document Format (PDF) files: For when you want to share documents

  • They have a ‘.pdf’ file extension. And the software company Adobe began developing them in the 90s.
  • Their layout and formatting stay the same, regardless of which device, operating system, or application you’re using. Which is why they’re perfect for sharing. In contrast, different versions of Microsoft Word usually have different default settings. So, your carefully formatted DOCXs often become unrecognizable when shuttled between computers.
  • They’re easy to open. Unlike DOCXs, they aren’t tied to a single application. So, you can open them with Adobe’s official Acrobat Reader, as well as many third-party applications and web browsers. Also, they’re well ‘compressed,’ so they don’t take up much space, which makes them easier to share.
  • But they’re hard to edit. So, for word processing, it’s better to stick with DOCXs. Or convert your PDF into a DOCX first and then edit it.
  • [Side note: When you scan paper documents, the scanned version is often stored as a PDF.]
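
A file extension is only a label, though, and a mislabeled file is still whatever it was before. As a rough illustration, here’s a Python sketch that guesses a document’s format from its first few bytes instead. It relies on three well-known signatures: PDFs begin with ‘%PDF-’, RTFs begin with ‘{\rtf’, and DOCXs are ZIP containers that begin with ‘PK’. The function names and the 512-byte sample size are assumptions made for this example, and this isn’t a full file-identification tool.

```python
import codecs


def sniff_format(first_bytes: bytes) -> str:
    """Guess a document format from the first bytes of a file."""
    if first_bytes.startswith(b"%PDF-"):
        return "pdf"
    if first_bytes.startswith(b"{\\rtf"):
        return "rtf"
    if first_bytes.startswith(b"PK\x03\x04"):
        # DOCX files are ZIP containers, but so are many other formats,
        # so this check alone cannot confirm a Word document.
        return "zip-based (possibly docx)"
    # Fall back to "text" if the sample decodes as UTF-8. An incremental
    # decoder tolerates a multi-byte character cut off at the sample's end.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(first_bytes, final=False)
    except UnicodeDecodeError:
        return "unknown"
    return "text"


def sniff_file(path: str, sample_size: int = 512) -> str:
    """Read a small sample from the start of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(sample_size))


print(sniff_format(b"%PDF-1.7\n"))           # pdf
print(sniff_format(b"{\\rtf1\\ansi Hello}"))  # rtf
print(sniff_format(b"Notes from the call"))   # text
```

Checking the contents rather than the name is one way to catch files whose extensions were changed or stripped somewhere along the way.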

When does a document’s file format matter in eDiscovery?

Here are 3 times it’ll matter:

A. Opening eDiscovery documents

Usually, you’d use different applications to open each of the document types we’ve just described. With some, like DOCXs, you have a limited choice – you want to stick with Microsoft Word. And for others like PDFs, you have more options. But you’re still choosing from a shortlist of applications. With eDiscovery software, though, you can open any type of document using the same eDiscovery application. This makes things so much simpler because you won’t be switching back and forth between Microsoft Word, Adobe Acrobat, and any other software you’re using to open files.

B. Searching your eDiscovery documents

Your eDiscovery search engine won’t have a problem with TXTs, RTFs, and DOCXs. But it’ll need a bit of help with PDFs, especially PDFs of paper documents you’ve scanned. That’s because these aren’t really text documents. Rather, they’re more like photographs of the originals. Which means your software can’t ‘read’ the text. So, you’ll need Optical Character Recognition (OCR) software to convert these ‘photographs’ into machine-readable and -editable text documents.
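
Here’s a simplified Python sketch of that triage step. It assumes some other tool has already tried to pull the text out of each document, and it flags any document that came back (nearly) empty as a candidate for OCR. The file names, the sample text, the needs_ocr helper, and the 20-character threshold are all hypothetical, chosen only to show the idea.

```python
def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Flag a document for OCR when almost no searchable text came out of it."""
    # Ignore spaces and line breaks, which can survive even when
    # no real text was recovered.
    visible = "".join(extracted_text.split())
    return len(visible) < min_chars


# Hypothetical results of an earlier text-extraction step.
documents = {
    "contract.docx": "This Agreement is made between the parties named below.",
    "scanned_letter.pdf": "",  # image-only scan: nothing to extract
}

ocr_queue = [name for name, text in documents.items() if needs_ocr(text)]
print(ocr_queue)  # ['scanned_letter.pdf']
```

A keyword search over scanned_letter.pdf would find nothing until OCR has produced text for it, which is why it pays to spot these files early.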

C. Producing your eDiscovery documents

When you’re producing files, you’ll need to decide if you’re producing them as ‘native’ files or as PDFs. [‘Native’ files are files kept in the same file format as when they were created.] These are the pros and cons of both:

  • With ‘native’ productions, you keep file metadata safe (details like who created a file and when it was last changed). But you’ll be opening these native files with their parent software. So, for example, to open a DOCX file reliably, whoever is reading your document needs to have Microsoft Word installed.
  • With PDF productions, you’re not tied to a particular application (any easily-available and free 3rd-party PDF reader will do). But you’ll lose much of the original file’s metadata.
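
To see the kind of metadata a native DOCX carries, here’s a small Python sketch that reads a few ‘core properties’ (author, last editor, created and modified dates) straight from the file. A DOCX is a ZIP container, and Word usually stores these properties in a part named docProps/core.xml. The sketch assumes that usual location, and the function name read_docx_metadata and the field labels are ours. It isn’t how any particular eDiscovery tool works.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core-properties part of an Office document.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly label -> XML element holding that value.
FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_docx_metadata(source) -> dict:
    """Read core properties from a DOCX (a file path or a binary file object).

    Assumes the properties live at docProps/core.xml; a KeyError is raised
    if the file has no such part.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    # Fields missing from the file come back as None.
    return {
        label: root.findtext(tag, default=None, namespaces=NAMESPACES)
        for label, tag in FIELDS.items()
    }
```

Calling read_docx_metadata on a native DOCX returns a small dictionary of these values. Convert the same document to PDF and these properties generally won’t travel with it in this form.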

Want eDiscovery software that can open, search, and produce any document? Try GoldFynch.

It’s a next-generation eDiscovery application that’s affordable and easy to use.

  • It costs just $27 a month for a 3 GB case: That’s much less – every month – than the nearest comparable software. And hundreds of dollars less than many others. With GoldFynch, you know what you’re paying for exactly – its pricing is simple and readily available on the website.
  • It’s easy to budget for. GoldFynch has a flat, prorated rate. With legacy software, your bill changes depending on how much data you use.
  • It takes just minutes to get going. It runs in the Cloud, so you use it through your web browser (Google Chrome recommended). No installation. No sales calls or emails. Plus, you get a free, fully-functional trial case (0.5 GB of data and a processing cap of 1 GB), without adding a credit card.
  • It can handle even the largest cases. GoldFynch scales from small to large, since it’s in the Cloud. So, choose from a range of case sizes (3 GB to 150 GB, and more) and don’t waste money on space you don’t need.
  • You can access it from anywhere. And 24/7. All your files are backed up and secure in the Cloud. And you can monitor its servers here.
  • You won’t have to worry about technical stuff. It’s designed, developed and run by the same team. So, its technical support isn’t outsourced. Which means you get prompt and reliable service.

Want to learn more about GoldFynch?