Why Is eDiscovery Impossible Without Data Filtering?

17 March 2022 by UV eDiscovery filtering

Takeaway: Data filtering is how your eDiscovery software sorts through hundreds of GBs of data to find core responsive files. And eDiscovery would be impossible without this quick way of culling non-essential data. That’s why search engines – the most powerful filtering tool – are so important. And it’s why, when choosing an eDiscovery application, you need to test its search engine first.

There’s always too much eDiscovery data, however smart we are about collecting it.

Digital data is so easy to store that we can easily accumulate hundreds of thousands of files we may never use. But with eDiscovery, every GB of data you upload has to be processed, and that can inflate your monthly bill. This is why data retention policies are so useful: they limit how much data gets stored. And eDiscovery professionals are just as smart about data collection – optimizing where and from whom they gather data. But even after all this preemptive culling, it’d still take you months (or years) to read through every single file in your case manually.

That’s why eDiscovery software developers look for ways to filter out unnecessary data.

Your digital files can be filtered based on a few key characteristics. For example, there’s file metadata – which is like a digital footprint recording the history of a file or email (e.g., when it was created and who created it). Then there are ‘file types,’ which represent the different ways bits and bytes get encoded to form a readable document (e.g., a Microsoft Word document vs. a PST vs. a PDF). And finally, we can filter files based on their custodians – i.e., the person, group, or company that created or owns a document.

At the heart of all data filtering techniques lies the keyword search: You type a keyword into your eDiscovery software’s search bar, and then hit ‘enter.’ And it’s not just keywords you can search for. You can also filter data based on file types, file names, email addresses, dates, times, Bates numbers, and more. For example, you could search for the ‘Levinson merger’ and specify that you’re looking for only emails sent in October 2019 by the custodian ‘Janet Levinson.’ This filters data based on file type (PDFs/Word documents are excluded), metadata (emails sent in any month other than October 2019 are excluded), and custodian (custodians other than ‘Janet Levinson’ are excluded). Learn more about eDiscovery searches.
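To make the ‘Levinson merger’ example concrete, here is a minimal Python sketch of how a keyword, a file type, a date range, and a custodian can be combined into one filter. It is an illustration only, not how any particular eDiscovery product works: the Document record, its field names, and the filter_documents function are all hypothetical, and the sample files are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical record; real eDiscovery tools track far more metadata.
    name: str
    file_type: str   # e.g. "email", "pdf", "docx"
    custodian: str
    sent: date
    text: str


def filter_documents(docs, keyword, file_type, custodian, year, month):
    """Keep only documents that pass every filter at once."""
    keyword = keyword.lower()
    return [
        d for d in docs
        if d.file_type == file_type                       # file-type filter
        and d.custodian == custodian                      # custodian filter
        and (d.sent.year, d.sent.month) == (year, month)  # metadata (date) filter
        and keyword in d.text.lower()                     # keyword filter
    ]


# Made-up sample data for the illustration.
docs = [
    Document("a.eml", "email", "Janet Levinson", date(2019, 10, 3),
             "Notes on the Levinson merger timeline."),
    Document("b.pdf", "pdf", "Janet Levinson", date(2019, 10, 5),
             "Levinson merger term sheet."),
    Document("c.eml", "email", "Janet Levinson", date(2019, 11, 1),
             "Follow-up on the Levinson merger."),
    Document("d.eml", "email", "Sam Ortiz", date(2019, 10, 9),
             "Levinson merger questions."),
]

hits = filter_documents(docs, "Levinson merger", "email", "Janet Levinson", 2019, 10)
print([d.name for d in hits])  # only a.eml passes all four filters
```

Each of the other three files fails exactly one test (wrong file type, wrong month, wrong custodian), which is the point of stacking filters: every extra condition removes another slice of non-responsive data.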

But there’s a challenge here. We need search engines to go beyond keywords and recognize search intent.

Every eDiscovery search begins with a keyword. But we’re looking for more than just that keyword. We’re looking for the valuable chunks of data that the keyword is buried in. So, if we type in the word ‘acquitted,’ we’re really looking for documents and emails discussing, say, the acquittal of a primary custodian. And we assume the search engine will also pull up keyword variations like ‘acquits,’ ‘acquitting,’ ‘acquittal,’ and more.

That’s why the best eDiscovery applications have tools to help search engines be more flexible.

It’s hard to gauge user intent, so software developers focus on making their search engines more flexible instead. For example, the best search engines use a concept called stemming – where they trim a keyword to its root (i.e., ‘stem’) and then search for variations of this stem. So ‘acquitted’ becomes ‘acquit’ and its variations. (Learn more about stemming.) Similarly, they use slop searches to find keyword combinations where the keywords aren’t right next to each other (i.e., they’d find ‘Janet Levinson acquittal’ even if you typed in ‘Janet acquittal’). (Learn more about slop searches.) And finally, they run fuzzy searches to correct for typos, e.g., finding ‘acquitted’ even if you type ‘aquited.’
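The three techniques above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the implementation any real search engine uses: naive_stem is a hypothetical suffix-stripper that only handles a handful of endings, slop_match is a hypothetical helper that treats ‘slop’ as the number of words allowed between two terms, and fuzzy_lookup leans on the standard library’s difflib for typo tolerance. Production engines rely on far more careful algorithms.

```python
import difflib
import re


def naive_stem(word):
    """Toy stemmer: strip one common suffix, then undouble a final consonant."""
    word = word.lower()
    for suffix in ("ing", "ed", "al", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if len(word) >= 2 and word[-1] == word[-2]:
        word = word[:-1]  # e.g. "acquitt" -> "acquit"
    return word


def slop_match(text, first, second, slop=1):
    """True if `second` follows `first` with at most `slop` words in between."""
    words = re.findall(r"\w+", text.lower())
    first, second = first.lower(), second.lower()
    for i, w in enumerate(words):
        if w == first and second in words[i + 1 : i + 2 + slop]:
            return True
    return False


def fuzzy_lookup(typed, vocabulary, cutoff=0.8):
    """Return the closest vocabulary word to `typed`, or None if nothing is close."""
    matches = difflib.get_close_matches(typed, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Stemming: all four variations reduce to the same stem.
print({naive_stem(w) for w in ["acquitted", "acquits", "acquitting", "acquittal"]})

# Slop search: 'Janet acquittal' matches when one intervening word is allowed.
print(slop_match("Janet Levinson acquittal", "Janet", "acquittal", slop=1))
print(slop_match("Janet Levinson acquittal", "Janet", "acquittal", slop=0))

# Fuzzy search: the typo 'aquited' resolves to 'acquitted'.
print(fuzzy_lookup("aquited", ["acquitted", "merger", "levinson"]))
```

When you test an eDiscovery application, you can try the same inputs in its search bar (a word variation, a phrase with a gap, a deliberate typo) to see how flexibly its search engine behaves.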

So, when trying out new eDiscovery applications, spend a bit of time assessing their data filtering options.

The best search engines are easy to use. So, although you’ll build complex searches linking keywords and metadata, you’ll do it using options from drop-down menus. And you’ll be able to save these searches as you build them. That way, you (or your team) can improve or reuse an existing search without having to start from scratch. Also, remember that searching is just one way of filtering data. The latest eDiscovery applications also let you tag similar files – sort of like sticking a Post-it note on a paper document. Tags make it easy to label your search results and pull them up again with a single click. (Learn more about eDiscovery tags.)

Looking for reliable eDiscovery software to help filter data? Try GoldFynch.

GoldFynch is an eDiscovery subscription service designed for small and midsize law firms.

  • It costs just $25 a month for a 3 GB case: That’s significantly less than most comparable software. With GoldFynch, you know what you’re paying for exactly – its pricing is simple and readily available on the website.
  • It’s easy to budget for. GoldFynch charges only for storage (processing files is free). So, choose from a range of plans (3 GB to 150+ GB) and know up-front how much you’ll be paying. You can upload and cull as much data as you want, as long as you stay below your storage limit. And even if you do cross the limit, you can upgrade your plan with just a few clicks. Also, billing is prorated – so you’ll pay only for the time you spend on any given plan. With legacy software, pricing is much less predictable.
  • It takes just minutes to get going. GoldFynch runs in the Cloud, so you use it through your web browser (Google Chrome recommended). No installation. No sales calls or emails. Plus, you get a free trial case (0.5 GB of data and a processing cap of 1 GB) without adding a credit card.
  • It’s simple to use. Many eDiscovery applications take hours to master. GoldFynch takes minutes. It handles a lot of complex processing in the background, but what you see is minimal and intuitive. Just drag-and-drop your files into GoldFynch, and you’re good to go. Plus, you get prompt and reliable tech support.
  • Access it from anywhere, and 24/7. All your files are backed up and secure in the Cloud.

Want to find out more about GoldFynch?