How To Manage Large eDiscovery Data Sets [Hint: Software Matters]

28 September 2023 by Ross eDiscovery dataset software

Takeaway: With large data sets, you want to inventory all your files, cull the unnecessary ones, and limit who has access to which files. And all this becomes easier if you have the right software.

The only way to tackle large eDiscovery data sets is to use a systematic review process.

Large eDiscovery data sets make reviews far more complicated, so we need a strategic approach to tackle them. This applies to legal teams, IT professionals, governance experts, and anyone else involved in eDiscovery. So, how are we supposed to handle the massive volumes of data, diverse file types, and varied data sources? Well, here’s a four-step process to follow.

1. Create a detailed file inventory – identifying data sources, types, and volumes.

The foundational step in managing any large eDiscovery data set is creating a robust inventory. First, identify your data sources, which can range from corporate servers, cloud storage platforms, and databases to individual email accounts and employee mobile devices. Next, recognize the types of data these sources contain, like text documents, emails, spreadsheets, and multimedia files. Then estimate the volume of data in each category so you can plan and allocate resources better. But remember, this is more than just an auditing exercise: it also helps you prioritize the data subsets most likely to contain relevant information.

For instance, a data inventory could transform a financial institution’s compliance audit.

Picture a large financial institution facing a regulatory compliance audit. The legal team can use an automated tool to map out terabytes of data across various departments. This detailed inventory would help the team estimate the time, workforce, and computing resources required to process the data – saving both time and money in the long run.
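To make the inventory step concrete, here’s a minimal sketch of what an automated inventory might do under the hood. It assumes the collected data already sits in a folder on disk, and the function names (build_inventory, print_inventory) are ours for illustration, not part of any eDiscovery product. It simply counts files and totals their size per file type.

```python
import os
from collections import defaultdict


def build_inventory(root):
    """Walk `root` and total file counts and bytes per file extension."""
    inventory = defaultdict(lambda: {"count": 0, "bytes": 0})
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            # Lower-case the extension so ".PDF" and ".pdf" are grouped together.
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            inventory[ext]["count"] += 1
            inventory[ext]["bytes"] += os.path.getsize(path)
    return dict(inventory)


def print_inventory(inventory):
    """Print file types sorted by total size, largest first."""
    ranked = sorted(inventory.items(), key=lambda item: item[1]["bytes"], reverse=True)
    for ext, stats in ranked:
        megabytes = stats["bytes"] / 1e6
        print(f"{ext:<16} {stats['count']:>8} files {megabytes:>12.1f} MB")
```

Calling print_inventory(build_inventory("/path/to/collected/data")) on a folder of your choice gives a quick view of which file types dominate the data set, which is the kind of information you’d use to plan review time and resources.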

2. Cull unnecessary files using search filters and data analytics.

Once the inventory is in place, reduce the data set’s size by culling files that are not relevant to the case. Start by using keyword searches but, when possible, use more advanced tools like Boolean queries and data analytics. These technologies can help rapidly identify and isolate irrelevant files, significantly reducing how much you’ll need to review manually. For instance, imagine you have an antitrust litigation case coming up with multiple terabytes of internal communications to review. By using advanced data analytics, you could reduce the data set by 60%, focusing only on relevant communications between specific individuals within the organization.
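As a rough illustration of Boolean culling, here’s a small sketch that keeps a document only if it contains every required term, at least one optional term, and none of the excluded terms. The function names and the sample query are hypothetical, and real eDiscovery platforms support much richer syntax (phrases, proximity, wildcards), so treat this as a simplified model of the idea rather than how any particular tool works. Terms are assumed to be single lower-case words.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def is_relevant(text, all_of=(), any_of=(), none_of=()):
    """Boolean filter: every all_of term AND any any_of term AND NOT any none_of term."""
    words = tokenize(text)
    if not all(term in words for term in all_of):
        return False
    # An empty any_of list means "no OR condition", so it never blocks a match.
    if any_of and not any(term in words for term in any_of):
        return False
    return not any(term in words for term in none_of)


def cull(documents, **query):
    """Split a {name: text} mapping into (kept, culled) lists of document names."""
    kept, culled = [], []
    for name, text in documents.items():
        (kept if is_relevant(text, **query) else culled).append(name)
    return kept, culled
```

For example, cull(documents, all_of=["pricing"], any_of=["competitor", "rival"], none_of=["newsletter"]) would keep only documents that mention pricing alongside a competitor or rival, while setting aside anything that looks like a newsletter.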

3. Create user permissions with role-based access.

The more data you have, the more vulnerable it is to hacking – and you never want to compromise data security and integrity. A role-based access control system ensures that only authorized individuals can view or modify specific data sets. (For example, you might create different ‘roles’ for paralegals, junior attorneys, and senior attorneys, each with varying levels of data access permissions.) You’ll also want to leave a reliable audit trail using software that automatically tracks and logs all data-related activity, including who accessed the data, when, and what changes were made. Finally, you’ll want to guard against accidentally modifying or deleting data by using ‘version control’ systems. These store all document versions, enabling you to revert to previous versions when necessary – something that’s particularly important when you have multiple team members working on a case.
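Here’s a minimal sketch of how role-based access and an audit trail fit together. The role names come from the example above, but the specific permissions assigned to each role, the action names, and the request_access function are assumptions made purely for illustration. The key idea is that every request is checked against the role’s permissions and logged, whether or not it’s allowed.

```python
from datetime import datetime, timezone

# Hypothetical permission sets; a real deployment would define its own.
ROLE_PERMISSIONS = {
    "paralegal": {"view"},
    "junior_attorney": {"view", "tag"},
    "senior_attorney": {"view", "tag", "redact", "delete"},
}

# In-memory audit trail; production systems would use tamper-resistant storage.
audit_log = []


def request_access(user, role, action, document):
    """Return True if `role` may perform `action`; log the attempt either way."""
    # Unknown roles get an empty permission set, so they are denied by default.
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "document": document,
        "allowed": allowed,
    })
    return allowed
```

With this setup, a paralegal asking to view a document is allowed, a paralegal asking to delete one is refused, and both attempts end up in audit_log with a timestamp. Denying unknown roles by default is a deliberate choice here: it’s safer to refuse access than to grant it by accident.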

4. Choose flexible software that’s scalable and runs in the cloud.

The final piece of the puzzle is selecting the right eDiscovery software, and choosing a cloud-based option will give you several advantages. For example, these platforms run on cloud server farms that are inherently scalable, so you can adapt to fluctuating data volumes and processing demands without upgrading your hardware. Plus, cloud-based architecture means you can access your data from anywhere in the world, 24/7 – which is a huge advantage for distributed teams.

Most importantly, you don’t have to break the bank to handle large data sets.

Next-generation cloud eDiscovery services can offer you essential large-data-set review tools at an affordable price. For instance, we’ve found that small to midsize law firms love GoldFynch because:

  • It costs just $27 a month for a 3 GB case: That’s significantly less than most comparable software. With GoldFynch, you know exactly what you’re paying for: its pricing is simple and readily available on the website.
  • It’s easy to budget for. GoldFynch charges only for storage (processing files is free). So, choose from a range of plans (3 GB to 150+ GB) and know up-front how much you’ll be paying. You can upload and cull as much data as you want as long as you stay below your storage limit. And even if you do cross the limit, you can upgrade your plan with just a few clicks. Also, billing is prorated – so you’ll pay only for the time you spend on any given plan. With legacy software, pricing is much less predictable.
  • It takes just minutes to get going. GoldFynch runs in the Cloud, so you use it through your web browser (Google Chrome recommended). No installation. No sales calls or emails. Plus, you get a free trial case (0.5 GB of data and a processing cap of 1 GB) without adding a credit card.
  • It’s simple to use. Many eDiscovery applications take hours to master. GoldFynch takes minutes. It handles a lot of complex processing in the background, but what you see is minimal and intuitive. Just drag-and-drop your files into GoldFynch, and you’re good to go. Plus, you get prompt and reliable tech support (our average response time is 30 minutes).
  • Access it from anywhere, and 24/7. All your files are backed up and secure in the Cloud.

Want to find out more about GoldFynch?