The Smart Way to Free Up eDiscovery Storage Space

02 November 2020 by Anith ediscovery storage

Takeaway: System files, duplicate files, and container files can waste case space. You could try to find and delete them manually, but there’s a better option: find eDiscovery software that does it for you automatically.

eDiscovery storage space has become so much more affordable with the Cloud.

Cloud hosting has been around for a long time. In fact, if you use Dropbox, Google Drive, or Apple’s iCloud, then you’re already in the Cloud. Software giants like Amazon and Google lease storage and computing power to businesses all over the world. And their thousands of interconnected servers form a public ‘Cloud.’ The more people there are in the Cloud, the lower the cost for each. So you typically pay much less than you would to store your data on private servers.

With affordable storage, you’re not rushing to delete case files. And this leads to better eDiscovery.

eDiscovery has a flow of its own and can be quite an intuitive process. You want to respect this by giving yourself time to explore your data thoroughly. But if storage is expensive, you’ll delete files quickly just so you can downsize your case. That’s why Cloud eDiscovery is so important. It gives you the luxury of keeping files around until you know for sure you won’t need them.

But what about when you’re finally ready to reclaim case space? Well, here are the 3 types of files you can cull.

These are the safest files to delete, but don’t delete them just yet. We’ll explore how best to do that in the next section.

1. System files

To run smoothly, computers and applications need a range of specialized files. But these files don’t have any reviewable content for us users. For example, each piece of hardware we attach to a computer (printers, keyboards, etc.) comes with its own ‘driver’ files. Then there are ‘executable’ (i.e., EXE) files that launch applications, and DLLs, INIs, CHMs, etc., that applications rely on to run. There are (literally) millions of these ‘system’ files, but they weren’t created by your custodians, they aren’t connected to your case, and they don’t have any reviewable material. So, there’s no point in having them clutter your case and distract you.

2. Duplicate files

Say your client’s head of HR, Emily, emails a policy-change document to everyone in the organization. You’ll find the same document on her computer, the computers of other employees, and attached to emails sitting in inboxes, servers, and backup drives. Each copy was important to someone at one point. But for review, you need only one of them. So, the extra copies are another category of files you can get rid of.

3. Container files

Container files like RAR, ZIP and PST (for email) are convenient ways of bundling a bunch of files together and compressing them. This way, they take up less space. But once the files have been extracted, you won’t need the original containers anymore. Here are some common types of containers:

  • ZIP files: This is one of the oldest and most widely used formats for compressing files. It was developed in 1989 and we still use it today. The best thing about ZIPs is that many third-party applications can open them, and support for them is built into Windows and macOS.
  • RAR: Roshal Archive files often compress more efficiently than ZIPs, and are still widely used. The main hangup is that RAR is a proprietary format. Plenty of tools (7-Zip, for example) can extract RAR files, but only WinRAR and its sibling tools can create them. So that boxes you in a bit.
  • PST: Personal Storage Table files are a special file type created by Microsoft for their Outlook email app. When you use Outlook, it stores all your emails, attachments, calendar events, etc. in a PST file. So, think of it as more like a folder. (In fact, Microsoft calls them personal folders.)
  • MBOX: Mailbox files are like PSTs, but with some differences. For one, you can open them in non-Microsoft applications like Apple Mail and Mozilla Thunderbird. Also, unlike PSTs, an MBOX is a plain-text file. So you can open it with a text editor like Notepad. It stores all your emails in sequence, with a separator line beginning with ‘From ’ marking the start of each one.
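To make the MBOX description concrete, here is a minimal Python sketch using the standard library's `mailbox` module. The sample messages, addresses, and function names are made up for illustration. It writes a tiny MBOX file, reads the subjects back, and counts the 'From ' separator lines in the raw text.

```python
import mailbox
from email.message import EmailMessage


def build_sample_mbox(path):
    """Write two small, made-up messages to a new MBOX file."""
    box = mailbox.mbox(path)
    try:
        for subject in ("The document you requested",
                        "Re: The document you requested"):
            msg = EmailMessage()
            msg["From"] = "emily@example.com"       # hypothetical address
            msg["To"] = "colleague@example.com"     # hypothetical address
            msg["Subject"] = subject
            msg.set_content("Here you go")
            box.add(msg)
        box.flush()
    finally:
        box.close()


def list_subjects(path):
    """Return the Subject header of every message stored in the MBOX file."""
    box = mailbox.mbox(path)
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()


def count_separator_lines(path):
    """Count the 'From ' lines that mark the start of each message."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith("From "))
```

Because the file is plain text, the separator count and the number of parsed messages should agree for a simple file like this one.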

So, what’s the best way to cull these sorts of files? Find eDiscovery software that does the work for you.

You could manually search for some of these files and delete them. But it’ll take ages and you’ll likely miss many along the way. Here’s what the right software will do for you instead.

1. It’ll use ‘hashing’ to catch duplicates

Your software will run each file you upload through a hashing algorithm, which produces a ‘hash’ value based on the file’s contents. This value is like a digital fingerprint. So, two ‘exact’ duplicates will get the same hash, while files that differ even slightly will (in practice) get different ones. Your software registers the match and asks you what you want to do with the duplicate. This is extremely useful for keeping your cases clean and lean. You can quickly remove documents that were uploaded multiple times, or that have many copies in a single bulk upload (e.g., the same attachment sent back and forth in an email thread that gets added to your case in an uploaded PST file).
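Here is a rough sketch of how hash-based duplicate detection can work, written in Python. It is not how any particular eDiscovery product does it. The choice of SHA-256 and the function names are assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Group files under `folder` by content hash; return groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            groups[file_hash(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Note that the file name plays no part here. Two files with different names but identical contents land in the same group, which is exactly what you want for an attachment that was saved under several names.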

2. It’ll deNIST your case to find system files

System files have ‘hash’ values, too. And organizations like the National Institute of Standards and Technology (NIST) collect and publish hash values for known system and software files (NIST does this through its National Software Reference Library). So, your eDiscovery application will compare the hashes of your case files to those on the NIST list. And it will flag the ones that match. (The process is called deNISTing.)
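In code, deNISTing boils down to a set lookup. The Python sketch below assumes you already have the reference hashes loaded into a set, and it uses SHA-1 purely as an example (use whichever algorithm your reference list provides). The file names and contents in the usage example are invented.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the upper-case SHA-1 hex digest of some file contents."""
    return hashlib.sha1(data).hexdigest().upper()


def denist(files: dict, known_hashes: set):
    """Split files into (flagged, kept) by checking each hash against the list.

    `files` maps a file name to its raw bytes. `known_hashes` is a set of hex
    digests from a reference list (assumed to be loaded already).
    """
    normalized = {value.upper() for value in known_hashes}
    flagged, kept = [], []
    for name, data in files.items():
        if sha1_hex(data) in normalized:
            flagged.append(name)   # matches a known system file
        else:
            kept.append(name)      # unknown, so it stays in the case
    return flagged, kept


# Usage with made-up data: one "system" file and one custodian document.
sample_files = {"driver.dll": b"fake driver bytes", "memo.txt": b"policy change"}
reference_list = {sha1_hex(b"fake driver bytes").lower()}
flagged, kept = denist(sample_files, reference_list)
```

A set keeps each lookup fast even when the reference list holds millions of entries.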

3. It’ll track and pull up extracted container files

How do you track whether a container file’s data has been extracted or not? Well, you won’t need to. The best eDiscovery applications keep track of these extractions and pull up ‘deletable’ container files automatically. For example, eDiscovery software GoldFynch has a ‘Reclaim Case Space’ option that will bring up a list of the different types of containers in your case. You’ll get a description for each type (ZIP, RAR, PST, etc.) and how much space it takes up. Choose the ones that need to go and click the ‘Cleanup’ button.

Note: Make sure your eDiscovery software keeps extracted files linked to their ‘file families.’

Many files in your case will be associated with each other. And these are called ‘file families’, where the primary ‘parent’ file has a bunch of ‘child’ files. For example, an email and its attachments are a file family. And so is a PowerPoint document and its embedded video clips. File families with container files can complicate eDiscovery unless your software is designed to keep them linked. For example: Imagine that your client asks a colleague to email her the details of a new policy change. The colleague sends the email with the subject ‘The document you requested’, body text saying ‘Here you go’, and an attached RAR file named ‘December policy change’. When extracting the RAR file, you want its contents to stay associated with the ‘parent’ email even after the RAR container has been deleted. Otherwise a search for ‘policy change’ will catch the RAR file but not the extracted contents or the email they came from. Not all eDiscovery applications can handle this, so make sure yours can.
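To illustrate the idea (and only the idea, since this is not how GoldFynch or any other product stores its data), here is a small Python sketch of a case where each item points to its parent. The `Item` and `Case` classes and the file name `policy.docx` are hypothetical. When a container is deleted, its children are re-linked to the container's own parent, so the family survives.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Item:
    item_id: int
    name: str
    text: str = ""
    parent_id: Optional[int] = None
    is_container: bool = False


class Case:
    def __init__(self):
        self.items: Dict[int, Item] = {}

    def add(self, item: Item) -> Item:
        self.items[item.item_id] = item
        return item

    def delete_container(self, container_id: int) -> None:
        """Remove a container, re-linking its children to the container's parent."""
        if not self.items[container_id].is_container:
            raise ValueError("only container files can be cleaned up this way")
        container = self.items.pop(container_id)
        for item in self.items.values():
            if item.parent_id == container_id:
                item.parent_id = container.parent_id

    def family_root(self, item_id: int) -> Item:
        """Walk up the parent links to the top-level 'parent' file."""
        item = self.items[item_id]
        while item.parent_id is not None:
            item = self.items[item.parent_id]
        return item

    def search(self, phrase: str):
        """Return (hit name, family root name) pairs for items matching phrase."""
        phrase = phrase.lower()
        return [
            (item.name, self.family_root(item.item_id).name)
            for item in self.items.values()
            if phrase in item.name.lower() or phrase in item.text.lower()
        ]


# Usage: the email, its RAR attachment, and the document inside the RAR.
case = Case()
case.add(Item(1, "The document you requested", "Here you go"))
case.add(Item(2, "December policy change.rar", parent_id=1, is_container=True))
case.add(Item(3, "policy.docx", "Details of the policy change", parent_id=2))
case.delete_container(2)
hits = case.search("policy change")
```

After the cleanup, the search still finds the extracted document and can still trace it back to the email it arrived in.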

Looking for eDiscovery software that can help you reclaim case space? Try GoldFynch.

It’s an eDiscovery service that’s perfect for smaller, in-house cases.

  • It costs just $27 a month for a 3 GB case. That’s significantly less than most comparable software. With GoldFynch, you know exactly what you’re paying for. Its pricing is simple and readily available on the website.
  • It’s easy to budget for. GoldFynch charges only for storage (processing is free). So, choose from a range of plans (3 GB to 150+ GB) and know up front how much you’ll be paying. It takes just a few clicks to move from one plan to another, and billing is prorated – so you’ll pay only for the time you spend on any given plan. With legacy software, pricing is much less predictable.
  • It takes just minutes to get going. GoldFynch runs in the Cloud, so you use it through your web browser (Google Chrome recommended). No installation. No sales calls or emails. Plus, you get a free trial case (0.5 GB of data and processing cap of 1 GB), without adding a credit card.
  • It’s simple to use. Many eDiscovery applications take hours to master. GoldFynch takes minutes. It handles a lot of complex processing in the background, but what you see is minimal and intuitive. Just drag-and-drop your files into GoldFynch and you’re good to go. Plus, it’s designed, developed, and run by the same team. So you get prompt and reliable tech support.
  • It keeps you flexible. To build a defensible case, you need to be able to add and delete files freely. Many applications charge to process each file you upload, so you’ll be reluctant to let your case organically shrink and grow. And this stifles you. With GoldFynch, you get unlimited processing for free. So, on a 3 GB plan, you could add and delete 5 GB of data at no extra cost – as long as there’s only 3 GB in your case at any point. And if you do cross 3 GB, your plan upgrades automatically and you’ll be charged for only the time spent on each plan. That’s the beauty of prorated pricing.
  • Access it from anywhere. And 24/7. All your files are backed up and secure in the Cloud.
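To see how the prorated billing mentioned above might work out, here is a small Python sketch. The $27 figure comes from the 3 GB plan. The $50 price for a larger plan, the 30-day month, and the day-by-day proration method are all assumptions for illustration, so check the vendor's pricing page for the real numbers.

```python
from decimal import Decimal, ROUND_HALF_UP


def prorated_charge(segments, days_in_month=30):
    """Sum the charge for each (monthly_price, days_on_plan) segment.

    Assumes simple daily proration: price * days_on_plan / days_in_month.
    """
    total = Decimal("0")
    for monthly_price, days_on_plan in segments:
        total += Decimal(str(monthly_price)) * days_on_plan / days_in_month
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# 10 days on the $27 plan, then 20 days on a hypothetical $50 plan.
month_total = prorated_charge([(27, 10), (50, 20)])
```

Under these assumptions, the first ten days cost $9.00 and the remaining twenty cost $33.33, for a total of $42.33.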