De-Duplication

GoldFynch's de-duplication function helps identify whether there are multiple copies of the same file present in a case, and lets you flag such files with a special "DUPE" tag.

You can use different Strategies (modes) to compare the files, and you can also select the Scope of the comparison to either your entire case or specific folders. Once the selections are made, an initial evaluation is executed and detailed statistics are displayed based on your Strategy and Scope.

The system then lets you run the de-dupe process to add the special "DUPE" system tag to all duplicate files.

De-Duplication Scope

When a de-duplication operation is run, all duplicate documents are collected into groups. Within each group, one or more documents are designated as primary candidates, and all the others are treated as duplicates.
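The grouping described above can be illustrated with a minimal Python sketch. It assumes a content-hash comparison (SHA-256 over the file bytes) and picks the alphabetically first path in each group as a single primary. Both choices, and the function name `group_duplicates`, are assumptions made for illustration; they are not GoldFynch's actual implementation, which may designate more than one primary candidate per group.

```python
import hashlib
from collections import defaultdict


def group_duplicates(files):
    """Group files by content hash and split each group into primary and dupes.

    `files` maps a file path to its raw bytes. Hypothetical helper for
    illustration only; the hashing and primary-selection rules are assumptions.
    """
    by_hash = defaultdict(list)
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        by_hash[digest].append(path)

    groups = []
    for paths in by_hash.values():
        if len(paths) < 2:
            continue  # a file with no identical copy is not part of any group
        paths.sort()
        groups.append({"primary": paths[0], "dupes": paths[1:]})
    # Sort groups by primary path so the output order is deterministic.
    groups.sort(key=lambda g: g["primary"])
    return groups


files = {
    "inbox/contract.pdf": b"contract v1",
    "archive/contract_copy.pdf": b"contract v1",
    "inbox/notes.txt": b"meeting notes",
}
for group in group_duplicates(files):
    print(group["primary"], "->", group["dupes"])
```

In this example the two contract files form one group, and `inbox/notes.txt` is left out because it has no copy.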

The de-duplication process can be run on a specific group of files, as shown below:

Transfer of tags

If any of the dupe files have tags, the system will attempt to transfer those tags, along with any attached tag notes, from the dupe items to the primary. If the primary item already has a given tag, only the note is appended to it. If multiple dupes have the same tag with different notes, all of the notes are appended and applied to the primary item.
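The transfer rules above can be sketched in Python as follows. The data model (a mapping from tag name to a list of note strings) and the function name `transfer_tags` are hypothetical and chosen only for illustration. The sketch also assumes that identical notes are kept as-is instead of being collapsed, which the description above does not specify.

```python
def transfer_tags(primary_tags, dupe_tags_list):
    """Merge tags and notes from dupes into a copy of the primary's tags.

    Each tags mapping goes from tag name to a list of note strings.
    Hypothetical data model, used only to illustrate the rules above.
    """
    merged = {tag: list(notes) for tag, notes in primary_tags.items()}
    for dupe_tags in dupe_tags_list:
        for tag, notes in dupe_tags.items():
            # setdefault adds the tag if the primary lacks it; otherwise
            # only the notes are appended to the existing tag.
            merged.setdefault(tag, []).extend(notes)
    return merged


primary = {"privileged": ["flagged by reviewer A"]}
dupes = [
    {"privileged": ["see email thread"], "hot": ["key exhibit"]},
    {"hot": ["cited in deposition"]},
]
print(transfer_tags(primary, dupes))
```

Here the primary keeps its existing "privileged" tag and gains the dupe's note, while the "hot" tag is new to the primary and collects the notes from both dupes.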

De-Duplication Strategy

The method used to compare the files and identify the duplicates is known as the de-duplication strategy. The different strategies available on GoldFynch are:

Running De-Duplication

Step 1. Navigate to the De-Dupe view by clicking on the button in the left pane

De-dupe View

Step 2. Click on the +New De-dupe Session button

Step 3. Enter a name for the de-dupe

Enter a name and click on create

Step 4. Click on the Create button

Step 5. Select the De-Dupe Scope

Select de-dupe scope and strategy

Note:

Select the folder for de-dupe

Step 6. Select the De-Dupe Strategy from the drop-down list

Step 7. Click on the Save and Evaluate button. Once the evaluation is complete, a report on the specified datasets is displayed, including information about the duplicates present in them.

De-duplication statistics

Note:

Step 8. Click on the Apply.. button to run the final de-dupe process.

De-duplication statistics

Once the de-duplication process is complete, you will see a confirmation message at the top of your screen with the Scope and Strategy used.

Completed Deduplication

If a more recent de-dupe operation has been performed, the de-dupe session will indicate this instead.

Completed Old Deduplication

Warning Notifications

The system scans for conflicts within the selected file set(s) that may affect the de-dupe process and will display warnings in the following scenarios:

Also note that the total number of items that are tagged may be more than the total number of dupes found during the evaluation. This is because the tag is also applied to all attachments of dupe items, and attachments are not considered or counted during the de-dupe evaluation process.
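As a small worked example of why the two counts can differ, the Python sketch below counts the dupes found and the items that would end up tagged once attachments are included. The item structure and the function name `count_tagged_items` are hypothetical and exist only to illustrate the arithmetic.

```python
def count_tagged_items(dupes):
    """Return (dupes_found, items_tagged) for a list of dupe items.

    Each dupe is a dict with an "attachments" list. Hypothetical structure
    used only to illustrate the counting described above.
    """
    dupes_found = len(dupes)
    attachments = sum(len(d["attachments"]) for d in dupes)
    # Attachments receive the tag but are not counted as dupes.
    return dupes_found, dupes_found + attachments


dupes = [
    {"name": "email_1.msg", "attachments": ["invoice.pdf", "logo.png"]},
    {"name": "memo.docx", "attachments": []},
]
found, tagged = count_tagged_items(dupes)
print(f"dupes found: {found}, items tagged: {tagged}")
```

With two dupes, one of which carries two attachments, the evaluation would report 2 dupes while 4 items receive the tag.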

Save and Re-evaluate a De-Duplication Session

You can re-evaluate a de-duplication session as long as it has not been applied. The steps to do so are given below:

Step 1. Navigate to the De-Dupe view

Step 2. Click on the folder icon against the de-dupe session you want to re-evaluate

Click on the folder icon to view session

Step 3. Make any changes that you wish and click on the Save and Re-evaluate button

Save and Re-evaluate a dedupe session

Note:

Delete a De-Duplication Session

Deleting a session removes only the de-duplication session itself and does not delete any files. This lets you clear out unused sessions or older sessions that were run in the past.

Step 1. Navigate to the De-Dupe view

Step 2. Click on the delete icon against the de-dupe session you want to delete

Click on the delete icon to delete the dedupe session

Step 3. Click on the Delete button on the Delete De-Dupe Session screen overlay

Confirm Deletion

Reset case-wide Dupe files

Step 1. Navigate to the De-Dupe view

Step 2. Click on the Reset case-wide Dupe files button

Reset case-wide Dupes

Step 3. Click on the Yes, reset DUPEs button on the confirmation screen overlay

Confirmation for reset of Dupes

Once the process has completed, a message confirming that the case-wide dupe reset was successful is displayed on the screen.

Successful reset

Generate a report of the duplicates in your case

A report of the duplicate files in your case can be generated using the de-duplication function. To do so, first create a new de-dupe session, then:

Click here to know more about generating other reports in GoldFynch

Components of a duplicate file report

The components of the duplicate file report are:

If the files are emails, the following fields are populated with the available metadata:

Note: If the source does not have the appropriate metadata, these fields will be blank even if the files are emails.

Example of using the de-duplication tag with other GoldFynch systems: Omitting dupes in advanced searches

The de-duplication system can be used for much more than just deleting and tagging duplicate files in your case. Once the DUPE system tag has been applied to duplicate files, you can perform operations on them using GoldFynch's other systems like its advanced search engine, review sets, productions, reports, and more.

After running the de-dupe process and tagging duplicate files in your case, you can omit such dupes from the search results of any search you run. To do so, add the following section to your search query while constructing an advanced search:

If you are creating a query in the Advanced Search view then:

  1. After creating your query, add a condition using the AND operator to the outermost level of your search query
  2. Set the parameter to “system-tags,” the connector to “is-not,” and the value to “dupe”

Omit dupes in searches in the advanced search view

If you are typing your query into the search bar then add the following to the query:

AND system-tags != DUPE
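If you build search queries programmatically before pasting them into the search bar, the same condition can be appended with a small helper. The Python sketch below is illustrative only: the function name `exclude_dupes` is hypothetical, and wrapping the original query in parentheses to keep the new condition at the outermost level is an assumption about how the query syntax handles grouping.

```python
DUPE_FILTER = "system-tags != DUPE"


def exclude_dupes(query):
    """Append the dupe-exclusion condition at the outermost level of a query.

    Hypothetical helper; the parentheses around the original query are an
    assumption about grouping in the search syntax.
    """
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    return f"({query}) AND {DUPE_FILTER}"


print(exclude_dupes("contract OR agreement"))
```

Grouping the original query first keeps the AND condition from binding to only the last term of an OR expression.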