Importing productions - guidelines


When receiving data from other parties, there is sometimes an opportunity to specify or negotiate the production format. In these instances, we recommend requesting data in as near-to-native a format as possible, to preserve as much of the original file metadata and forensic information as possible. This document outlines the ideal incoming production format in detail (including load file fields where necessary) and may be sent to other parties as a production format specification.

General file naming and numbering

Each produced file should be assigned a single Bates number and named with that number (including any prefix). Attachments and child files should remain embedded in their parent file rather than being produced separately.

NOTE: This does not apply to redacted files; see below for details.

For redacted and other non-native files

When it is not possible or practical to produce a file in its native format, as is common with redactions, the file should be rasterized or rendered and produced as a single PDF file with a searchable text layer.

Example: Consider an MSG email with a single ZIP attachment. Ideally, the MSG file is assigned a single Bates number, e.g. FILE_0001, and renamed to FILE_0001.msg, and the ZIP attachment is not produced separately. If the MSG file needs redactions, it is rendered to a PDF file named FILE_0001.pdf, and the ZIP attachment is then assigned Bates number FILE_0002 and has its “PARENT_ID” column set to “FILE_0001”.
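The numbering rule in the example can be sketched in a few lines of Python. This is only an illustration: the function name family_rows, its parameters, and the four-digit zero padding (inferred from FILE_0001) are assumptions, not part of the specification.

```python
def family_rows(prefix, start, parent_ext, attachment_exts, redacted):
    """Return (file_name, doc_id, parent_id) tuples for one parent and its attachments."""
    def bates(n):
        # Four-digit zero padding is an assumption based on the FILE_0001 example.
        return f"{prefix}_{n:04d}"

    parent_id = bates(start)
    if not redacted:
        # Native parent: attachments stay embedded, so only one file is produced.
        return [(parent_id + parent_ext, parent_id, None)]

    # Redacted parent: rendered to PDF, and each attachment gets its own Bates number.
    rows = [(parent_id + ".pdf", parent_id, None)]
    for offset, ext in enumerate(attachment_exts, start=1):
        doc_id = bates(start + offset)
        rows.append((doc_id + ext, doc_id, parent_id))
    return rows


native = family_rows("FILE", 1, ".msg", [".zip"], redacted=False)
redacted = family_rows("FILE", 1, ".msg", [".zip"], redacted=True)
```

Here native holds a single entry for FILE_0001.msg, while redacted holds FILE_0001.pdf followed by FILE_0002.zip with FILE_0001 as its parent.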

For emails

The first preference for email productions is bulk export/archive files like PST or OST for Outlook/Exchange systems, and MBOX for Gmail or other mail services.

If bulk email production in PST or MBOX format is not possible, the next preference is for near-native individual MSG or EML (MIME) files.

For other electronic documents and files

Other digital documents and files should be produced in native format, as found on the originating filesystem where possible, especially any files that represent:

Where possible, electronic files should populate:

Additionally, when available, files originating from Apple operating systems should populate:

For paper documents converted into electronic documents

Paper documents should be:

PDFs generated from scanned documents should:

Load files and additional production formatting

The production should:

  1. consist of native files and generated PDF files
  2. have each file named according to its assigned Bates number
  3. be placed in a folder named “NATIVES”, which may contain numbered subdirectories
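A receiving party might check these layout rules with something like the following Python sketch. The function name check_native_path and the wording of the messages are hypothetical; the checks themselves mirror the requirements above and the NATIVE_PATH description in the load file fields.

```python
from pathlib import PurePosixPath


def check_native_path(doc_id, native_path):
    """Return a list of problems found in one NATIVE_PATH value (empty if none)."""
    problems = []
    path = PurePosixPath(native_path)
    if path.is_absolute():
        problems.append("path should be relative to the top folder of the production")
    if not path.parts or path.parts[0] != "NATIVES":
        problems.append("file should sit under the NATIVES folder")
    # Compares the file name without its final extension to the Bates number.
    if path.stem != doc_id:
        problems.append("file name should match the Bates number")
    return problems


ok = check_native_path("FILE_0001", "NATIVES/0001/FILE_0001.msg")
misnamed = check_native_path("FILE_0001", "NATIVES/FILE_0002.pdf")
```

Note that the sketch compares only the name before the final extension, so a file with a compound extension would need extra handling.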

The load file itself should:

NOTE: A JSON load file should be structured as an array containing one JSON object (key-value map) per produced file.
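As an illustration of the JSON structure, the following Python sketch builds a two-record load file for a redacted email and its attachment. Several details are assumptions rather than requirements of this document: the value "F" for a non-native file, the use of null for an empty PARENT_ID, and the omission of fields that do not apply.

```python
import json

# One object per produced file, collected in a single top-level array.
records = [
    {
        "DOC_ID": "FILE_0001",
        "PARENT_ID": None,       # assumption: null when there is no parent
        "NATIVE_PATH": "NATIVES/0001/FILE_0001.pdf",
        "TRUE_NATIVE": "F",      # assumption: "F" marks a derived PDF
        "ORIG_EXT": ".msg",
    },
    {
        "DOC_ID": "FILE_0002",
        "PARENT_ID": "FILE_0001",
        "NATIVE_PATH": "NATIVES/0001/FILE_0002.zip",
        "TRUE_NATIVE": "T",
    },
]

load_file_text = json.dumps(records, indent=2)

# Reading it back yields a list with one dictionary per produced file.
parsed = json.loads(load_file_text)
```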

Load file fields

Refer to the following table for load file fields, descriptions and examples:

Column Name | Description | Example
DOC_ID | The Bates number (with prefix) of the file. | FILE_0001
PARENT_ID | The Bates number of the parent file, in the case that files of a family are produced individually due to redactions. | FILE_0001
NATIVE_PATH | The path to the native file, or to the derived/rendered PDF file. It should be relative to the top folder of the production. | NATIVES/0001/FILE_0001.msg
TRUE_NATIVE | Indicates whether the file is truly a native file or is a derived PDF. | T
CUSTODIAN | Description of who or where the file originated from. | John Doe
MAILBOX_FOLDER | Mailbox folder for individual email files. | Inbox/Invoices/2018
OS | Name of the operating system where the file originated. | Windows 10
FILESYSTEM | Name of the disk filesystem where the file originated. | NTFS
FS_CREATED | Created date & time from the original filesystem. | 2017-02-22T16:24:36Z
FS_MODIFIED | Modified date & time from the original filesystem. | 2017-02-22T16:24:36Z
FS_ACCESSED | Accessed date & time from the original filesystem. | 2017-02-22T16:24:36Z
APPLE_WHEREFROM | “com.apple.metadata:kMDItemWhereFroms” field populated from Apple Finder metadata. | https://dl-web.dropbox.com/get/file.pdf, https://www.dropbox.com/
APPLE_QUARANTINE | “com.apple.quarantine” field populated from Apple Finder metadata. | 0001;55555555;Google Chrome;
ORIG_EXT | For redacted files, the extension of the original, native file. | .msg
ORIG_TYPE | For redacted files, the MIME filetype of the original, native file. | application/vnd.ms-outlook
CREATED | For redacted files, the created date from the internal metadata of the original, native file. | 2017-02-22T16:24:36Z
MODIFIED | For redacted files, the modified date from the internal metadata of the original, native file. | 2017-02-22T16:24:36Z
AUTHOR | For redacted files, the author from the internal metadata of the original, native file. | Jane Doe
SUBJECT | For redacted emails, the subject from the original, native file. | Fwd: Some Subject
FROM | For redacted emails, the “from” field from the original, native file. | Jane Doe jane@doe.com
TO | For redacted emails, the “to” field from the original, native file. | John Doe john@doe.com; Jane Doe jane@doe.com
CC | For redacted emails, the “cc” field from the original, native file. | John Doe john@doe.com; Jane Doe jane@doe.com
BCC | For redacted emails, the “bcc” field from the original, native file. | John Doe john@doe.com; Jane Doe jane@doe.com
SENT | For redacted emails, the “date” header or PidTagClientSubmitTime from the original, native file. | 2017-02-22T16:24:36Z
RECEIVED | For redacted emails, the latest “received-by” date or PidTagMessageDeliveryTime from the original, native file. | 2017-02-22T16:24:36Z
MESSAGE-ID | For redacted emails, the “message-id” header or PidTagInternetMessageId from the original, native file. | email1@sample.com
REFERENCES | For redacted emails, the “references” header or PidTagInternetReferences from the original, native file. | email1@sample.com, email2@sample.com
HEADERS | For redacted emails, the entire header section or PidTagTransportMessageHeaders from the original, native file. | Received-By: … From: …
MSG_CLASS | For redacted MSG files, the PidTagMessageClass from the original, native file. | IPM.Note
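For a redacted EML file, some of the email fields could be pulled from the original message before it is rendered to PDF. The Python sketch below uses only the standard library; the function name email_fields and the sample message are hypothetical, and it assumes the message has a “date” header with an explicit UTC offset. The SENT value is written in the UTC timestamp form used by the examples in this document.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def email_fields(raw_eml):
    """Extract a few load file fields from the text of an EML (MIME) message."""
    msg = message_from_string(raw_eml)
    # Assumes a Date header with an explicit offset; convert it to UTC.
    sent = parsedate_to_datetime(msg["Date"]).astimezone(timezone.utc)
    return {
        "SUBJECT": msg["Subject"],
        "FROM": msg["From"],
        "TO": msg["To"],
        "SENT": sent.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "MESSAGE-ID": msg["Message-ID"],
    }


# Hypothetical sample message, for illustration only.
sample = (
    "From: Jane Doe <jane@doe.com>\n"
    "To: John Doe <john@doe.com>\n"
    "Subject: Fwd: Some Subject\n"
    "Date: Wed, 22 Feb 2017 11:24:36 -0500\n"
    "Message-ID: <email1@sample.com>\n"
    "\n"
    "Body text.\n"
)

fields = email_fields(sample)
```

MSG files store the equivalent values in MAPI properties such as PidTagClientSubmitTime, which the standard library does not read, so they would need a separate tool.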