User guide

Chapter 5, Creating Clients, Projects, Custodians, and Jobs
5-24 Ipro eCapture User Guide www.iprotech.com
Q1 2014 877-324-4776
In most cases, MD5 hash values are calculated on the file itself. For
more reliable de-duplication of emails, however, the hash must be
calculated on the information contained in the email rather than on the
file. There are several reasons for this; the simplest is that when an
email is saved out of its container (PST, NSF, etc.), the resulting file
contains information that changes the hash value of the same email each
time it is saved out.
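The effect can be illustrated with a short Python sketch. It is only an illustration: the byte strings and the X-Export-Time line are invented stand-ins for whatever data a save-out adds, and they do not represent the actual file format written by Ipro eCapture or any mail client.

```python
import hashlib

# Hypothetical content of one email (invented for this example).
message = (
    b"Subject: Q3 forecast\r\n"
    b"From: alice@example.com\r\n"
    b"\r\n"
    b"Numbers attached."
)

# The same email saved out twice. "X-Export-Time" is an invented stand-in
# for export-specific data that differs between saves.
export_1 = b"X-Export-Time: 2014-01-05T10:00:00\r\n" + message
export_2 = b"X-Export-Time: 2014-01-05T10:00:07\r\n" + message

# Hashing the files themselves: the digests differ although the email is the same.
file_hash_1 = hashlib.md5(export_1).hexdigest()
file_hash_2 = hashlib.md5(export_2).hexdigest()

# Hashing only the email's own content: the digests match.
content_hash_1 = hashlib.md5(message).hexdigest()
content_hash_2 = hashlib.md5(message).hexdigest()

print(file_hash_1 == file_hash_2)        # False
print(content_hash_1 == content_hash_2)  # True
```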
When an email is discovered within Ipro eCapture, it is assigned a hash
value based on fields chosen by the user. The values of these fields are
concatenated and the text is hashed. Select from the following email
fields to generate the hash value:
•Subject
•From/Author
•Attachment Count
•Body: From the Body Whitespace drop-down list, select
either Include (default) or Remove. Whitespace in the e-mail
body can cause slight differences between otherwise identical
e-mails, which can result in different hashes being generated.
Remove: removes all whitespace between lines of text in the
e-mail body prior to hashing. Include: keeps the whitespace.
•E-mail Date: The following message types use the specified
date values: Outlook: Sent Date; Lotus Notes: Posted Date;
RFC822: Date; and GroupWise: Delivered Date. See the section
How Ipro eCapture Handles Dates - Time Zones on page 5-20
for additional information.
•Attachment Names
•Recipients
•CC
•BCC
Select either Creation Date or Last Modification Date. The
selected value is used when calculating the MD5 hash if the
normal E-mail Date value is not present. This commonly occurs
for draft messages that have not been sent.
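The field-based hashing described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Ipro eCapture's implementation: the function name email_hash, the dictionary keys, the separator character, and the reading of Remove as "drop every whitespace character" are all hypothetical choices made for the example. The guide does not specify how field values are concatenated or encoded.

```python
import hashlib

# Assumption: values are joined with a separator so adjacent fields cannot
# run together. The guide does not say how the values are concatenated.
SEPARATOR = "\x1f"


def email_hash(email, selected_fields, body_whitespace="Include",
               alternate_date_field="creation_date"):
    """Return an MD5 hex digest over the selected fields of one email.

    `email` is a plain dict with hypothetical keys such as "subject",
    "from", "body", "email_date", "creation_date".
    """
    parts = []
    for name in selected_fields:
        value = email.get(name)
        if name == "body" and body_whitespace == "Remove":
            # One interpretation of Remove: drop every whitespace character.
            value = "".join((value or "").split())
        elif name == "email_date" and not value:
            # No E-mail Date (e.g. an unsent draft): use the alternate date.
            value = email.get(alternate_date_field)
        parts.append("" if value is None else str(value))
    text = SEPARATOR.join(parts)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two copies of one message whose bodies differ only in whitespace.
fields = ["subject", "from", "body", "email_date"]
copy_a = {
    "subject": "Q3 forecast",
    "from": "alice@example.com",
    "body": "Numbers attached.\n\nRegards,\nAlice",
    "email_date": "2014-01-05T10:00:00Z",
}
copy_b = dict(copy_a, body="Numbers attached.\nRegards,\n  Alice")

print(email_hash(copy_a, fields) == email_hash(copy_b, fields))  # False
print(email_hash(copy_a, fields, "Remove")
      == email_hash(copy_b, fields, "Remove"))                   # True
```

With Include, the two copies hash differently and would not be treated as duplicates; with Remove, they produce the same hash. A draft with no "email_date" value is hashed with its "creation_date" (or "last_modification_date", if that is passed as alternate_date_field) in place of the missing date.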