User guide

www.iprotech.com Ipro eCapture User Guide F-3
877-324-4776 Q1 2014
Database: A collection of records. Each record contains a set of fields. Each
field contains a unit of information. For example, a residential telephone
directory (the database) contains a collection of records (the residential
listings) that contain fields (name, address, and phone number of each
resident).
De-duplication Reports: Two types are available - Summary (summary report
of items not processed due to de-duplication settings) and Detailed (list of
errors and status messages encountered during discovery).
De-duplication: A process of identifying and separating identical electronic
documents. In Ipro eCapture, the MD5 hash value of each document is
generated during the discovery phase. When de-duplication is performed, a
look-up for the same MD5 hash is performed across the specified
de-duplication scope (Current Job, Custodian, Project, and Client) for all
previously processed data. If a match is found, the item is marked a duplicate;
if not, it is marked an original. Additional scope options within Ipro eCapture
allow families of documents to be maintained through de-duplication such that
if the top-level parent document is marked a duplicate, the entire family is
marked as duplicates. Alternatively, items within a family can be de-duplicated
individually. Only items selected for processing are eligible for
de-duplication, and only non-filtered (i.e., processed) items are marked as
originals. If two items have matching MD5 hashes, the SHA-1 hash value is
checked as well. If those values still match and the documents are parents, a
family hash is generated by hashing the concatenated MD5 hash values of the
entire family. This allows for a thorough hash comparison for the entire family in
the event of differences between child documents. Bit-by-bit comparisons
between files can also be performed during de-duplication, and matching file
names can also be made a requirement for de-duplication.
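
The look-up described above can be sketched in a few lines of Python. This is an illustration only, not Ipro eCapture's implementation: the function names are hypothetical, the scope is reduced to a single in-memory set, and the algorithm used for the family hash and the order in which member hashes are concatenated are assumptions (the text says only that the concatenated MD5 values are hashed).

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 hash of one item, as a hex string."""
    return hashlib.md5(data).hexdigest()


def sha1_hex(data: bytes) -> str:
    """SHA-1 hash of one item, used as the second check."""
    return hashlib.sha1(data).hexdigest()


def family_hash(members: list[bytes]) -> str:
    """Hash the concatenated MD5 values of a parent and its children.

    Assumption: MD5 is used for the outer hash and members are
    concatenated in the order given (parent first).
    """
    joined = "".join(md5_hex(member) for member in members)
    return hashlib.md5(joined.encode("ascii")).hexdigest()


def dedupe_families(families: list[list[bytes]]) -> list[str]:
    """Label each family 'original' or 'duplicate'.

    Each family is a list of item contents with the parent first.
    A family is a duplicate only if the parent's MD5, the parent's
    SHA-1, and the family hash all match a family seen earlier.
    """
    seen = set()  # stands in for previously processed data in the scope
    labels = []
    for members in families:
        parent = members[0]
        key = (md5_hex(parent), sha1_hex(parent), family_hash(members))
        if key in seen:
            labels.append("duplicate")
        else:
            seen.add(key)
            labels.append("original")
    return labels
```

In this sketch, two e-mails with identical bodies but different attachments are both labeled as originals, because their family hashes differ even though the parent hashes match. Bit-by-bit comparison and file name matching are left out.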
DeletionHistory.LOG file: Located in the path of ...eCapture\Controller, this file
records all activities conducted by the Deletion Agent.
Discovery Job: In Ipro eCapture, a single directory is chosen as the source for
the discovery job, which determines file types. During the discovery
process, MD5 hashes are calculated for files (excluding container files) and
indexing occurs.
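
As a rough illustration of the hashing step, the sketch below walks one directory and computes an MD5 hash for each file it finds. It is not how Ipro eCapture works internally: the function name `discover` is hypothetical, and identifying container files by extension (with the small set shown) is an assumption made only to keep the example short. File type detection and indexing are omitted.

```python
import hashlib
from pathlib import Path

# Assumption: containers are recognized by extension. This set is
# illustrative only and is not taken from the product.
CONTAINER_EXTENSIONS = {".zip", ".pst", ".rar"}


def discover(root):
    """Return {relative path: MD5 hex digest} for non-container files."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() in CONTAINER_EXTENSIONS:
            continue  # container files are not hashed
        digest = hashlib.md5()
        with path.open("rb") as handle:
            # Read in chunks so large files are not loaded into memory.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        results[str(path.relative_to(root))] = digest.hexdigest()
    return results
```

Calling `discover` on a folder returns one entry per hashed file, which is the kind of value a later de-duplication look-up would compare.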
Discovery Reports: Two types are available - Summary (summary of items
discovered by item type and category) and Detailed (list of all items
discovered).