Instruction manual
Sharp MX-3501N
www.BERTL.com
tel. (1) 732-761-2311
fax. (1) 732-761-2312
info@BERTL.com
Copyright © 2006 MCA Internet, LLC dba BERTL 2-May-06
All Rights Reserved. The license under which this document is made available and applicable law prohibit any
reproduction or further transmission of any portion of this document. This document may only be viewed
electronically through the www.BERTL.com Web site and may not be stored in electronic or hard copy format. Any
reproduction of trademarks is strictly prohibited. BERTL accepts no responsibility for any inaccuracies or
omissions contained in this document.
Color at Work®
Scan
Scan Data Capture Accuracy
One of the fastest growing needs for
high speed scanning is the conversion
of legacy hard copy documents into an
electronic format for information sharing,
reduced storage space demands, and
easier search and data retrieval.
A scan alone only converts a page into an
image, which is not very manageable.
For that reason, most companies use
optical character recognition (OCR)
software to convert the images into
editable text, which can then be
searched, changed, incorporated into
new documents, etc., as required.
The OCR engine recognizes individual
images on the page, converting them
into letters, numbers, and other
symbols. The OCR engine then runs
complex analysis on the text in
conjunction with spellcheckers, technical
dictionaries, and other data sources
before offering up its best conversion into
electronic format.
This stage can be very time-consuming,
especially if the quality of the scanned data
is poor, leading to character recognition
errors.
To look into this important workflow issue,
BERTL runs a series of standard test
patterns with multiple font types, sizes, and
colors through the device, capturing the
data at various resolutions using both text
and text/photo settings. Text is the default
setting for most OCR work due to its 2-bit
format which tends to produce the best
text reproduction. However, as more and
more documents incorporate images and
colored elements, text/photo — which
operates in 8 grey shades for better
reproduction of images and colored text
elements — is also being used.
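To illustrate the difference between the two settings, the following minimal sketch reduces a handful of grey values to two levels and to eight levels. It treats text mode as a plain black/white threshold, which is an assumption about how the setting behaves and not a description of the device's actual image processing; the `quantize` function and the sample pixel values are hypothetical.

```python
def quantize(gray, levels):
    """Map an 8-bit grey value (0-255) onto `levels` evenly spaced shades."""
    if not 0 <= gray <= 255:
        raise ValueError("gray must be in the range 0..255")
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)      # distance between neighbouring shades
    index = round(gray / step)     # nearest available shade
    return round(index * step)


# Hypothetical pixel values across the edge of a light-colored character.
edge = [250, 200, 140, 90, 30]

# Assumption: text mode behaves like a two-level (black/white) threshold.
text_mode = [quantize(p, 2) for p in edge]

# Text/photo mode, per the report, works with 8 grey shades.
text_photo_mode = [quantize(p, 8) for p in edge]

print("text      :", text_mode)
print("text/photo:", text_photo_mode)
```

In this made-up example the two-level version keeps only pure black and pure white, while the eight-level version retains intermediate shades, which is why it tends to suit images and colored text better.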
Having scanned each page of its test
originals, BERTL analysts then run the
scanned files through ABBYY FineReader
8.0 in its default configuration. The
scanning accuracy at
the various resolutions and settings is
reflected in the number of manual
confirmations that the OCR application
demands before the OCRed document is
deemed clean and ready to use.
The higher the human intervention rate,
the higher the cost to the company of
carrying out the conversion.
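The link between manual confirmations and cost can be sketched as simple arithmetic. The example below is illustrative only: the function names, the seconds per confirmation, the hourly rate, and the sample counts are all hypothetical assumptions, not BERTL test results.

```python
def intervention_rate(confirmations, characters):
    """Fraction of characters that needed a manual confirmation."""
    if characters <= 0:
        raise ValueError("characters must be positive")
    return confirmations / characters


def correction_cost(confirmations, seconds_per_confirmation, hourly_rate):
    """Labor cost of working through all manual confirmations."""
    return confirmations * seconds_per_confirmation * hourly_rate / 3600


# Hypothetical figures for one scanned page set (not measured data).
confirmations = 120
characters = 6000

rate = intervention_rate(confirmations, characters)
cost = correction_cost(confirmations, seconds_per_confirmation=6, hourly_rate=30)

print(f"intervention rate: {rate:.1%}")
print(f"correction cost:   {cost:.2f}")
```

Comparing these two figures across resolutions and scan settings would show which combination needs the least operator time for a given document type.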
As expected, the greatest difficulty in character
recognition was found on the smallest, 4-point
text sections of the test documents.
While the majority of documents may
standardize on 8- or 10-point text, on which
the device fared very well, 4-point text still
has to be processed in diagram labels, terms
and conditions on contracts, etc.
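One way to see why small text is harder to recognize is to estimate how many pixels a given point size spans at each scan resolution. The sketch below assumes the desktop-publishing point (1 point = 1/72 inch); the function name is hypothetical, and the point size covers the full type body, so the visible letters are smaller still.

```python
POINTS_PER_INCH = 72  # desktop-publishing point: 1 pt = 1/72 inch


def pixels_for_point_size(point_size, dpi):
    """Approximate height in pixels of a type body of `point_size` at `dpi`."""
    return point_size / POINTS_PER_INCH * dpi


# Resolutions and point sizes mentioned in this section of the report.
for dpi in (200, 300, 600):
    for point_size in (4, 6, 8, 10):
        height = pixels_for_point_size(point_size, dpi)
        print(f"{point_size:>2} pt at {dpi} dpi: about {height:.1f} px")
```

By this estimate, 4-point text at 200 dpi has roughly 11 pixels of body height to work with, against roughly 33 at 600 dpi, so each character is described by far fewer pixels at the lower resolution.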
The choice of OCR application will also
Portion of BERTL OCR test chart scanned at 200 dpi
(top), 300 dpi (middle) and 600 dpi (bottom) in text
format and saved as a PDF file. Image has been zoomed
to 400 percent in Adobe Acrobat before being screen
captured for display. Top line is 4 point, middle is 6 point