User Guide
Form Types
OCR/ICR basics
There are two major types of character recognition: Optical
Character Recognition (OCR) and Intelligent Character
Recognition (ICR). OCR programs recognize characters printed
using a printer, a plotter or a typewriter. ICR programs read
documents filled in by hand in block letters (so-called handprint
recognition). Let us consider the main differences between OCR
programs and ICR programs.
An OCR program first analyses the image and divides it into
zones which include text, tables, illustrations, etc. Next, it divides
these zones into smaller objects: paragraphs, lines, words, and
characters. Once the characters have been recognized by the character
classifiers, the OCR program assembles them back into words,
lines, paragraphs, etc., until it gets an electronic version of the
original paper document.
ICR programs, which are mainly used to process hand-filled
forms, work differently. First, an ICR program detects zones that are
expected to contain meaningful data entered by the user. These
zones are then processed by the program's modules, including the
character classifiers. ICR programs do not attempt to recreate the
original document. Instead, they extract information from
particular fields and save it into a database.
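
The field-by-field approach described above can be illustrated with a
minimal Python sketch. It is not FormReader code: the names FieldZone,
crop and extract_record are hypothetical, the image is assumed to be a
grid of 0 (white) and 1 (black) pixels, and the character classifier is
passed in as an ordinary function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[int]]  # assumed representation: 1 = black pixel, 0 = white


@dataclass
class FieldZone:
    """Hypothetical template entry: a named rectangle on the form."""
    name: str
    top: int
    left: int
    height: int
    width: int


def crop(image: Image, zone: FieldZone) -> Image:
    """Cut the rectangle described by the zone out of the page image."""
    rows = image[zone.top:zone.top + zone.height]
    return [row[zone.left:zone.left + zone.width] for row in rows]


def extract_record(
    image: Image,
    template: List[FieldZone],
    recognize: Callable[[Image], str],
) -> Dict[str, str]:
    """Recognize only the template zones and return one record per form.

    Nothing outside the zones is read, and no attempt is made to
    rebuild the page layout.
    """
    return {zone.name: recognize(crop(image, zone)) for zone in template}


# Usage with a stand-in recognizer that just counts black pixels.
page = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
template = [FieldZone("a", 0, 0, 2, 2), FieldZone("b", 2, 2, 1, 2)]
record = extract_record(page, template, lambda z: str(sum(map(sum, z))))
```

The resulting record is a plain dictionary keyed by field name, which is
the kind of row that could then be saved into a database.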
An important feature of an ICR program is mark sense
recognition, or recognition of marks in check boxes. Check boxes are
widely used on all sorts of forms, because they make forms easier
to complete and can increase the reliability of output data up to
99.9%. ABBYY FormReader 6.0 can recognize all sorts of marks.
Mark sense recognition is usually referred to as OMR (Optical Mark
Recognition) and works as follows: when creating a template, the
operator marks the check box zones where the program has to
look for marks; the program then analyses these zones on
completed forms and calculates the black/white ratio in these areas. If
the portion of black colour in a check box exceeds a certain
threshold, FormReader will consider the check box selected. FormReader
can even recognize corrected marks, i.e. boxes ticked by mistake
and then inked over.
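
The threshold test described above can be sketched in a few lines of
Python. This is an illustration only, not FormReader's implementation:
the function names are hypothetical, the check box zone is assumed to be
a grid of 0 (white) and 1 (black) pixels, and the 0.2 threshold is an
arbitrary value chosen for the example.

```python
from typing import List

Zone = List[List[int]]  # assumed representation: 1 = black pixel, 0 = white


def black_ratio(zone: Zone) -> float:
    """Return the share of black pixels in a check box zone (0.0 to 1.0)."""
    total = sum(len(row) for row in zone)
    if total == 0:
        raise ValueError("check box zone contains no pixels")
    black = sum(sum(row) for row in zone)
    return black / total


def is_checked(zone: Zone, threshold: float = 0.2) -> bool:
    """Treat the box as selected when the black share exceeds the threshold.

    The default threshold is an assumption made for this sketch.
    """
    return black_ratio(zone) > threshold


# Usage: an empty box and a box with a cross drawn corner to corner.
empty_box = [[0] * 4 for _ in range(4)]
crossed_box = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
```

In practice the threshold would have to be tuned to the scan quality and
to whether the box border falls inside the zone.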
ABBYY FormReader 6.0 will reliably recognize not only
conventional ticks/checks and crosses, but also completely inked-over
check boxes if the latter are rectangular in shape or have no borders.
This feature of ABBYY FormReader has a very important
practical application. Suppose someone filling in a form makes a
mistake and ticks the wrong box. Instead of taking a new blank form
and filling it in from scratch, they can just blot out the mark in the
check box selected by mistake and put a new mark in the right
check box. FormReader will treat the inked-over check box as a
mistake and consider it to be unchecked. This method may also be
used when recognizing text fields.
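
The guide does not say how FormReader tells an inked-over box from an
ordinary mark. One plausible way, shown here purely as an assumption, is
a second, higher threshold on the same black ratio: a lightly filled box
counts as marked, while an almost solid box counts as a correction and
is treated as unchecked. All names and both threshold values in this
Python sketch are hypothetical.

```python
from enum import Enum
from typing import Dict, List

Zone = List[List[int]]  # assumed representation: 1 = black pixel, 0 = white


class BoxState(Enum):
    EMPTY = "empty"
    MARKED = "marked"
    INKED_OVER = "inked_over"


def black_ratio(zone: Zone) -> float:
    """Return the share of black pixels in a check box zone (0.0 to 1.0)."""
    total = sum(len(row) for row in zone)
    if total == 0:
        raise ValueError("check box zone contains no pixels")
    return sum(sum(row) for row in zone) / total


def classify_box(
    zone: Zone,
    mark_threshold: float = 0.2,  # assumed value, for illustration only
    ink_threshold: float = 0.8,   # assumed value, for illustration only
) -> BoxState:
    """Classify a box as empty, marked, or inked over by its black share."""
    ratio = black_ratio(zone)
    if ratio > ink_threshold:
        return BoxState.INKED_OVER
    if ratio > mark_threshold:
        return BoxState.MARKED
    return BoxState.EMPTY


def selected_boxes(boxes: Dict[str, Zone]) -> List[str]:
    """Return the names of boxes that count as checked.

    Inked-over boxes are treated as mistakes and left out.
    """
    return [
        name for name, zone in boxes.items()
        if classify_box(zone) is BoxState.MARKED
    ]


# Usage: "yes" was ticked by mistake and blotted out, "no" has a cross.
form = {
    "yes": [[1] * 4 for _ in range(4)],
    "no": [[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]],
    "maybe": [[0] * 4 for _ in range(4)],
}
answers = selected_boxes(form)
```

Boxes that fall close to either threshold are natural candidates for
manual verification by an operator.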
Verification of inked-over check boxes in ABBYY FormReader Desktop Edition.