14.0
Introduction
What is optical character recognition
Optical character recognition (OCR) is the process of extracting text from
an image. The image can come from scanning a paper document or from
opening an electronic image file. Images do not contain editable text
characters; they consist of many tiny dots (pixels) that together form
character shapes. These shapes present a picture of the text on a page.
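To make the idea of pixels forming character shapes concrete, here is a minimal Python sketch. It is an illustration only and is not part of OmniPage Pro; the 5x5 bitmap, the name LETTER_T, and the helper functions are all hypothetical. It shows that the image holds only dark and light dots, with no stored character "T" anywhere in the data.

```python
# Hypothetical 5x5 bitmap of the letter "T" (illustration only).
# 1 = dark pixel, 0 = light pixel. The image stores only these dots;
# nothing in the data says "this is the character T".
LETTER_T = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def render(bitmap):
    """Return the bitmap as text lines: '#' for dark pixels, '.' for light."""
    return ["".join("#" if px else "." for px in row) for row in bitmap]


def count_dark_pixels(bitmap):
    """Count the dark pixels that make up the character shape."""
    return sum(sum(row) for row in bitmap)


if __name__ == "__main__":
    for line in render(LETTER_T):
        print(line)
    print("dark pixels:", count_dark_pixels(LETTER_T))
```

A person looking at the printed grid sees a "T" at once, but a program sees only a list of numbers. Turning such shapes back into editable characters is the job of OCR.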
During OCR, OmniPage Pro analyzes the character shapes in an image
and determines which characters they represent, producing editable text.
After OCR, you can save the resulting text for use in a variety of
word-processing, desktop publishing or spreadsheet applications.
OmniPage Pro’s OCR capabilities
In addition to text recognition, OmniPage Pro can retain the following
elements of a document through the OCR process.
Graphics
Photos, logos, and drawings are examples of graphics.
Text formatting
Font types, sizes and styles (such as bold, italic and underlined) are
examples of character formatting. Indents, tabs, margins and line spacing
are examples of paragraph formatting.
Page formatting
Column structure, table formats, and placement of graphics and headings
are examples of page formatting.
The graphics, text and page formatting elements that OmniPage Pro
retains are determined by the settings you select. Refer to the Settings
Guidelines in the online Help for more information about selecting
settings.
OmniPage Pro recognizes only machine-generated characters, such as
offset-printed, laser-printed or typewritten text. However, it can retain
handwritten text, such as a signature, as a graphic.