LEGAL NOTICES Copyright © 2002 ScanSoft, Inc. All rights reserved. No part of this publication may be transmitted, transcribed, reproduced, stored in any retrieval system or translated into any language or computer language in any form or by any means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise, without prior written consent from ScanSoft, Inc., 9 Centennial Drive, Peabody, Massachusetts 01960. Printed in the United States of America and in the Netherlands.
Welcome

Welcome to OmniPage Pro®, and thank you for using our software! The following documentation has been provided to help you get started and give you an overview of the program.

This User’s Guide

This guide introduces you to using OmniPage Pro 12. It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information.

Using this Guide

This guide is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows documentation if you have questions about how to use dialog boxes, menu commands, scroll bars, drag and drop functionality, shortcut menus, and so on. We also assume you are familiar with your scanner and its supporting software, and that the scanner is installed and working correctly before it is set up with OmniPage Pro 12.
Getting online Help In addition to using this guide, you can use OmniPage Pro’s online Help to learn about features, settings, and procedures. Online Help is available after you install OmniPage Pro. Online HTML Help Open OmniPage Pro’s online Help at its top level by choosing Help Topics at the top of the Help menu. This allows you to see topics arranged in a Table of Contents, search an alphabetical list of keywords or make full-text searches through the topics.
Tech Notes ScanSoft’s web site at www.scansoft.com contains Tech Notes on commonly reported issues using OmniPage Pro 12. Web pages may also offer assistance on the installation process and troubleshooting. Glossary This guide does not include a glossary. The online Help has a comprehensive glossary, with its own alphabetical index and a table of contents. Please consult it if you want to find the meaning of a term used in this guide or in the program.
Chapter 1 Installation and setup This chapter provides information on installing and starting OmniPage Pro 12.
System requirements

You need the following minimum system requirements to install and run OmniPage Pro 12:

• A computer with a Pentium or higher processor
• Microsoft Windows 98 (from second edition), Windows Me, Windows NT 4.

Installing OmniPage Pro

OmniPage Pro 12’s installation program takes you through installation with instructions on every screen. Before installing OmniPage Pro:

• Close all other applications, especially anti-virus programs.
• Log into your computer with administrator privileges if you are installing on Windows NT, 2000 or XP.
Setting up your scanner with OmniPage Pro All files needed for scanner setup and support are copied automatically during the program’s installation. Before using OmniPage Pro 12 for scanning, your scanner should be installed with its own scanner driver software and tested for correct functionality. Scanner driver software is not included with OmniPage Pro. Scanner installation and setup are done through the Scanner Wizard. You can start this yourself, as described below.
• Click on Scan to begin the sample scan.
• If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• Select the item that most appropriately describes your scanner, then click on Next.
• Click on Next to proceed to page size. The page sizes that the Scanner Wizard believes your scanner to support are listed in the window.

How to start the program

To start OmniPage Pro 12 do one of the following:

• Click Start in the Windows taskbar and choose Programs > ScanSoft OmniPage Pro 12.0 > OmniPage Pro 12.0.
• Double-click the OmniPage Pro icon in the program’s installation folder or on the Windows desktop if you placed it there.
• Double-click an OmniPage Document (OPD) icon or file name; the clicked document is loaded into the program. See “OmniPage Documents” on page 29.

Registering your software

ScanSoft’s registration Wizard runs at the end of installation. We provide an easy electronic form that can be completed in less than five minutes. When the form is filled in and you click Send, the program will search for an Internet connection to perform the registration online immediately. If you did not register the software during installation, you will be periodically invited to register later. You can go to www.scansoft.com to register online.

saved to zone templates. See page 53. Irregular zones can be drawn and zones split and joined more simply, without the need for separate tools. See page 57.

• Better proofing and verifying
  The Proofing dialog box now shows suspect words in a wider context. A dynamic verifier can stay open as text is being checked, with the image display and window tracking the editing position. See page 65.
• Formatting levels for display and saving
  There are three formatting levels for Text Editor display. See page 64.
Chapter 2 Introduction You probably use your computer for business correspondence, preparing reports, handling data and an ever-increasing number of other uses. The challenge is that, in spite of the digital revolution, certain sources of information still circulate in printed, paper form and cannot be used immediately in a computer.
What is optical character recognition Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form character shapes. These present a picture of the text on a page. During OCR, OmniPage Pro 12 analyzes the character shapes in an image and defines solutions to produce editable text.
Documents in OmniPage Pro

OmniPage Pro 12 handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it. A document in OmniPage Pro consists of one image for each document page. After you perform OCR, the document will also contain recognized text, displayed in the Text Editor, possibly along with graphics and tables.

The OmniPage Desktop

The OmniPage Desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Image Panel and the Text Editor. Each has close, maximize and restore buttons at the top right. The Image Panel has an Image toolbar and the Text Editor has a Formatting toolbar.

[Figure: the OmniPage Desktop. Callouts: Standard toolbar; OmniPage Toolbox; Formatting toolbar; Thumbnails show a picture of each page in the document.]

We show the program with a three-page document. Page one is the current page, which has been recognized and proofed. Page two has been recognized but not proofed yet. Page three has been acquired and manually zoned, but not recognized yet. The icons at the bottom of the thumbnail images show page status. Status bar buttons let you show or hide the main screen areas and move to other pages in the document.
The Image Panel When this displays the current page image, the Image toolbar is available. All page images have a background value: process or ignore. Zones can be manually drawn on page images, or can be placed automatically after recognition. There are five zone types: Process, Ignore, Text, Table, Graphics. Areas inside process zones and on a process background outside other zones have zones automatically drawn and their zone types determined during processing. See “Zones and backgrounds” on page 53.
The OmniPage Toolbox

This Toolbox lets you drive the processing. By default it is located along the top of the OmniPage Desktop, just above the working areas. It can be floated, and can also be docked along the bottom of the desktop.

[Figure: the OmniPage Toolbox. Callouts: Start button; Get Pages drop-down list; Get Page button; Perform OCR button; Layout Description drop-down list; Export Results button; Export Results drop-down list.]

Automatic processing is started, and can be stopped and re-started with the Start (1-2-3) button.
Managing documents Document management can be done by thumbnails in the Image Panel or by the Document Manager, situated along the bottom of the OmniPage Desktop. Both summarize the pages in the document and are synchronized. Our pictures show these with the same seven-page document. Pages 1 and 2 are selected and page 4 is the current page, that is, the one shown in the Image Panel. Page status is shown as follows: Page Status Icon Page image has been...
the Ctrl key as you click thumbnails to add pages to a selection one by one. Then you can move or delete the selected pages as a group, or send them to (re)recognition. You can also export selected pages. Get information on an input image by hovering the cursor over its thumbnail (so long as ToolTips are enabled). A popup text displays the image size in pixels and the program’s unit of measurement. Image resolution is also shown.
When multiple pages are being selected, the page set as current does not change. All selected pages are highlighted. Customizing Document Manager columns You can specify which columns of information you want to see in the Document Manager. Click Customize Columns... in the View menu for the following dialog box: This item is highlighted. Click a checkbox to select the item. Image sizes are expressed in pixels. Highlight an item and use these arrows to change the order of columns.
Printing a document

You can print the document with the Print item in the File menu. Choose whether to print images or text (that is, recognition results as they appear in the Text Editor). You can print all pages or a range of pages. The Print tool in the Standard toolbar prints images or text, depending on whether the Image Panel or the Text Editor is active.

Closing a document

Choose Close in the File menu to close a document.

Why save to OPD

You do not have to save your documents to the OPD file type. You would typically do this for the following reasons:

• You cannot finish working with the document in the current session.
• You want to pass the document to other users who have OmniPage Pro. For example, you can pass an OPD file to a specialist for proofing. In an office network, you may have one scanner generating images for recognition and proofing at several workstations.

Settings

The Options dialog box is the central location for OmniPage Pro settings. Access it from the Standard toolbar or the Tools menu. Context-sensitive help provides information on each setting. In overview, the settings panels are:

OCR
Use this to specify recognition languages, a user or professional dictionary, a reject character and font matching. Click the checkbox before a language to select or deselect it.
Proofing Use this to define whether proofreading should begin automatically after recognition. Define also whether IntelliTrain should run, and use it to load or work with a training file. See “Proofreading OCR results” on page 65. Custom Layout Use this to describe the layout of your input document pages very precisely. This gives you maximum control over the auto-zoning process, instructing it to search or ignore columns, graphics and tables. See “Describing the layout of the document” on page 51.
Chapter 3 Processing documents This tutorial chapter describes different ways you can process a document and also provides information on key parts of this processing.
Quick Start Guide This topic takes you step-by-step through the basic OCR process. Loading and recognizing sample image files You will find sample image files in the program folder, both single-page and multi-page files. First try reading these files using the procedure presented below, except for the references to a scanner. See “Input from image files” on page 48. The results provide you with a benchmark of the recognition quality you should expect from your own files of comparable quality.
What you do, and what happens:

1. Set up your scanner using the Scanner Wizard, if this is not already done.
   What happens: Configures OmniPage Pro to work with your scanner.
2. Select Start > Programs > ScanSoft OmniPage Pro 12.0 > OmniPage Pro 12.0
3. Place the document correctly in your scanner.
4. From the Get Page drop-down list, select a scan option for your document: black-and-white, grayscale or color.

Processing overview

The following flow diagram summarizes the processing steps:

[Flow diagram]
Get Pages: from file (page 48), from scanner (page 49)
Describe page layout (page 51)
Apply a template (page 61), Autozoning (page 53), Manual zoning (page 54)
Perform OCR with current settings (page 31)
Proofread (page 65), Verify and edit (page 66)
Export pages: to file (page 79), to Clipboard (page 84), via Mail (page 85)

Here is an overview of the processing methods you can use.

Using the OCR Wizard

The OCR Wizard guides you through the selection of settings and commands by asking you questions. It then launches automatic processing. This is a good way to get started if you are new to OmniPage Pro.

In other applications

You can use the Direct OCR feature to call on the recognition services of OmniPage Pro while working in your usual word-processor or similar application.

Automatic processing

Automatic processing provides an efficient way of handling documents, especially larger ones. First you select all settings needed, then you can use the Start button in the OmniPage Toolbox to process a new document from start to finish or to restart and finish processing on an open document.

[Figure: the OmniPage Toolbox. Callouts: Start button; Get Page button; Get Pages drop-down list; Perform OCR button; Export Results button; Export Results drop-down list; Layout Description drop-down list.]

1.

4. Click the Options tool in the Standard toolbar or choose Options in the Tools menu and check that settings are appropriate for your document. You can, for instance, specify recognition languages and whether you want to proofread the document or not. See “Settings” on page 31.
5. Click the Start button or choose Start auto-processing in the Process menu. Each page of the document is processed and finished one after the other.
Manual processing Manual processing gives you more precise control over the way your pages are handled. You can process the document page-by-page with different settings for each page. The program also stops between each step: acquiring images, performing recognition, exporting. This lets you, for instance, change the page background and draw zones manually on each page. You start each step in the process by clicking the three numbered buttons on the OmniPage Toolbox. 1.
6. Select a value for the Perform OCR button. You describe the layout of the incoming pages. This value has an influence if auto-zoning runs on any pages. See “Describing the layout of the document” on page 51. You can also select a template to have its zones placed on the current page. See “Using zone templates” on page 61.
7. Click the Perform OCR button to have the current page recognized.
determine which pages are in order, and which need different settings or some manual zoning. After adjusting settings and/or modifying zones, use manual processing to re-recognize just those pages. 1. Prepare the document and perform automatic processing, as already described. 2. If you close or finish proofing you will be invited to save the document. This is recommended, even if it is not in its final form. 3. Select a page needing rezoning and delete or modify the existing zones in the Image Panel.
3. Manually zone pages where you want to process only part of the page or if you want to give precise zoning instructions. Use ignore backgrounds or zones to exclude areas from processing. Use process backgrounds or zones to specify areas to be auto-zoned.
4. Click the Start button, then choose Finish Processing Existing Pages in the Automatic Processing dialog box.
5. After proofing (if requested) you can save or export the document.

5. The last panel asks you to define the export choice: saving to file or copying to Clipboard. After setting the choice, click Finish to close the Wizard and start the automatic processing.
6. If you requested proofing and the text contains suspect words, the OCR Proofreader dialog box will appear. When proofing is finished or closed, the Copy to Clipboard or Save As dialog box lets you specify file export settings, including a page range and a formatting level.
7. The document remains in OmniPage Pro.

How to set up Direct OCR

1. Start the application you want connected to OmniPage Pro. Start OmniPage Pro, open the Options dialog box at the Direct OCR panel and select Enable Direct OCR.
2. Select process options for proofing and zoning. These function for future Direct OCR work until you change them again; they are not applied when OmniPage Pro is used on its own.
3. The Unregistered panel displays running or previously registered applications. Select the desired one(s) and click Add.
If OmniPage Pro is running when Direct OCR is called from a target application, a second instance of OmniPage Pro is launched. See the Direct OCR topics in online Help for more information. These include a topic Direct OCR Questions and Answers. The Readme file and the ScanSoft web site may present more recent information relating to specific target applications. How to use OmniPage Pro with PaperPort PaperPort® is a paper management software product from ScanSoft.
Processing with Schedule OCR

You can schedule OCR jobs to be performed automatically at any time within the following eight days. The job pages can come from a scanner with an ADF or from image files. You do not have to be present at your computer at job start time, nor does OmniPage Pro have to be running. It does not matter if your computer is turned off after the job is set up, so long as it is running at job start time.
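The eight-day scheduling window can be pictured as a simple date check. The Python sketch below is illustrative only and is not taken from OmniPage Pro: the function name `can_schedule`, the constant `MAX_AHEAD` and the treatment of the exact eight-day boundary are all assumptions.

```python
from datetime import datetime, timedelta

# Assumption: "within the following eight days" is read as
# "later than now, and no more than 8 * 24 hours ahead".
MAX_AHEAD = timedelta(days=8)


def can_schedule(job_start: datetime, now: datetime) -> bool:
    """Return True if job_start falls inside the eight-day window after now."""
    return now < job_start <= now + MAX_AHEAD
```

For example, with `now` set to 9:00 on 1 May, a job at 2:00 on 3 May is accepted, while a job on 10 May or one in the past is rejected.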
Defining the source of page images There are two possible image sources: from image files and from a scanner. There are two main types of scanners: flatbed or sheetfed. A scanner may have a built-in or added Automatic Document Feeder (ADF), which makes it easier to scan multi-page documents. The images from scanned documents can be input directly into OmniPage Pro or may be saved with the scanner’s own software to an image file, which OmniPage Pro can later open.
Normally the Add button places each file at the bottom of the file list. To place a file at a different location, highlight a file in the list. The new file will be added immediately below the lowest highlighted file.

Input from scanner

You must have a functioning, supported scanner correctly installed with OmniPage Pro. See “Setting up your scanner with OmniPage Pro” on page 14. You have a choice of scanning modes.
Brightness and contrast Good brightness and contrast settings play an important role in OCR accuracy. Set these in the Scanner panel of the Options dialog box or in your scanner’s interface. The diagram illustrates an optimum brightness setting. After loading an image, check its appearance. If characters are thick and touching, lighten the brightness. If characters are thin and broken, darken it. Then rescan the page.
You can scan double-sided documents with an ADF. A duplex scanner will manage this automatically. For non-duplex scanners, select Scan double-sided pages in the Scanner panel of the Options dialog box. Then you can scan the document in just a few passes, with even pages grouped together and odd pages also grouped. OmniPage Pro will merge the pages for you.

Scanning without an ADF

You can scan multi-page documents efficiently from a flatbed scanner, even without an ADF.
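Returning to double-sided scanning with a non-duplex ADF: merging a group of odd pages with a group of even pages amounts to interleaving the two groups. The Python sketch below illustrates that idea and is not OmniPage Pro's implementation. The function name `merge_double_sided` and the `evens_reversed` flag are hypothetical; the flag encodes the assumption that flipping the stack delivers the back sides in reverse order.

```python
def merge_double_sided(odd_pages, even_pages, evens_reversed=True):
    """Interleave front-side (odd) and back-side (even) scans into reading order.

    evens_reversed is an assumption: after the stack is turned over,
    the last sheet's back side is usually scanned first.
    """
    evens = list(reversed(even_pages)) if evens_reversed else list(even_pages)
    # A final back side may be missing, but never more than one page.
    if len(odd_pages) - len(evens) not in (0, 1):
        raise ValueError("odd and even page groups do not match")
    merged = []
    for index, odd in enumerate(odd_pages):
        merged.append(odd)
        if index < len(evens):
            merged.append(evens[index])
    return merged
```

Called with the odd group `["p1", "p3", "p5"]` and the even group `["p6", "p4", "p2"]`, the function returns the pages in the order p1 to p6.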
Single column, no table Choose this setting if your pages contain only one column of text and no table. Business letters or pages from a book are normally like this. Choose it also for a page with words or numbers arranged in columns if you do not want these placed in a table or decolumnized or treated as separate columns. Graphics may be detected.
Zones and backgrounds

Zones define areas on the page to be processed or ignored. Zones are rectangular or irregular, with vertical and horizontal sides. Page images in a document have a background value: process or ignore (the latter is more typical). Background values can be changed with the tools shown.
Auto-zone a page background Acquire a page. It appears with a process background. Draw a zone. The background changes to ignore. Draw text, table or graphic zones to enclose areas you want manually zoned. Click the Process background tool (shown) to set a process background. Draw ignore zones over parts of the page you do not need. After recognition the page will return with an ignore background and new zones round all elements found on the background.
No.  Type                What happens:
1    Text zone           OCR runs and generates text.
2    Table zone          OCR runs, text is placed in a table grid.
3    Graphic zone        Image is embedded in recognized page.
4    Process zone        Auto-zoning creates one or more zones, decides their types and processes their contents.
5    Process background  Auto-zoning creates one or more zones, decides their types and processes their contents.
process zones on an ignore background. Draw a process zone to enclose columns of text to have them handled automatically. They will be decolumnized in the Text Editor’s NF view and RFP view, but kept in columns in True Page view. Ignore zone (gray) Use this to draw an ignore zone, to define a page area you do not want transferred to the Text Editor. Auto-zoning will not place zones here. To exclude a given page area from many pages (for example a header or page numbers), place an ignore zone in a template.
Working with zones

The Image toolbar provides zone editing tools. One is always selected. When you no longer want the service of a tool, click a different tool. Some tools on this toolbar are grouped. Only the last selected tool from the group is visible. To select a visible tool, click it. To select a hidden tool, hold down the mouse button on the triangle at the bottom right of the visible tool until the additional tools appear, then click the tool you want.

Join two zones of the same type
Draw an overlapping zone of the same type.
[Figure: existing zones; new zone; resulting zone]

Make an irregular zone by subtraction
Draw an overlapping zone of the same type as the background (in this example, on an ignore background).
[Figure: existing zone on an ignore background; new ignore zone; resulting zone]

Split a zone
Draw a splitting zone of the same type as the background (in this example, on a process background).

The following zone shapes are prohibited:

• Indented along the bottom
• Indented along the top
• Hole in the middle

To expand a zone more quickly than using its resizing handles, draw a zone of the same type to completely enclose it. The smaller zone is replaced by the larger one. To replace a set of zones of whatever type with a single zone, draw a larger zone of the desired type to completely enclose them. All the smaller zones are replaced by the larger one.
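The enclosure rule just described (a newly drawn zone replaces every zone it completely encloses) can be sketched in a few lines of Python. This is a simplified model for illustration, not the program's code: the `Zone` class, its field names and the `draw_zone` function are hypothetical, only rectangular zones are modeled, and partially overlapping zones (joining, subtraction, splitting) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical rectangular zone; coordinates are image pixels."""
    kind: str  # for example "text", "table", "graphic", "process", "ignore"
    left: int
    top: int
    right: int
    bottom: int

    def encloses(self, other: "Zone") -> bool:
        """True if other lies completely inside this zone."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def draw_zone(existing, new):
    """Return the zone list after drawing new.

    Every existing zone that new completely encloses is dropped,
    whatever its type; all other zones are kept unchanged.
    """
    kept = [zone for zone in existing if not new.encloses(zone)]
    kept.append(new)
    return kept
```

Drawing a large table zone over two smaller zones therefore leaves only the table zone in that area, while a zone elsewhere on the page is untouched.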
Use the table tools and their cursors as follows: Insert row dividers Click the tool then click at the location in a table zone where you want to place a row divider. Avoid placing a divider so it cuts through text. Insert column dividers Click the tool then click at the location in a table zone where you want to place a column divider. Move dividers Click the tool and move the cursor to the row or column divider to be moved. It displays a double-headed arrow. Drag the divider as desired.
Using zone templates

A template contains a page background value and a set of zones and their properties, stored in a file. A zone template file can be loaded to have template zones used during recognition. Load a template file in the Layout Description drop-down list or from the Tools menu.
How to unload a template Select a non-template setting in the Layout Description drop-down list. The template zones are not removed from the current or existing pages, but template zones will no longer be used for future processing. You can also open the Zone Template Files dialog box, select [none] and click the Set As Current button. In this case, the layout description setting returns to Automatic.
Chapter 4 Proofing and editing Recognition results are placed in the Text Editor. These can be recognized texts, tables and embedded graphics.
The editor display and views

The Text Editor displays recognized texts and can mark words that were suspected during recognition with wavy underlines:

• Green – Non-dictionary words: These were recognized confidently, but are not found in any active dictionary: standard, user or professional.
• Blue – Words with suspect characters: These contain unrecognized characters or are dictionary-approved words containing characters recognized with lower confidence.

True Page view

True Page® view tries to conserve as much of the formatting of the original document as possible. Character and paragraph styling is retained. All page elements, including columns, are placed in boxes and frames. Reading order can be displayed by arrows. See from page 72. The formatting level for export is chosen separately at export time.

Proofreading OCR results

After a page is recognized, the recognition results appear in the Text Editor.
3. If the recognized word is correct, click Ignore or Ignore All to move to the next suspect word. Click Add to add it to the current user dictionary and move to the next suspect word. 4. If the recognized word is not correct, modify the word in the Edit panel or select a dictionary suggestion. Click Change or Change All to implement the change and move to the next suspect word. Click Add to add the changed word to the current user dictionary and move to the next suspect word. 5.
To do this:                                  Use this:
Turn verifier on                             F9 or verifier tool
Turn verifier off                            Esc or F9 or verifier tool
Turn verifier on/off temporarily             F8: press and hold down
Show verifier until next keystroke           Double-click on word
Zoom display in                              Alt + Num + or click in verifier
Zoom display out                             Alt + Num – or click in verifier
Make verifier dynamic or docked/floating     Alt + Num /
Dynamic context (scroll through 3 values)    Alt + Num *

The verifier tool is in the Formatting toolbar.
User dictionaries The program has built-in dictionaries for many languages. These assist during recognition and may offer suggestions during proofing. They can be supplemented by user dictionaries. You can save any number of user dictionaries, but only one can be loaded at a time. Your user dictionaries from Microsoft Word are also available; a dictionary called Custom is the default user dictionary for Microsoft Word.
Training

Training is the process of changing the OCR solutions assigned to character shapes in the image. It is useful for uniformly degraded documents or when an unusual typeface is used throughout a document. Training will be less useful for texts with random distortions. Here is an example, based on the letter “g”, which can be printed in different ways: The first two examples do not need training, because both shapes are normal for the letter “g” and the program can handle them.
finds candidate words to change, the Check Training dialog box lists these. Incorrect words should be re-trained before the list is approved. For guidance on using the Train Character and Check Training dialog boxes, please consult their context-sensitive help or the online help topic Manual training and its related topics. IntelliTrain IntelliTrain is an automated form of training. It takes input from the corrections you make during proofing.
IntelliTrain remembers the training data it collects, and adds it to any manual training you have done. This training can be saved to a training file for future use with similar documents.

Training files

If you want to be prompted to save your unsaved training data when you close the document, select that option in the Proofing panel of the Options dialog box. Unsaved training data is stored in an OmniPage Document.
You are editing your unsaved training. This frame is grayed. It has been deleted. To undelete it, select it again and press the Delete key. Characters marked as deleted are really deleted when you close the dialog box. Double-click a frame or press Enter to change its OCR solution. Enter the new solution in the text box that appears and press Enter. Changed assignations appear in red. This frame is selected. The top part shows the shape from the image. The bottom part shows the assigned OCR solution.
between paragraphs. The Text Editor’s horizontal ruler lets you define indent and tab positions easily. Advanced tab settings are done in the Tabs dialog box from the Format menu.

Paragraph styles

Paragraph styles are auto-detected during recognition. A list of styles is built up and presented in a selection box on the left of the Formatting toolbar. Use this to assign a style to selected paragraphs.
Frames have gray borders and enclose one or more boxes. They are placed when a visible border is detected in an image. Format frame and table borders and shading with a shortcut menu or by choosing Table... in the Format menu. Text box shading can be specified from its shortcut menu. To call up a shortcut menu, right-click inside an element away from a marked word. Multicolumn areas have pink borders and enclose one or more boxes.
Click the on-the-fly tool with a green signal. The zoning changes will cause changes in the Text Editor. Click the Perform OCR button to have the whole page (re)recognized, including your zone changes. For details on how changes are handled in on-the-fly zoning and their effects in the Text Editor views, see On-the-fly processing in online Help.
The Text-to-Speech facility is enabled or disabled with the Tools menu item Speech Mode or with the F5 key. A second menu item Speech Settings... allows you to select a voice (for example, male or female for a given language), a reading speed and the volume. The three basic speech keys are grouped together on the numeric keypad.
Chapter 5 Saving and exporting

Once you have acquired at least one image for a document, you can export the image(s) to file. Once you have recognized at least one page, you can export recognition results – a single page, selected pages or the whole document – to a target application by saving to file, copying to Clipboard or sending to a mailing application. Saving as an OmniPage Document is always possible.
page is recognized (or proofread, if that was requested), an exporting dialog box appears. You can start an export at any time when the program is not busy. If you ask to export a document with unrecognized pages, you will be asked whether they should be recognized first. If you answer No, only results from recognized pages will be exported. If zones have been modified on recognized pages, you will be invited to re-recognize those pages before exporting.
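The export rules above can be summarized as a small decision sketch. This is an illustration only, not OmniPage Pro code: the `Page` class, the `plan_export` function and the `recognize_first` parameter are hypothetical names invented for this example. It assumes that answering Yes to the prompt causes every page to be recognized and exported. The guide does not say what happens if you decline to re-recognize pages with modified zones, so the sketch only reports which pages would trigger that invitation.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for one page of a document."""
    number: int
    recognized: bool = False
    zones_modified: bool = False  # zones changed since the page was recognized


def plan_export(pages, recognize_first):
    """Return (page numbers to export, page numbers offered for re-recognition).

    recognize_first models the user's answer to the prompt that appears
    when the document still contains unrecognized pages.
    """
    if recognize_first:
        # Assumption: answering Yes recognizes the remaining pages,
        # so every page ends up in the export.
        exported = [p.number for p in pages]
    else:
        # Answering No exports results from recognized pages only.
        exported = [p.number for p in pages if p.recognized]

    # Recognized pages whose zones were modified trigger the
    # invitation to re-recognize before exporting.
    offered = [p.number for p in pages if p.recognized and p.zones_modified]
    return exported, offered


pages = [
    Page(1, recognized=True),
    Page(2),
    Page(3, recognized=True, zones_modified=True),
]
print(plan_export(pages, recognize_first=False))
print(plan_export(pages, recognize_first=True))
```

With the answer No, page 2 is left out of the export. In both cases page 3 is flagged, because its zones changed after recognition.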
Saving recognition results

You can save recognized pages to disk in a wide variety of file types. See “File types for saving recognition results” on page 95.

1. Choose Save As... in the File menu, or click the Export Results button in the OmniPage Toolbox with Save as File selected in the drop-down list.
2. The Save As dialog box appears, as shown in its expanded form. Click Advanced to open the lower panel and Basic to close it.
5. Click OK. The document is saved to disk as specified. If Save and Launch is selected, the exported file will appear in its target application, that is, the one associated with the selected file type in your Windows system or in the advanced saving options for your selected file type converter.
The Save As dialog box lists available file types in its Save as Type dropdown list. The OmniPage Document is the last format in the list. If you first save the document as an OmniPage Document (for instance as memo.opd), then modify it and later save it to a text file (for instance as memo.txt), then modify it again and click Save, the recent changes are saved to the memo.txt file, not to the OPD.
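The memo.opd / memo.txt example can be pictured with a small model. This is a sketch only: the `Document` class and its methods are hypothetical names invented for this illustration and are not part of OmniPage Pro. It assumes that a plain Save always writes to the file most recently chosen with Save As.

```python
class Document:
    """Hypothetical model of a document and its current save target."""

    def __init__(self):
        self.content = ""
        self.current_target = None  # file that a plain Save writes to
        self.saved = {}             # file name -> content last written there

    def edit(self, text):
        self.content += text

    def save_as(self, file_name):
        # Save As writes the content and makes this file the new target.
        self.current_target = file_name
        self.saved[file_name] = self.content

    def save(self):
        # Save writes to the most recent Save As target.
        if self.current_target is None:
            raise ValueError("no save target yet; use save_as first")
        self.saved[self.current_target] = self.content


doc = Document()
doc.edit("draft")
doc.save_as("memo.opd")       # first save, as an OmniPage Document
doc.edit(" + first changes")
doc.save_as("memo.txt")       # later save, to a text file
doc.edit(" + recent changes")
doc.save()                    # goes to memo.txt, not to memo.opd

print(doc.saved["memo.txt"])
print(doc.saved["memo.opd"])
```

In this model memo.opd still holds only the first draft. To bring the OPD up to date you would save to it again with Save As.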
True Page (TP)
This keeps the original layout of the pages, including columns. This is done with text, picture and table boxes and frames. This is offered only for target applications capable of handling these. True Page formatting is the only choice for XML export and for all PDF export, except to the file type ‘PDF Edited’.

Spreadsheet
This exports recognition results in tabular form, suitable for use in spreadsheet applications.
Click Defaults to have all settings returned to the default values for the current file type. Click Save to have the changed settings applied to the current save and also stored as the settings to be applied in future whenever this file type is selected again for saving. The program currently associated with the chosen file type for the Save and Launch feature is displayed at the bottom of the dialog box. Click the three dots button to specify a different program.
Saving to PDF

You have five choices when saving to Portable Document Format (PDF) files.

PDF (Normal): Pages are exported as they appeared in the Text Editor in True Page view. The PDF file can be viewed and searched in a PDF viewer and edited in a PDF editor.

PDF Edited: Use this if you have made significant editing changes in the recognition results. You have three formatting level choices, including True Page. The PDF file can be viewed, searched and edited.
plain or Unicode text will be pasted. Graphics are retained if the application supports insertion of images.

To copy pages to the Clipboard:

• With automatic processing, select Copy to Clipboard as the setting in the Export Results drop-down list on the OmniPage Toolbox or in the OCR Wizard. The Copy to Clipboard dialog box appears as soon as the last available page is recognized or proofed.
At any time the program is not busy, choose Send as Mail in the File menu to call up the Send as Mail dialog box. 1. This dialog box lets you specify a file type, a page range, a formatting level and attachment options: one attachment for all pages, one attachment per page, new attachment at each blank page or one attachment for each input file. Set all options and click OK. 2. Log into your mail application if you are prompted to do so. 3.
Chapter 6 Technical information

This chapter provides troubleshooting and other technical information about using OmniPage Pro 12. Please also read the online Readme file and other help topics, or visit the ScanSoft web pages. The scanner section of the web pages contains detailed and regularly updated information about scanner setup and support. The Readme file contains last-minute information relating to OmniPage Pro. Access to the Readme file and to ScanSoft’s web pages is provided in the Help menu.
Troubleshooting

Although OmniPage Pro is designed to be easy to use, problems sometimes occur. Many of the error messages contain self-explanatory descriptions of what to do – check connections, close other applications to free up memory, and so on. Sometimes that is all the troubleshooting help you need. Please see your Windows documentation for information on optimizing your system and application performance.
Testing OmniPage Pro

Restarting Windows 98, Me, 2000 or XP in safe mode or Windows NT in VGA mode allows you to test OmniPage Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if OmniPage Pro has stopped running altogether. See Windows online Help for more information. Your scanner will not run with OmniPage Pro in safe mode or VGA mode, so do not test scanner problems in this configuration.
5. Launch OmniPage Pro and try performing OCR on an image. Use a known image file such as one of the supplied sample files.

You can also run OmniPage Pro 12 from a command line in its own safe mode. Choose Run in the Start menu, browse for the file OmniPage.exe and add the command line option /safe. This starts the program, but ignores previously stored settings and does not try to recover a document from an abnormal termination.

Increasing memory resources

OmniPage Pro may run poorly under low-memory conditions.
Text does not get recognized properly

Try these solutions if any part of the original document is not converted to text properly during OCR:

◆ Look at the original page image and ensure that all text areas are enclosed by text zones. If an area is not enclosed by a zone, it is generally ignored during OCR. See the section on creating and modifying zones, “Working with zones” on page 57.
◆ Make sure text zones are identified correctly.
OmniPage Pro recognizes only machine-printed text characters, such as typewritten or laser-printed text. It can handle dot-matrix characters, though accuracy may be lower on draft-quality texts. It cannot read handprint or handwriting. However, it can retain signatures or other handwritten text as a graphic.

Problems with fax recognition

Try these solutions to improve OCR accuracy on fax images:

◆ Ask senders to use clean, original documents if possible.
ODMA support

If your local network includes a Document Management System (DMS) that supports ODMA clients, OmniPage Pro may be able to work with it. Then an ODMA panel will appear in the Options dialog box allowing you to specify permissible file types and other settings. An ODMA interface will replace the Load Image File and Open OmniPage Document (OPD) dialog boxes. This lets you load image files and OPDs one at a time from the network file system or your local computer.
Supported file types The program supports a wide range of file types for images and text.
File types for saving recognition results

This table shows which formatting levels are available for each file type.

Formatting levels: No Formatting, RFP, Flowing Page, True Page, Spreadsheet

File type Extension
eBook (1) opf ● ●
Excel 97, 2000 xls ● ● ●
Excel 3.0 to 7.0 xls ● ● ●
FrameMaker 5.5.3 mif ● ●
Freelance Graphics txt ● ●
Harvard Graphics txt ● ●
HTML 4.0 (2) htm ● ●
HTML 3.
Tables
● File type supports tables in grids, no table handling choices at export time
●● File type supports tables, choose to use grids or tab separated columns
❍● File type does not support table grids, choose to convert to tab or space separated columns

1 These new formats are available only in some editions of the program.
2 When saving to HTML, all graphics are saved as separate JPEG image files.
3 Recognition results are sent to Clipboard in RTF 95/6.
I N D E X

A
Accuracy improvement, 49, 69, 91 influence of brightness, 50 influence of training, 69 scanning mode influence, 49 Acquire Text menu items, 45 Acquired pages, 26 Acquiring images, 21, 40 Adding pages to a document, 39 to zones, 58 training to training files, 71 words to a user dictionary, 66 ADF, 31, 48, 50 Advanced saving options, 82 Advice on problems, 88 Alphanumeric zone, 55 Attachments to mail messages, 85 Auto-detect layout, 51 Automatic Document Feeder (ADF), 31, 48, 50 Automatic proce
to target applications, 21, 40, 78 True Page, 82 F Fax recognition, 92 Features, new, 17 Files as export target, 78 as image source, 48 retained on uninstalling, 96 separation options, 79, 86 types, 79 types for export, 81, 95 types supported, 94 Finding non-dictionary words, 65 suspect words, 65 Finishing a document, 39 Floating toolbars, 23 Flowing Page, 81 Folder input for Schedule OCR, 93 Formatting levels, 47, 64, 95 Formatting levels and file types, 95 Formatting toolbar, 22, 23 Frames, 24, 73, 82, 9
ODMA support, 93 OmniPage Desktop, 22 OmniPage Documents contents of, 80 definition, 29 purpose of OPD files, 30 saving as, 30, 80 OmniPage Pro documents in, 21 earlier versions, 13 installing, 13 new features of, 17 registering, 17 reinstalling, 96 starting, 14 testing, 89 uninstalling, 96 OmniPage Toolbox, 22, 25, 38 Online HTML Help, 9 registration, 17 On-the-fly editing and zoning, 75 OPD files definition, 29 purpose of, 30 saving to, 30 Opening image files, 48, 94 Optical character recognition, 20 Opti
Scanning black-and-white, 49 books, 31 brightness, 31, 50 color, 49 contrast, 31 grayscale, 49 input from, 49 pictures, 49 Wizard, 14 Schedule OCR, 47 input from folders, 93 watched folders, 93 Searching PDF output, 84 Selecting multiple pages, 26 Send Mail dialog box, 85 Sending pages by mail, 85 Setting up a scanner, 14 Setting up Direct OCR, 45 Settings Acquire Text, 45 effect of settings, 32 for Direct OCR, 45 in OCR Wizard, 44 in Options dialog box, 31 zone types, 59 Shortcut menus, 56 Single-column pa