LEGAL NOTICES Copyright © 2001 by ScanSoft, Inc. All rights reserved. No part of this publication may be transmitted, transcribed, reproduced, stored in any retrieval system or translated into any language or computer language in any form or by any means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise, without prior written consent from the Legal Department at ScanSoft, Inc., 9 Centennial Drive, Peabody, Massachusetts 01960.
CONTENTS
WELCOME
  Using this manual
  Getting online help
  Online HTML Help
  Context-Sensitive Help
  Tech Notes
  Glossary
1 INSTALLATION AND SETUP
  System requirements
  Installing OmniPage Pro
  Setting up your scanner with OmniPage Pro
  How to start the program
  Registering your software
  New features in OmniPage Pro 11
2 INTRODUCTION
  What is optical character recognition
  OmniPage Pro’s OCR capabilities
  Documents in OmniPage
  The Formatting toolbar
  The OmniPage Toolbox
  Managing documents
  Thumbnail view
  Detail view
  Customizing columns in Detail view
  Deleting pages from a document
  Printing a document
  Closing a document
  OmniPage Documents
  Why save to OPD
  How to save to OPD
  Settings
3 TUTORIAL: PROCESSING DOCUMENTS
  Quick Start Guide
  Loading and recognizing sample image files
  Scanning and recognizing a single page
  Processing documents using
  Input from image files
  Input from scanner
  Scanning with an ADF
  Scanning long documents without an ADF
  Describing the layout of the document
  Manual zoning
  Working with zones
  Zone properties
  Table grids in the image
  Using zone templates
4 PROOFING AND EDITING
  Proofreading OCR results
  Checking recognized text against original
  User dictionaries
  IntelliTrain
  The editor display and views
  Text and image editing
  Reading text al
6 TECHNICAL INFORMATION
  Troubleshooting
  Solutions to try first
  Testing OmniPage Pro
  Low memory problems
  Low disk space problems
  Supported file types
  File types for opening and saving images
  File types for saving recognition results
  Saving to PDF
  OCR problems
  Text does not get recognized properly
  Problems with fax recognition
  System or performance problems during OCR
  Uninstalling the software
Welcome Welcome to OmniPage Pro 11, and thank you for using our software! The following documentation has been provided to help you get started and give you an overview of the program. This User’s Manual This manual introduces you to using OmniPage Pro 11. It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information.
USING THIS MANUAL This manual is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows documentation if you have questions about how to use dialog boxes, menu commands, scroll bars, drag and drop functionality, shortcut menus, and so on. We also assume you are familiar with your scanner and its supporting software, and that the scanner is installed and working correctly before it is set up with OmniPage Pro 11.
GETTING ONLINE HELP In addition to using this manual, you can use OmniPage Pro’s online Help to learn about features, settings, and procedures. Online Help is available after you install OmniPage Pro. Online HTML Help Open OmniPage Pro’s online Help at its top level by choosing OmniPage Pro Help Topics at the top of the Help menu. This allows you to see topics arranged in a Table of Contents, search an alphabetical list of keywords or make full-text searches through the topics.
Tech Notes ScanSoft’s web site at www.scansoft.com contains Tech Notes on commonly reported issues using OmniPage Pro 11. Web pages may also offer assistance on the installation process and troubleshooting. Glossary This manual does not include a glossary. The online Help has a comprehensive glossary, with its own alphabetical index and a table of contents. Please consult it if you want to find the meaning of a term used in this Manual or in the program.
1 Installation and setup This chapter provides information on installing and starting OmniPage Pro 11.
SYSTEM REQUIREMENTS Your system must meet the following minimum requirements to install and run OmniPage Pro 11: ◆ A computer with a Pentium or higher processor ◆ Microsoft Windows 95, Windows 98, Windows ME, Windows 2000, or Windows NT 4.0 ◆ 32MB of memory (RAM), 64MB recommended ◆ 75MB of free hard disk space for the application files plus 10MB working space during installation ◆ 9MB for Microsoft Installer (MSI) if not present and 44MB for Internet Explorer if not present.
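If you want to check the disk figures above before installing, you can add them up as in the following sketch. It is only an illustration: the function names and flags are hypothetical and are not part of OmniPage Pro or its installer. The numbers are the ones listed above.

```python
import shutil

# Disk figures in MB, taken from the requirements list above.
APP_FILES_MB = 75
INSTALL_WORKSPACE_MB = 10
MSI_MB = 9
INTERNET_EXPLORER_MB = 44


def required_disk_mb(has_msi: bool, has_internet_explorer: bool) -> int:
    """Return the disk space (MB) needed while installing."""
    total = APP_FILES_MB + INSTALL_WORKSPACE_MB
    if not has_msi:
        total += MSI_MB
    if not has_internet_explorer:
        total += INTERNET_EXPLORER_MB
    return total


def has_enough_space(path: str, needed_mb: int) -> bool:
    """Compare free space on the drive holding `path` with the requirement."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= needed_mb


if __name__ == "__main__":
    worst_case = required_disk_mb(has_msi=False, has_internet_explorer=False)
    print(worst_case)  # 75 + 10 + 9 + 44 = 138
    print(has_enough_space(".", worst_case))
```

With both Microsoft Installer and Internet Explorer already present, the same calculation gives 85MB.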
INSTALLING OMNIPAGE PRO OmniPage Pro 11’s installation program takes you through installation with instructions on every screen. Before installing OmniPage Pro: ◆ Make sure your scanner is connected, turned on, and compatible with your system. ◆ Close all other applications, especially anti-virus programs. ◆ Log into your computer with administrator privileges if you are installing on Windows 2000 or Windows NT.
Note It is planned to provide Text-to-Speech for English (British and US), French, German, Italian, Portuguese and Spanish. This may vary depending on region or version. Please check the Readme file for the latest information. A speech system for only one language can be installed with OmniPage Pro. See also the section Reading text aloud in chapter 5. SETTING UP YOUR SCANNER WITH OMNIPAGE PRO All files needed for scanner setup and support are copied automatically during the program’s installation.
Your scanner’s native user interface will appear. Click on Scan to begin the sample scan. If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections. Once the image appears correctly in the window, click on Next. Select the item that most appropriately describes your scanner, then click on Next. Click on Next to proceed to page size. The page sizes that the Scanner Wizard believes that your scanner supports are listed in the window.
To change the scanner settings at a later time, or to set up a different scanner, or to test and repair an installed scanner, please follow one of these two methods to reopen the Scanner Wizard: ◆ Start➤Programs➤ScanSoft OmniPage Pro 11.0➤Scanner Wizard or ◆ Start➤Programs➤ScanSoft OmniPage Pro 11.0➤OmniPage Pro 11.0➤Tools menu➤Options➤Scanner…➤Setup button.
◆ Right-click an image file icon or file name for a shortcut menu. Select a sub-menu item from ‘Convert To...’ to define a target. ◆ Use OmniPage Pro to provide OCR services in ScanSoft’s PaperPort® or Pagis® document management products. See chapter 3. REGISTERING YOUR SOFTWARE ScanSoft’s registration Wizard runs at the end of installation. We provide an easy electronic form that can be completed in less than five minutes.
NEW FEATURES IN OMNIPAGE PRO 11 If you are upgrading, you may not need to consult this Manual very much. Here are some main areas of innovation compared to OmniPage Pro 10: ◆ Greater accuracy - redeveloped recognition engines make OmniPage Pro 11 the most accurate OmniPage® ever. ◆ Improved page layout - OmniPage Pro 11 will allow you to retain formatting that is true to the original, even on pages with nongridded tables, headers and footers and dropped capitals.
2 Introduction You probably use your computer for business correspondence, preparing reports, handling data and an ever-increasing number of other uses. The challenge is that, in spite of the digital revolution, certain sources of information still circulate in printed, paper form and cannot be used immediately in a computer. For example, if you want to incorporate information from a magazine article in a report you are preparing, you somehow have to get the text from the article into your computer.
WHAT IS OPTICAL CHARACTER RECOGNITION Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form character shapes. These present a picture of the text on a page. During OCR, OmniPage Pro 11 analyzes the character shapes in an image and defines solutions to produce editable text.
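To make the idea of character shapes built from pixels more concrete, here is a toy sketch. It is not how OmniPage Pro's recognition engines work; it only assumes three hand-drawn 5x5 templates and picks the template that differs from the input in the fewest pixels. All names in it are hypothetical.

```python
# Each glyph is a 5x5 bitmap: '#' is a dark pixel, '.' is a light one.
TEMPLATES = {
    "c": [".###.", "#....", "#....", "#....", ".###."],
    "e": [".###.", "#...#", "#####", "#....", ".###."],
    "l": ["..#..", "..#..", "..#..", "..#..", "..#.."],
}


def pixel_differences(a, b):
    """Count the pixels that differ between two bitmaps of equal size."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize(glyph):
    """Return the template character closest to the scanned glyph."""
    return min(TEMPLATES, key=lambda ch: pixel_differences(glyph, TEMPLATES[ch]))


if __name__ == "__main__":
    # An 'e' with one pixel missing from its horizontal line.
    slightly_broken_e = [".###.", "#...#", "##.##", "#....", ".###."]
    # An 'e' whose horizontal line has almost vanished in scanning.
    badly_faded_e = [".###.", "#....", "#.#..", "#....", ".###."]
    print(recognize(slightly_broken_e))
    print(recognize(badly_faded_e))
```

The second glyph ends up closer to the c template than to the e template, which is the kind of confusion that poor scanning settings can cause.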
Documents in OmniPage Pro OmniPage Pro 11 handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it. A document in OmniPage Pro consists of one image for each document page. After you perform OCR, the document will also contain recognized text, displayed in the Text Editor, possibly along with graphics and tables.
THE OMNIPAGE PRO DESKTOP OmniPage Pro’s desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Original Image area and the Text Editor. The Document Manager has two tabbed panels: Thumbnail view and Detail view. The Original Image area has an Image toolbar and the Text Editor has a Formatting toolbar. [Figure: the OmniPage Pro desktop, with the Formatting toolbar, Standard toolbar and OmniPage Toolbox labeled.] The current page has a pale border.
The OmniPage Toolbox lets you control processing. It can have three states, depending on which of the three tab buttons on the left is clicked. In the picture, we display its appearance for Manual OCR. We show the program with a three-page document. Page one is the current page, which has been recognized and proofed. Page two has been recognized but not proofed yet. Page three has been acquired and manually zoned, but not recognized yet. The icons at the bottom right of the thumbnail images show page status.
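The three-page example can be pictured as a small data structure. The sketch below is an assumption made for illustration only: the class names are hypothetical, and treating page status as a simple linear progression is a simplification of what the program tracks.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PageStatus(Enum):
    """Processing stages named in the text, in the order they occur."""
    ACQUIRED = 1
    ZONED = 2
    RECOGNIZED = 3
    PROOFED = 4


@dataclass
class Page:
    number: int
    status: PageStatus = PageStatus.ACQUIRED


@dataclass
class Document:
    pages: List[Page] = field(default_factory=list)
    current_page: int = 1

    def pages_not_yet(self, target: PageStatus) -> List[int]:
        """Return the numbers of pages that have not reached `target`."""
        return [p.number for p in self.pages if p.status.value < target.value]


if __name__ == "__main__":
    # The three-page document described above.
    doc = Document(
        pages=[
            Page(1, PageStatus.PROOFED),
            Page(2, PageStatus.RECOGNIZED),
            Page(3, PageStatus.ZONED),
        ],
        current_page=1,
    )
    print(doc.pages_not_yet(PageStatus.RECOGNIZED))  # [3]
    print(doc.pages_not_yet(PageStatus.PROOFED))     # [2, 3]
```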
The Image toolbar The Image toolbar contains buttons that allow you to zoom in or out on the current image or to rotate it. They also allow you to work with zones and table dividers on the page. This is described in detail in chapter 3, Tutorial: Processing documents. Here we summarize the purpose of the buttons. The Image toolbar can be floated (that is, undocked and moved anywhere on the desktop). It can be docked to any edge of the Original Image area. Draw rectangular zones. Draw irregular zones.
The OmniPage Toolbox This Toolbox lets you drive the processing. By default it is located along the top of the OmniPage Pro desktop, just above the working areas. It can be floated and also be docked along the bottom of the desktop. It has three tabs on the left: AutoOCR™, Manual OCR and OCR Wizard. Click one to see its controls in the Toolbox. The picture at the beginning of this section showed the OmniPage desktop with the Manual OCR toolbar. The AutoOCR toolbar looks like this.
MANAGING DOCUMENTS The Document Manager is situated on the left of the OmniPage Pro desktop. It has two tabbed panels: Thumbnail view and Detail view. Click a tab to see its view. Both views summarize the pages in the document and are synchronized: the current and selected pages remain the same when you switch views. Our pictures show the two views with the same four-page document. Pages 1 and 2 are selected and page 4 is the current page, that is, the one shown in the Original Image area.
Detail view This facility is new to OmniPage Pro 11. It provides an overview of your document with a table. Each row represents one page. Columns present statistical or status information for each page, and (where appropriate) document totals. The picture below shows the default columns on the left and four columns which a user has specified. Move the cursor onto the page’s status icon to see a thumbnail of the page. This shows the number of zones of each type on the page.
Customizing columns in Detail view You can specify which columns of information you want to see in Detail view. Click Customize Details... in the View menu for the following dialog box: This item is highlighted. Click a checkbox to select the item. Image sizes are expressed in pixels. Highlight an item and use these arrows to change the order of columns. Define a width for the highlighted item. Define which columns should appear, their widths, and column order.
Closing a document Choose Close in the File menu to close a document. You are prompted to save your document if you have not saved it or you have modified it since the last save. See the next section on saving the document as an OmniPage Pro Document (*.opd). You will also be prompted to save unsaved training data if you selected ‘Prompt to save IntelliTrain data when closing document’ in the Proofing panel of the Options dialog box.
Why save to OPD You do not have to save your documents to the OPD file type. You would typically do this for the following reasons: ♦ You cannot finish working with the document in the current session. ♦ You want to pass the document to other users who have OmniPage Pro. For example, you can pass an OPD file to a specialist for proofing. In an office network, you may have one scanner generating images for recognition and proofing at several workstations.
SETTINGS The Options dialog box is the central location for OmniPage Pro settings. It has seven panels. Context-sensitive help provides information on each setting. In overview, the settings panels are: OCR Use this to specify recognition language(s), a user dictionary, a reject character, an OCR method (optimize for speed or accuracy) and font matching. Scanner Use this to define page size and orientation for scanning.
Process Use this to define where new images should be placed in the document and set other preferences governing the behavior of the processing. You can change the interface language here. Proofing Use this to define whether proofreading should begin automatically after recognition. Define also whether IntelliTrain should run, and use it to load or work with a training file. For more detail, see chapter 4, Proofing and editing.
3 Tutorial: Processing documents This chapter describes different ways you can process a document and also provides information on key parts of this processing.
QUICK START GUIDE This topic takes you step-by-step through the basic OCR process. Loading and recognizing sample image files You will find sample image files in the program folder, both single-page and multi-page files. First try reading these files using the procedure presented below, except for the references to a scanner. See Input from image files for more information on acquiring the images.
1. Set up your scanner using the Scanner Wizard, if this is not already done. What happens: OmniPage Pro 11 is configured to work with your scanner.
2. Select Start ➤ Programs ➤ ScanSoft OmniPage Pro 11.0 ➤ OmniPage Pro 11.0. What happens: OmniPage Pro 11 opens on your computer.
3. Place the document correctly in your scanner.
4. Check the three tab buttons to the left of the OmniPage Toolbox. The AutoOCR button should be selected. If not, click on it.
Here is an overview of the processing methods you can use. You will find step-by-step guidance for each of them in the following pages. Using the OCR Wizard The OCR Wizard guides you through the selection of settings and commands by asking you questions. It then launches automatic processing. This is a good way to get started if you are new to OmniPage Pro. Automatically The fastest and easiest way to process documents is to let OmniPage Pro do it automatically for you.
PROCESSING DOCUMENTS USING THE OCR WIZARD The OCR Wizard takes you through six settings panels, guiding you to make settings for your document and then launching automatic processing. Context-sensitive help is available for all Wizard panels. The OCR Wizard can run only when there is no document open in OmniPage Pro. Click the OCR Wizard tab in the OmniPage Toolbox and click the Wizard button to see the first wizard screen: 1. The first panel lets you define your document source: scanner or image file.
3. The third panel (shown below) lets you define recognition languages and choose an OCR method. Languages with dictionary support are marked with an icon. 4. The fourth panel lets you define the formatting level to be applied to your document for display and export. See chapter 4 for more information. 5. The fifth panel asks if you want to proofread the text before export. If you choose Yes you can also edit the text before saving. You also decide whether to create and use IntelliTrain data during proofing.
7. If you requested proofing and the text contains suspect words, the OCR Proofreader™ dialog box will appear. When proofing is finished or closed, recognition results either go directly to the Clipboard, or the Save As dialog box appears so you can specify file export settings. 8. The document remains in OmniPage Pro. You can edit recognition results and save them again to other formats.
PROCESSING DOCUMENTS AUTOMATICALLY Automatic processing provides an efficient way of handling documents, especially larger ones. First you select all settings needed, then you can use the AutoOCR™ toolbar in the OmniPage Toolbox to process a new document from start to finish or to restart and finish processing on an open document. 1. Click the AutoOCR tab in the OmniPage Toolbox to display the AutoOCR toolbar. 2. Select the desired Get Page command in the drop-down list.
6. Click Start or choose Start in the Process menu. Each page of the document is processed and finished one after the other. The program may perform tasks simultaneously; for instance, it may start loading and recognizing a new page as you proofread the previous page. Command buttons Start: This lets you begin automatic processing on a new document. Stop: This lets you interrupt automatic processing. You may do this if you find that some settings need to be changed. Then the Start button changes to Finish.
PROCESSING DOCUMENTS MANUALLY Manual processing gives you more precise control over the way your pages are handled. You can process the document page-by-page with different settings for each page. The program also stops between each step: acquiring images, performing recognition, exporting. This lets you, for instance, draw zones manually on each page. You start each step in the process by clicking the buttons on the Manual OCR toolbar. 1.
6. Select a value in the Perform OCR drop-down list to describe the layout of the incoming pages. This value has an influence if auto-zoning runs on any pages. You can also select a template to have its zones placed on the current page. For more detail see the sections Describing the layout of the document and Using zone templates. 7. Click the Perform OCR button to have the current page recognized.
PROCESSING A DOCUMENT AUTOMATICALLY AND FINISHING IT MANUALLY When you have a large document with only a few pages needing special attention, you do not have to manually process the whole document. You can process it automatically and view results in the Text Editor. You can determine which pages are in order, and which need different settings or some manual zoning. Then you can switch to manual processing to adjust settings and zones and rerecognize just those pages. 1.
PROCESSING FROM OTHER APPLICATIONS You can use the Direct OCR feature to call on the recognition services of OmniPage Pro while you work in your usual word processor or other application. First you must establish the direct connection with the application. Then, two items in its File menu open the door to OCR facilities. How to set up Direct OCR 1. Start the application you want connected to OmniPage Pro.
6. If proofing was specified, this follows recognition. Then the recognized text is placed at the cursor position in your application, with the formatting level specified by Acquire Text Settings... . Note If OmniPage Pro is running when Direct OCR is called from a target application, a second instance of OmniPage Pro is launched. How to use OmniPage Pro 11 with your PaperPort software PaperPort® is a paper management software product from ScanSoft. It lets you link pages with suitable applications.
PROCESSING DOCUMENTS WITH SCHEDULE OCR You can schedule OCR jobs to be performed automatically at any time within the following 24 hours. Each job handles one document. The document pages can come from a scanner with an ADF or from image files. You do not have to be present at your computer at job start time, nor does OmniPage Pro have to be running. It does not matter if your computer is turned off after the job is set up, so long as it is running at job start time.
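The 24-hour scheduling window can be expressed as a simple check. This is a sketch under stated assumptions: the function name is hypothetical, and whether a start time exactly 24 hours ahead is accepted is a guess, not something this manual specifies.

```python
from datetime import datetime, timedelta

# Jobs can be scheduled at any time within the following 24 hours.
SCHEDULE_WINDOW = timedelta(hours=24)


def can_schedule(start_time: datetime, now: datetime) -> bool:
    """Return True if `start_time` lies in the 24 hours after `now`.

    Assumption: the end of the window is treated as inclusive.
    """
    return now < start_time <= now + SCHEDULE_WINDOW


if __name__ == "__main__":
    now = datetime(2001, 5, 1, 9, 0)
    print(can_schedule(datetime(2001, 5, 1, 18, 30), now))  # later the same day
    print(can_schedule(datetime(2001, 5, 3, 9, 0), now))    # two days ahead
```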
DEFINING THE SOURCE OF PAGE IMAGES There are two possible image sources: image files and a scanner. There are two main types of scanners: flatbed and sheetfed. A scanner may have a built-in or added Automatic Document Feeder (ADF), which makes it easier to scan multi-page documents. The images from scanned documents can be input directly into OmniPage Pro or may be saved with the scanner’s own software to an image file, which OmniPage Pro can later open.
Normally the Add button places each file at the bottom of the file list. To place a file at a different location, highlight a file in the list. The new file will be added immediately below the lowest highlighted file. Input from scanner You must have a functioning, supported scanner correctly installed with OmniPage Pro. See chapter 1 for more information. You have a choice of scanning modes.
Brightness and contrast Good brightness and contrast settings play an important role in OCR accuracy. Set these in the Scanner panel of the Options dialog box. The diagram illustrates an optimum brightness setting. After loading an image, check its appearance. If characters are thick and touching, make the brightness setting lighter. If characters are thin and broken, make it darker. Then rescan the page.
You can scan double-sided documents with an ADF. A duplex scanner will manage this automatically. For non-duplex scanners, select ‘Scan double-sided pages’ in the Scanner panel of the Options dialog box. Then you can scan the document in just a few passes, with even pages grouped together and odd pages also grouped. OmniPage Pro will merge the pages for you. Scanning long documents without an ADF You can scan multi-page documents efficiently from a flatbed scanner, even without an ADF.
Single column, no table Choose this setting if your pages contain only one column of text and no table. Business letters or pages from a book are normally like this. Choose it also for a page with words or numbers arranged in columns if you do not want these placed in a table or decolumnized or treated as separate columns. Graphics may be detected.
MANUAL ZONING Zones define areas on the page to be processed. Zones are rectangular or irregular (with sides formed by vertical and horizontal lines). Zones cannot overlap. They have a zone number in the top left corner and a zone type icon top right. Click in a zone to select it. Use Shift+clicks for a multiple selection. Current and selected zones are shaded. Click outside a zone to remove the selection. Zones appear on an original image in the following cases: The page has been recognized.
Subtract from zone Click this to subtract irregular parts from an existing zone or split a zone into smaller ones. You cannot move or resize existing zones when this tool is active. You cannot use this with a table type zone. Reorder zones Click this for the zone reordering tool. Then click in zones in the desired reading order. For your order to be respected, choose ‘Use current zones only’ and avoid having multiple-column or auto-detect zone types on the page.
Table zone Use this to have the zone contents treated as a table. Table grids can be automatically detected, or placed manually as described in the next section. Table zones must be rectangular. The Text Editor displays the table in an editable grid. You can choose whether to export tables in grids or in columns separated by tabs. Auto-detect zone Use this to let the program decide the zone type. To do this, auto-zoning runs, which may also result in changed zone order on the page.
TABLE GRIDS IN THE IMAGE After automatic processing you may see table zones placed on a page. They are denoted with a table zone icon in the top right corner of the zone. To change a zone to or from a table zone, use its shortcut menu. You can also draw a table type zone. If there is already a table zone on the page, select it, then draw the new rectangular zone. It will inherit the table type. Otherwise draw a rectangular zone and use its shortcut menu to change it to a table type.
Remove/replace all dividers Click this tool and click inside a table zone. Its dividers will all disappear. Click again to have dividers automatically (re)detected. Divider placement usually occurs during recognition; clicking twice with this tool lets you see and edit the dividers before recognition. USING ZONE TEMPLATES A template is a set of zones, their properties and reading order, stored in a file. A zone template file can be loaded to have template zones used during recognition.
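As a way of picturing what a template holds, the sketch below stores a list of zones in reading order and writes it to a file. The real OmniPage Pro template file format is not described in this manual; the JSON layout, class name and field names here are purely hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Zone:
    """One rectangular zone; coordinates are in image pixels (assumed)."""
    left: int
    top: int
    right: int
    bottom: int
    zone_type: str  # for example "table", "graphic" or "auto-detect"


def save_template(zones: List[Zone], path: str) -> None:
    """Write zones to a file; list order stands for reading order."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump([asdict(zone) for zone in zones], handle, indent=2)


def load_template(path: str) -> List[Zone]:
    """Read zones back in the same reading order."""
    with open(path, encoding="utf-8") as handle:
        return [Zone(**item) for item in json.load(handle)]


if __name__ == "__main__":
    zones = [
        Zone(50, 40, 800, 200, "auto-detect"),
        Zone(50, 220, 800, 600, "table"),
    ]
    save_template(zones, "letter_template.json")
    print(load_template("letter_template.json") == zones)
```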
How to unload a template Select a non-template setting for layout description in the Perform OCR drop-down list. The template zones are not removed from the current or existing pages, but template zones will no longer be used for future processing. You can also open the Zone Template Files dialog box, select [none] and click the Set As Current button. In this case, the layout description setting returns to Automatic.
4 Proofing and editing Recognition results are placed in the Text Editor. This newly developed WYSIWYG (What You See Is What You Get) editor offers the following features, detailed in this chapter: ◆ Proofing OCR results ◆ Checking recognized text against original (Verifying text) ◆ User dictionaries ◆ IntelliTrain ◆ Text Editor display and views ◆ Text and image editing ◆ Page outline ◆ Reading text aloud The Text Editor offers four views for displaying its pages.
PROOFREADING OCR RESULTS After a page is recognized, the recognition results appear in the Text Editor. Proofreading starts automatically if that was requested in the Proofing panel of the Options dialog box or in the OCR Wizard. You can start proofing manually any time the program is not busy. Work as follows: 1. Click the Proofread OCR button in the Standard toolbar, or choose Proofread OCR... in the Tools menu. 2. Proofing starts from the beginning of the document, but skips text already proofed.
5. Color markers are removed from words in the Text Editor as they are proofread. You can switch to the Text Editor during proofing to make corrections there. Use the Resume button to restart proofing. Click Close to stop proofreading before the end of the document is reached. Note A page is marked with the proofed icon in the Document Manager if proofing ran to the end of the page.
4. Click the Close button to close the verifier window. Tip You should proofread and verify texts before doing large-scale editing. If you cut and paste large blocks of text, the links between text and image may be disturbed. Tip You can use OmniPage Pro’s Text-to-Speech facility to have the recognized text read aloud as another way of verifying text. You can hear the text letter-by-letter, word-by-word, line-by-line, sentence-by-sentence or in whole pages. See the section Reading text aloud.
INTELLITRAIN IntelliTrain is a newly developed and automated form of training. It takes input from the corrections you make during proofing. When you make a change, it remembers the character shape involved, and your proofing change. It searches for other similar character shapes in the document, especially in suspect words. It assesses whether to apply the user correction or not. You can turn IntelliTrain on or off in the Proofing panel of the Options dialog box.
The following shows how IntelliTrain works, using the original image. Our example involves the letters c and e. With some typefaces and scanning settings, the horizontal line in e can become very thin, leading to OCR errors that IntelliTrain can repair. OmniPage Pro read this as bcnefit. You changed it during proofing to benefit. IntelliTrain remembers this shape and the rule: this is not c, this is e. IntelliTrain then changes thcrc to there, likc to like, Whcncvcr to Whenever, and so on.
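The c and e example can be imitated at the text level with the following sketch. It is only an analogy: IntelliTrain works on character shapes in the image, while this code swaps letters and accepts a change only when the result is a known word. The function name and the small word list are hypothetical.

```python
from itertools import combinations


def repair_word(word, wrong, right, dictionary):
    """Replace some occurrences of `wrong` with `right` if that yields a known word.

    Words already in the dictionary are left alone. If no combination of
    replacements produces a dictionary word, the original word is returned.
    """
    if word.lower() in dictionary:
        return word
    positions = [i for i, ch in enumerate(word) if ch == wrong]
    # Try one replacement first, then two, and so on.
    for count in range(1, len(positions) + 1):
        for chosen in combinations(positions, count):
            chars = list(word)
            for index in chosen:
                chars[index] = right
            candidate = "".join(chars)
            if candidate.lower() in dictionary:
                return candidate
    return word


if __name__ == "__main__":
    known_words = {"there", "like", "whenever", "benefit", "check"}
    for suspect in ["thcrc", "likc", "Whcncvcr", "chcck"]:
        print(suspect, "->", repair_word(suspect, "c", "e", known_words))
```

Checking each candidate against a word list is one simple way to decide whether a correction should be applied; it keeps a genuine c, as in check, from being changed.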
Select this, click Save and type in a name to save a new training file. Select this to unload a training file. Click this to edit the selected training file (see below). Use this also to save new training into a loaded training file. It is listed as: File name [modified] Unsaved training can be edited in the Edit Training dialog box; an asterisk is displayed in the title bar in place of a training file name. It remains unsaved when you close the dialog box.
THE EDITOR DISPLAY AND VIEWS The editor displays recognized texts and can mark words that were suspected during recognition. Marking is done with a wavy underline; red underlines for words not found in a dictionary (this applies only to languages with dictionary support) and blue underlines for words containing suspect or reject characters. These markers can be shown or hidden as selected in the Text Editor panel of the Options dialog box.
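The two marker colors can be summarized as a small decision rule. The sketch below is an assumption for illustration: the function name is hypothetical, the reject character "~" is a placeholder for whatever is set in the Options dialog box, and giving blue priority over red when both conditions apply is a guess.

```python
def marker_for(word, dictionary, has_suspect_character=False, reject_char="~"):
    """Return "blue", "red" or None for one recognized word.

    blue: the word contains a suspect or reject character.
    red:  the word is not found in the dictionary.
    """
    if has_suspect_character or reject_char in word:
        return "blue"
    if word.lower() not in dictionary:
        return "red"
    return None


if __name__ == "__main__":
    known_words = {"benefit", "there"}
    print(marker_for("benefit", known_words))   # no marker
    print(marker_for("bcnefit", known_words))   # not in dictionary
    print(marker_for("ben~fit", known_words))   # contains a reject character
```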
TEXT AND IMAGE EDITING This is a WYSIWYG Text Editor, providing many editing facilities. These work very similarly to those in leading word processors. Editing character attributes In all views except No Formatting view, you can change the font type, size and attributes (bold, italic, underlined) for selected text. Use the Formatting toolbar or the Font dialog box from the Format menu. The latter also offers subscripts, superscripts and colored text or backgrounds.
Graphics You can edit the contents of a selected graphic zone if you have an image editor in your computer. Click Edit Picture in the Tools menu. This will activate the image editor associated with BMP files in your Windows system, and load the graphic. Edit the graphic, then close the editor to have it reembedded in OmniPage Pro’s Text Editor. Do not change the graphic’s size, resolution or type, because this will prevent the reembedding. Tables Tables are displayed in the Text Editor in grids.
To hear text:                              Use these keys:
One character at a time, forward or back   Right or left arrow. Letter, number or punctuation names are spoken.
You also have the following keyboard controls:
To do this:        Use this:
Pause/Resume       Ctrl + Numpad 5
Set speed higher   Ctrl + Numpad +
Set speed lower    Ctrl + Numpad -
Restore speed      Ctrl + Numpad *
It is planned to provide speech programs for the following languages: English, French, German, Italian, Portuguese and Spanish. Please consult the Readme file for the latest information.
5 Saving and exporting Once you have acquired at least one image for a document, you can export the image(s) to file. Once you have recognized at least one page, you can export recognition results to a target application by: 1. Saving the recognition results to file. 2. Copying recognition results to Clipboard. 3. Sending results as one or more mail attachments. The document remains in OmniPage Pro after export.
PREPARING RECOGNITION RESULTS FOR EXPORT Text is exported to file, Clipboard or mail with the formatting level defined by the view set in the Text Editor at export time, if that is possible. However, some export file types and target applications cannot support all formatting elements. You may be warned if there is a mismatch and offered the highest permissible view. You can accept that, or cancel export, set a different view and restart the export.
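The fallback to the highest permissible view can be sketched as follows. Only the name No Formatting comes from this manual; the other level names, the function name and the idea of numbering levels from 0 are placeholders chosen for illustration.

```python
# Formatting levels from lowest to highest. Only the first name appears in
# this manual; the remaining names are placeholders.
FORMATTING_LEVELS = ["No Formatting", "view 2", "view 3", "view 4"]


def resolve_export_level(requested: int, highest_supported: int):
    """Return (level_to_use, warn).

    `requested` is the index of the view set in the Text Editor and
    `highest_supported` is the highest index the export file type allows.
    `warn` is True when the requested level had to be lowered.
    """
    if requested <= highest_supported:
        return requested, False
    return highest_supported, True


if __name__ == "__main__":
    level, warn = resolve_export_level(requested=3, highest_supported=0)
    print(FORMATTING_LEVELS[level], warn)
```

When a warning is raised, you can accept the offered level or cancel, set a different view and export again.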
SAVING TO FILE You can save recognized pages and original images to disk in a wide variety of file types. See chapter 6 for a complete list of supported file types for saving images and recognition results. Saving original images 1. Choose Save Image... in the File menu. In the dialog box that appears, select a folder location and a file type for your images. Type in a file name. 2. Select to save the current image only or all images in the document.
Saving recognition results
1. Choose Save As... in the File menu, or click the Export Results button in the Manual OCR toolbar with Save as File selected in the drop-down list.
2. The Save As dialog box appears. Click Advanced to open the lower panel and Basic to close it.
[Figure: the Save As dialog box in its expanded form. Callouts: "Select this to automatically open the saved file in its target application." "Choose from: Create one file for all pages; Create one file per page; Create a new file at each blank page."]
Note: Graphics and formatting are saved in the document only if the selected file type supports them. The formatting level for export is the Editor view set at saving time. You will be warned if the formatting level is not supported by the export file type.
Note: If more than one export file is created, OmniPage Pro will append a numerical suffix to your file name to create unique file names.
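As an illustration of how numerical suffixes keep export file names unique, here is a minimal Python sketch. The exact suffix format used by OmniPage Pro is not stated in this guide, so the scheme shown (a plain counter starting at 1, placed before the extension) and the function name numbered_names are assumptions.

```python
from pathlib import Path


def numbered_names(file_name, count):
    """Return `count` unique file names derived from `file_name`.

    Assumed scheme: a single file keeps its name; several files get
    a counter (1, 2, 3, ...) appended before the extension.
    """
    if count <= 1:
        return [file_name]
    path = Path(file_name)
    return [f"{path.stem}{i}{path.suffix}" for i in range(1, count + 1)]
```

With this assumed scheme, saving a three-page document as memo.txt with one file per page would give memo1.txt, memo2.txt and memo3.txt.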
If you first save the document as an OmniPage Document (for instance as memo.opd), then modify it and later save it to a text file (for instance as memo.txt), then modify it again and click Save, the recent changes are saved to the memo.txt file, not to the OPD. When you close the document or exit the program, you will be prompted to save the document if it has not been saved as an OmniPage Document, or there are changes since the last OPD save.
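The save behavior described above can be modeled as a small state object. This Python sketch is an illustration under assumptions, not the program's actual logic. The class and method names are invented, and the case of clicking Save before any Save As (returning None here) is not covered by this guide.

```python
class DocumentSaveState:
    """Tracks which file a plain Save writes to and whether closing prompts."""

    def __init__(self):
        self.last_saved_file = None          # target of the next plain Save
        self.saved_as_opd = False            # ever saved as an OmniPage Document?
        self.changed_since_opd_save = False  # edits made after the last OPD save

    def modify(self):
        self.changed_since_opd_save = True

    def save_as(self, file_name):
        self.last_saved_file = file_name
        if file_name.lower().endswith(".opd"):
            self.saved_as_opd = True
            self.changed_since_opd_save = False

    def save(self):
        """Save to the most recently used file and return its name."""
        if self.last_saved_file is not None:
            self.save_as(self.last_saved_file)
        return self.last_saved_file

    def prompt_on_close(self):
        return (not self.saved_as_opd) or self.changed_since_opd_save


# The memo example from the text:
doc = DocumentSaveState()
doc.save_as("memo.opd")
doc.modify()
doc.save_as("memo.txt")
doc.modify()
target = doc.save()  # the recent changes go to memo.txt, not to the OPD
```

In this model, doc.prompt_on_close() is still true after the final Save, because the OPD has not been updated since the document changed.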
SENDING A DOCUMENT AS A MAIL ATTACHMENT
You can send recognition results as one or more files attached to a mail message if you have installed a MAPI-compliant mail application, such as Microsoft Outlook.

To send a document by e-mail
• With automatic processing, select Send as Mail as the command in the Export Results drop-down list on the AutoOCR toolbar. The Send Mail dialog box appears as soon as the last available page in the document is recognized or proofed.
3. Your mail application appears with the attachment(s) in a new empty message. Attachments take the name used for the last save of the document in OmniPage Pro, or ‘Untitled from OmniPage’. The appropriate file extension is added, along with numerical suffixes if there are multiple attachments.
4. Address your mail message, add message text as desired and click the Send button.
6 Technical information

This chapter provides troubleshooting and other technical information about using OmniPage Pro 11. Please also read the online Readme file and other help topics, or visit the ScanSoft web pages. The Scanner Information web page contains detailed and regularly updated information about scanner setup and support. The Readme file contains last-minute information relating to OmniPage Pro. Access to the Readme file and to ScanSoft’s web pages is provided in the Help menu.

TROUBLESHOOTING
Although OmniPage Pro is designed to be easy to use, problems sometimes occur. Many of the error messages contain self-explanatory descriptions of what to do: check connections, close other applications to free up memory, and so on. Sometimes that is all the troubleshooting help you need. Please see your Windows documentation for information on optimizing your system and application performance.
Testing OmniPage Pro
Restarting Windows 95, 98, 2000 or ME in safe mode or Windows NT in VGA mode allows you to test OmniPage Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if OmniPage Pro has stopped running altogether. See Windows online Help for more information.
Note: Your scanner will not run with OmniPage Pro in safe mode or VGA mode, so do not test scanner problems in this configuration.
5. Launch OmniPage Pro and try performing OCR on an image. Use a known image file such as one of the supplied sample files.
Note: You can also run OmniPage Pro 11 from a command line in its own safe mode. Choose Start ➤ Run, browse for the file OmniPage.exe and add the command line option /safe. This starts the program, but ignores previously stored settings and does not try to recover a document from an abnormal termination.

Low memory problems
OmniPage Pro may run poorly under low-memory conditions.
Remove Windows applications that you do not use.
◆ Defragment your hard disk. See Windows online Help for instructions.
◆ Clear the cache for your web browser and limit its size.

SUPPORTED FILE TYPES
The program supports a wide range of file types. Several important types have been added in OmniPage Pro 11.

File types for opening and saving images

File type      Extension   Multipage   Open / Save      B/W, Grayscale, Color
BMP, Bitmap    *.bmp       No          Open and Save    All
DCX            *.dcx       Yes         Open and Save    All
GIF            *.
Note: Saving to PDF format is supported, with four options. One of these is to export image only; however, this exports the recognition results as images, not the original images. It is done in the Save As dialog box. See the section Saving to PDF.

File types for saving recognition results

File type                           Extension     Format levels (Text Editor views)   Supports graphics
ASCII text 1                        *.txt/.csv    No Formatting view (NFV)            No
Adobe PDF, normal                   *.pdf         True Page                           Yes
Adobe PDF with image substitutes    *.
3 Recognition results are sent to Clipboard in this format and will be pasted in RTF if possible, and as Unicode or ASCII text if not.
4 Unicode text can handle the widest range of accented characters.
5 True Page or Retain Flowing Columns (RFC) views will not be refused, but will appear as Retain Fonts and Paragraphs (RFP) view, that is, without columns.
6 OmniPage Documents can be reopened by OmniPage Pro.
OCR PROBLEMS
This section contains information and solutions for possible OCR problems. It first gives suggestions for improving recognition accuracy, then for getting good results from fax input, and finally for dealing with system or performance problems arising during OCR.

Text does not get recognized properly
Try these solutions if any part of the original document is not converted to text properly during OCR:
◆ Look at the original page image and ensure that all text areas are enclosed by text zones.
If you use True Page as the Text Editor view or for export, recognized text is put into frames (formatting boxes). Some text may be hidden if a frame is too small. To view the text, place the cursor in the text frame and use the arrow keys on your keyboard to scroll to the top, bottom, left, or right of the frame.
◆ Check the glass, mirrors, and lenses on your scanner for dust, smudges, or scratches. Clean if necessary.
Break complex page images (lots of text and graphics or elaborate formatting) into smaller jobs. Draw zones manually or modify automatically created zones and perform OCR on one page area at a time. See the section in chapter 3 on creating and modifying zones.
◆ Restart Windows 95, 98, ME or 2000 in safe mode, or Windows NT in VGA mode, and test OmniPage Pro by performing OCR on the included sample image files. See the section Testing OmniPage Pro.
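For repeated testing, the /safe command-line option described in the section Testing OmniPage Pro can also be supplied from a script instead of the Run dialog. The Python sketch below is illustrative only: the helper names are invented, and the install path in the usage comment is a hypothetical location that must be replaced with the actual path to OmniPage.exe on your system.

```python
import subprocess


def build_safe_mode_command(exe_path):
    """Return the command line that starts OmniPage Pro in its own safe mode."""
    return [exe_path, "/safe"]


def launch_safe_mode(exe_path):
    """Start the program with /safe and return the process handle."""
    return subprocess.Popen(build_safe_mode_command(exe_path))


# Usage (hypothetical install path, adjust to your system):
# launch_safe_mode(r"C:\Program Files\ScanSoft\OmniPage\OmniPage.exe")
```

The command is passed as a list so that a path containing spaces does not need extra quoting.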