OmniPage® Pro User’s Manual
CAERE CORPORATION
100 Cooper Court
Los Gatos, California 95032
European Offices:
Caere GmbH
Innere Wiener Strasse 5
81667 Munich
Germany
Please Note
In order to use this program, you should know how to work in the Microsoft Windows environment. Please refer to Windows documentation if you have questions about how to use menu commands, dialog boxes, scroll bars, edit boxes, and so on.
OmniPage Pro for Windows Version 8
Copyright © 1997 Caere Corporation. All rights reserved.
Welcome
Welcome to OmniPage Pro, and thank you for buying our software! The following documentation has been provided to help you learn about OmniPage Pro.
This User’s Manual
This manual introduces you to the basics of using OmniPage Pro. It includes an introduction to OmniPage Pro, installation and setup instructions, task-oriented instructions, ways to customize processing, settings guidelines, and technical information.
Using This Manual
This manual is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows user’s manual or online help if you have questions about how to use dialog boxes, menus, and so on. The following conventions are used in this manual.
Convention: Italicized text
Purpose:
• Emphasizes menu commands, dialog box options, labeled buttons, and file names. For example: “Choose Open... in the File menu.”
Chapter 1
Introduction to OmniPage Pro
You probably use your computer for most business correspondence and other written projects. The problem is that certain sources of information cannot be immediately used on a computer. For example, if you want to incorporate information from a magazine article into a document in your word processor, you somehow have to get the text from the article into your computer. Painstakingly retyping the article is not an appealing solution.
What Is Optical Character Recognition (OCR)?
Optical character recognition (OCR) is the process of turning an image into computer-editable text. An image is an electronic picture of text such as a scanned paper document or an electronic fax file. Images do not have editable text characters; they have many tiny dots (pixels) that together form a picture of text. During OCR, OmniPage Pro analyzes an image and defines characters to produce editable text.
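As a rough illustration of the idea, the following Python sketch treats each character image as a small grid of dots and picks the stored shape that differs from it in the fewest dots. Everything in it is an assumption made for this example: the 3-by-3 shapes, the GLYPHS table, and the recognize function are hypothetical and do not describe how OmniPage Pro actually identifies characters.

```python
# Toy illustration only. The glyph shapes and all names here are hypothetical;
# they are not taken from OmniPage Pro.
GLYPHS = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def recognize(bitmap):
    """Return the character whose stored shape differs from bitmap in the fewest dots."""
    def distance(shape):
        # Count the dots that differ between the stored shape and the image.
        return sum(
            a != b
            for shape_row, image_row in zip(shape, bitmap)
            for a, b in zip(shape_row, image_row)
        )
    return min(GLYPHS, key=lambda ch: distance(GLYPHS[ch]))


# A "page image" made of three character images. The last one has a missing dot.
page_image = [
    ("111", "010", "010"),
    ("010", "010", "010"),
    ("100", "100", "101"),
]

# Turn the picture of text into text that can be edited.
text = "".join(recognize(bitmap) for bitmap in page_image)
print(text)
```

The third image is imperfect, yet it is still matched to the closest stored shape. This hints at why the quality of the original image affects OCR results.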
Basic Steps of OmniPage Pro OCR
These are the basic steps of OmniPage Pro’s OCR process.
1 Bring a document image into OmniPage Pro. You can scan a paper document, load an image file, or load a fax from Microsoft Exchange or Outlook. The resulting image appears in OmniPage Pro’s image viewer. See “Bringing Document Images into OmniPage Pro” on page 23 for more information.
2 Create zones to identify areas you want to recognize as text or retain as graphics.
The OmniPage Pro Desktop
OmniPage Pro’s desktop displays the pages of a document in its thumbnail viewer, image viewer, and text viewer. You can use buttons in the Standard, AutoOCR, and Zone toolbars to perform various tasks on the document.
[Figure: the OmniPage Pro desktop, with callouts for the Zone toolbar, Standard toolbar, and AutoOCR toolbar]
The thumbnail viewer displays a picture of each page in the document. The current page has a box around it.
AutoOCR Toolbar
The AutoOCR toolbar contains buttons that can activate each step of the OCR process.
[Figure: the AutoOCR toolbar, showing the AUTO, Image, Zone, OCR, and Export buttons. Callout: “Click the down arrow to display the commands in a button’s drop-down list.”]
Set commands in the AutoOCR toolbar buttons for the operations you want to perform. You can choose commands in a button’s drop-down list.
• The AUTO button allows you to activate automatic processing or use the OCR Wizard.
Standard Toolbar
The Standard toolbar contains buttons and drop-down lists for performing various tasks.
[Figure: the Standard toolbar, with callouts for New, Open, Save, Print, Check Recognition, Cut, Copy, Paste, Undo, Image Editor, Rotate Image, View, Options, Zoom, Straighten Image, and Help]
Zone Toolbar
The Zone toolbar contains buttons that allow you to draw and define zones on a page image.
Options Dialog Box
You can select settings for OmniPage Pro in the Options dialog box. To open it, click the Options button or choose Options... in the Tools menu. Click the tabs in the Options dialog box to view and select different settings. See Chapter 4, OmniPage Pro Settings, for more information on settings.
Getting Online Help
After installing OmniPage Pro, you can use its online help system to get information on features and procedures. Please refer to your Windows documentation to learn more about using Windows online help systems.
Help Menu
Use commands in the Help menu to open topics that provide information on features and procedures.
• Choose OmniPage Pro Help Topics to get contents and index listings for OmniPage Pro help topics.
Product Support
For the fastest and easiest way to get help, please look for solutions in this manual or in the online help. For troubleshooting tips, see “General Troubleshooting Solutions” on page 86. If you need additional help, product support and information are available to registered users through the services listed in this table.
Service: World Wide Web home page
How to Contact: http://www.caere.
Chapter 2
Installation and Setup
This chapter provides installation and setup information for OmniPage Pro and the Scan Manager. For technical and troubleshooting information, please read Chapter 6, Technical Information. For specific scanner information, please read the Scanner Setup Notes included in your OmniPage Pro package.
Minimum System Requirements
You need the following setup, at minimum, to install and run OmniPage Pro:
• Computer with a 486 or higher processor
• Microsoft Windows 95 or Windows NT 4.
2 Click Next to continue with installation.
3 Follow the onscreen instructions to finish installation. During installation, you are prompted to enter a serial number. You can find the serial number on the label of the CD-ROM.
Setting Up Your Scanner with OmniPage Pro
To use your scanner with OmniPage Pro, you must install the Scan Manager and select your scanner. You are prompted to do this during OmniPage Pro’s regular installation.
Starting OmniPage Pro
To change your scanner selection in the Scan Manager:
1 Make sure your scanner is turned on when you start your computer.
2 Close OmniPage Pro if it is open.
3 Click Start in the Windows taskbar, choose Settings, and then choose Control Panel.
4 Double-click the Caere Scan Manager 3.0 icon to open the Scan Manager.
5 Click the Select Scanner tab.
6 Select the name of the scanner you want to use in the Supported Scanners list box.
7 Click Set as Current Scanner and then click Apply.
Registering OmniPage Pro
Registering your copy of OmniPage Pro entitles you to product support, notification of special offers, and the lowest price offered on the next OmniPage Pro upgrade. You can use OmniPage Pro for 25 sessions without registering it. The Register dialog box appears the 26th time you launch OmniPage Pro, and the program exits if you do not register at that time.
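The session limit described above can be summarized in a few lines of Python. This is only a sketch of the rule as stated in this manual; the names FREE_SESSIONS and launch_outcome are made up for the example and are not part of OmniPage Pro.

```python
# Sketch of the registration rule described in the text.
# FREE_SESSIONS and launch_outcome are hypothetical names.
FREE_SESSIONS = 25  # sessions allowed without registering


def launch_outcome(launch_number, registered):
    """Describe what happens on a given launch (counting from 1)."""
    if registered or launch_number <= FREE_SESSIONS:
        return "run"
    # From the 26th launch on, an unregistered copy shows the Register
    # dialog box and exits unless it is registered at that time.
    return "show Register dialog"


print(launch_outcome(25, registered=False))
print(launch_outcome(26, registered=False))
```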
The Registration menu disappears from the menu bar after you register.
To register OmniPage Pro at Caere’s Web site:
1 Click the Register menu to open the Register dialog box.
[Figure: the Register dialog box. Callouts: “You will need to enter your serial and key numbers.” “When you get your registration number, enter it here.” “Opens a help topic that provides instructions and a link to Caere’s Web site.”]
2 Open your Web browser and go to the following address: http://www.caere.
Chapter 3
Processing Documents
This chapter describes how to work with documents in OmniPage Pro, including each step of the OCR process. There are different ways to accomplish the same tasks in OmniPage Pro. You can use toolbar buttons or menu commands to start procedures. OmniPage Pro can perform all OCR steps automatically, or you can start each step individually. You can even do different tasks at the same time.
Ways to Process Documents
Optical character recognition (OCR) is the process of turning an image into computer-editable text so you do not have to retype the text manually. Chapter 1 explains the basic steps of OmniPage Pro’s OCR process. The following is a summary of those steps.
1 Bring a document image into OmniPage Pro. See page 23 for more information.
2 Create zones to identify areas you want to recognize as text or retain as graphics. See page 26 for more information.
Automatic Processing
Use the AUTO button to process a new document from start to finish or finish processing an open document.
To process your document automatically:
1 Set AutoOCR as the command in the AUTO button’s drop-down list.
2 Set the desired Image, Zone, OCR, and Export commands. See “Setting AutoOCR Toolbar Commands” on page 43 for more information.
3 Choose Options... in the Tools menu and check that settings are appropriate for your document.
Bringing Document Images into OmniPage Pro
You can bring document images into OmniPage Pro by:
• Scanning Pages
• Loading Image Files
• Loading Exchange Faxes
Scanning Pages
You can scan paper documents to convert them to electronic images in OmniPage Pro. If a document is already open, scanned pages are inserted as new pages. To scan in OmniPage Pro, you must install the Scan Manager and select your default scanner.
Loading Image Files
An image file is an electronic picture of text, such as a scanned paper document or an electronic fax, that is saved in an image file format such as PCX or TIFF. You can load image files into OmniPage Pro. If a document is already open, loaded image files are inserted as new pages. The following procedure is for loading image files only. To open an OmniPage Document (*.MET), use the Open... command in the File menu.
6 Click Open when you have selected all the files you want to load. Image files are loaded in the order selected and combined into one working document.
Loading Exchange Faxes
You can load fax images into OmniPage Pro from Microsoft Exchange or Outlook if you have the Microsoft Fax component installed with those applications. Please see Microsoft documentation for information on configuring these applications.
Creating Zones for OCR
Page images are displayed in OmniPage Pro’s image viewer where zones are created before OCR. Zones are borders that identify areas of an image that will be recognized as text or retained as graphics. Any part of an image not enclosed by a zone is ignored during OCR.
[Figure: a page image with zones. Callouts: “This is a text zone. It will be converted to text during OCR.” “This is a graphic zone. It will be kept as a graphic image during OCR.” “All unzoned areas of the page will be ignored during OCR.”]
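The three outcomes described above (text zone, graphic zone, unzoned area) can be pictured with a short Python sketch. It assumes rectangular zones measured in pixels; the Zone class, its field names, and the treatment function are hypothetical and are used here only to illustrate the rule.

```python
from dataclasses import dataclass


# Hypothetical model of a rectangular zone; not OmniPage Pro's own data format.
@dataclass
class Zone:
    left: int
    top: int
    right: int
    bottom: int
    kind: str  # "text" or "graphic"

    def contains(self, x, y):
        """Return True if the point (x, y) lies inside this zone's border."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def treatment(zones, x, y):
    """Say how the part of the page at (x, y) is handled during OCR."""
    for zone in zones:
        if zone.contains(x, y):
            if zone.kind == "text":
                return "recognized as text"
            return "retained as graphic"
    # Anything not enclosed by a zone is ignored.
    return "ignored"


page_zones = [
    Zone(50, 50, 400, 300, "text"),
    Zone(50, 350, 400, 600, "graphic"),
]

print(treatment(page_zones, 100, 100))
print(treatment(page_zones, 100, 400))
print(treatment(page_zones, 500, 500))
```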
Performing OCR on a Document You can also choose HP AccuPage — an advanced Hewlett Packard scanning and zoning technology — as the zone setting if your scanner supports it and HP AccuPage is selected in the Scan Manager. 2 Click the Zone button or choose Auto Zones in the Process menu. OmniPage Pro automatically draws zones on the current page in the image viewer. Each zone has a number indicating its order and a letter indicating its zone properties.
Checking OCR Results
3 Set OCR and Check as the command in the OCR button’s drop-down list. Or, set Perform OCR as the command if you do not want error checking to begin automatically after OCR.
4 Click the OCR button. The page is recognized according to the current zones and settings. If there are no zones on the page, zones are created according to the current command in the Zone button.
To schedule a group of documents for OCR at a particular time, see “Scheduling OCR” on page 79.
2 Select one of these options for the word:
• Click Ignore to allow the word to remain as is.
• Click Ignore All to ignore all instances of the word in the current document.
• Click Change to replace the word with the word in the Change to edit box.
• Click Change All to replace all instances of the word with the word in the Change to edit box.
• Click Add to add the word to the current user dictionary.
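The difference between these options is easiest to see on a short list of words. The Python sketch below is an illustration only: the apply_choice function, its parameters, and the use of sets for ignored words and the user dictionary are assumptions made for this example, not a description of how OmniPage Pro stores this information.

```python
# Illustration of the five choices listed above. All names are hypothetical.
def apply_choice(words, index, choice, change_to=None,
                 ignored=None, user_dictionary=None):
    """Return the word list after one choice is made for words[index]."""
    ignored = set() if ignored is None else ignored
    user_dictionary = set() if user_dictionary is None else user_dictionary
    word = words[index]
    result = list(words)  # leave the caller's list untouched

    if choice == "Ignore":
        pass  # this one instance stays as is
    elif choice == "Ignore All":
        ignored.add(word)  # every instance in the document stays as is
    elif choice == "Change":
        result[index] = change_to  # replace this instance only
    elif choice == "Change All":
        result = [change_to if w == word else w for w in result]
    elif choice == "Add":
        user_dictionary.add(word)  # the word is now treated as known
    else:
        raise ValueError("unknown choice: " + choice)
    return result


suspect_text = ["tbe", "quick", "fox", "saw", "tbe", "dog"]
print(apply_choice(suspect_text, 0, "Change", change_to="the"))
print(apply_choice(suspect_text, 0, "Change All", change_to="the"))
```

Change corrects only the word being checked, while Change All also corrects the second occurrence later in the text.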
Checking OCR Results in Microsoft Word
You can check for OCR errors directly in Microsoft Word 7 or Microsoft Word 97 if you have those versions installed on your computer. To enable this feature, you must select settings in the Microsoft Word section of OmniPage Pro’s Options dialog box. See “Microsoft Word Settings” on page 53 for more information. Make sure the *.doc file extension is associated with the version of Word you plan to use.
When the first suspected error is located, the Verify Text window appears displaying the original image of the text.
[Figure: the Verify Text window showing the original image. Callout: “Use these buttons to zoom in or out on the image.”]
The Check Recognition dialog box also appears.
4 Select one of these options for the word:
• Click Ignore to allow the word to remain as is.
• Click Ignore All to ignore all instances of the word.
• Click Change to replace the word with the word in the Change to edit box.
To verify text against its original image in Microsoft Word:
1 Follow steps 1 and 2 in the preceding instructions if your document is not already open in Microsoft Word.
2 Select a suspect word. Suspect words are marked in the color that was selected in the Microsoft Word section of OmniPage Pro’s Options dialog box. You can only verify words that are marked as suspected errors.
Using OCR in Other Applications
You can use OmniPage Pro’s OCR Aware feature to use OCR in other applications. For example, you can scan, recognize, and paste text directly into a word-processing document without ever leaving the application. You can use OCR Aware with 32-bit (and some 16-bit) applications that have been registered with OmniPage Pro. An application must be installed on your computer in order to use it with OCR Aware.
Working with Documents
OmniPage Pro’s thumbnail, image, and text viewers allow you to look at and work with pages in the current document.
[Figure: the thumbnail viewer, image viewer, and text viewer. Callout: “Drag this splitter to the left or right to resize a view.”]
Saving a Document as You Work
Click the Save button in the Standard toolbar or choose Save in the File menu to save changes to the current document as you work. The first time a document is saved, the Save As dialog box appears. See “Saving a Document” on page 39 for more information. If a document has been saved as an OmniPage Document (*.MET), all the changes you make in the open document are saved.
Changing Pages
The thumbnail viewer, image viewer, and text viewer all display the same page in a document. You can change pages in a document in the following ways:
• Click the thumbnail of the page you want to display. The thumbnail of the currently displayed page has a box around it.
• Click the Next Page or Previous Page buttons at the lower-right corner of the OmniPage Pro desktop.
• Choose Next Page, Previous Page, or Go to Page... in the Edit menu.
Reordering Pages
You can reorder pages in a document by dragging their thumbnails to different positions in the thumbnail viewer. Click the thumbnail of the page you want to move and drag it above the desired page number. Hold down the Ctrl key while you click thumbnails if you want to select multiple thumbnails to move as a group.
Deleting Pages
If you delete a page from a document in OmniPage Pro, the thumbnail, original image, and recognized text for that page are all deleted.
Undoing Changes
You can click the Undo button or choose Undo in the Edit menu to cancel the very last change you made in the text viewer. You can also choose Undo to cancel zone deletions in the image viewer. However, page deletions cannot be undone.
Printing a Document
You can print the current document’s original page images or recognized text.
To print a document:
1 Choose Print... in the File menu and choose one of the following in the submenu:
• Choose Image...
Exporting Documents
You can export a document to other applications by:
• Saving a Document
• Copying a Document to the Clipboard
• Sending a Document as a Mail Attachment
After you export a document, a copy of the document remains open in OmniPage Pro. Save the document as an OmniPage Document (*.MET) if you want to reopen it in OmniPage Pro later. OmniPage Documents retain all original images, zones, and recognized text.
4 Click OK. The document is saved to disk as specified. Graphics and formatting are saved in the document only if the selected file type supports them.
To save original images:
1 Choose Save Image... in the File menu. The Save Image dialog box appears.
2 Select a folder location and file type for your document. See “Supported File Formats” on page 89 for a complete list of supported file types.
3 Type in a file name and select Save and Image options.
4 Click OK.
Text formatting, such as bold and italics, is retained when you paste into an application that supports RTF information. Otherwise, only plain text will be pasted. Graphics are retained if the application supports bitmap images.
Sending a Document as a Mail Attachment
You can send a recognized document as a file attached to a mail message if you have a MAPI-compliant mail application, such as Microsoft Exchange or Outlook, installed.
Chapter 4
OmniPage Pro Settings
This chapter describes the settings in the AutoOCR toolbar and Options dialog box. Please look in OmniPage Pro’s online help for more detailed information on settings. The settings you select for processing documents can greatly affect OCR results. You may have to experiment with different settings to get the results you want. Settings guidelines are provided at the end of the chapter to get you started.
Setting AutoOCR Toolbar Commands
The AutoOCR toolbar buttons allow you to take a document through each step of the OCR process. Every toolbar button has different process commands that can be set for the operations you want to perform. OmniPage Pro can go through all steps automatically, or you can start each step individually.
Image Button Commands
Use the Image button to bring a document image into OmniPage Pro’s image viewer. The Image button’s drop-down list contains the Load Image, Load Exchange Fax, and Scan Image commands.
Load Image
Select Load Image to load existing image files such as TIFF or PCX files.
Load Exchange Fax
Select Load Exchange Fax to load faxes from Microsoft Exchange or Outlook.
Zone Button Commands
Use the Zone button to automatically create zones on document images. Zones are boxes that specify what will be recognized as text or retained as graphics on an image. The Zone button’s drop-down list contains the Single-Column Pages, Multiple-Column Pages, Tables, Mixed Pages, and HP AccuPage commands and the names of any zone templates you have created. See “Creating Zones for OCR” on page 26 for more information.
OCR Button Commands
Use the OCR button to perform the selected OCR operation on document images. The OCR button’s drop-down list contains the Perform OCR, OCR and Check, Train OCR, and Defer OCR commands.
Perform OCR
Select Perform OCR to recognize text on document images. During OCR, OmniPage Pro analyzes the image and identifies characters to produce editable text. See “Performing OCR on a Document” on page 27 for more information.
Send Mail
Select Send Mail to send a recognized document as a file attached to a mail message if you have a MAPI-compliant mail application, such as Microsoft Exchange or Outlook, installed. See “Sending a Document as a Mail Attachment” on page 41 for more information.
Copy to Clipboard
Select Copy to Clipboard to place a copy of a recognized document on the Clipboard. See “Copying a Document to the Clipboard” on page 40 for more information.
Selecting OmniPage Pro Settings
Click the Options button or choose Options... in the Tools menu to open the Options dialog box. This is the central location for OmniPage Pro settings. Click each tab to view and select different settings. Click for a description of each setting. Documents require different settings depending on their input attributes and your output goals. To get the best results, learn how to identify document attributes and make selections for them.
Accuracy Settings
Click the Accuracy tab to select settings that affect OCR accuracy the most.
[Figure: the Accuracy settings, with these callouts:]
• Language Analyst evaluates and replaces unknown words with words most likely to be correct during OCR.
• Select Small text if you are processing a page containing text that is < 6 pt.
• Training files help recognize special characters during OCR.
• Select a brightness setting to account for variations in paper and print quality when you scan.
Page Format Settings
Click the Page Format tab to select settings that determine how the formatting of a page is handled during OCR.
[Figure: the Page Format settings, with these callouts:]
• Select a setting that describes how your original page looks.
• Select a setting to determine what your page will look like after OCR.
• Click to select font options for recognized text.
Language Settings
Click the Language tab to select language settings for your document.
[Figure: the Language settings, with this callout:]
• Select the document’s main language.
OCR Aware Settings
Click the OCR Aware tab to select settings for the OCR Aware feature. OCR Aware allows you to initiate OCR from another application. See “Using OCR in Other Applications” on page 33 for more information.
[Figure: the OCR Aware settings, with these callouts:]
• An application must be registered to work with OCR Aware. If your application is not listed, click Browse... to locate the application file (*.exe).
• Click Register Office 97...
Process Settings
Click the Process tab to set commands and settings for each step of OCR.
[Figure: the Process settings, with these callouts:]
• The OCR Wizard will guide you through the OCR process when you click AUTO.
• Specifies where newly loaded or scanned images are added to an open document.
Microsoft Word Settings
Click the Microsoft Word tab to select settings for performing check recognition directly in Microsoft Word. See “Checking OCR Results in Microsoft Word” on page 30 for more information.
[Figure: the Microsoft Word settings, with these callouts:]
• Select this if you want to check for OCR errors in Microsoft Word.
• Select the color in which you want suspected errors to appear in Microsoft Word.
Checking recognition in Microsoft Word is only supported in Microsoft Word versions 7 and 97. Make sure you associate the *.
Settings Guidelines
The settings you select in OmniPage Pro can greatly affect OCR results. Make sure that settings are appropriate for your document before you begin processing. You may have to experiment with different settings to get the results you want. Answer the following questions to get settings recommendations for your documents.
What type of document are you processing?
Magazine and newspaper pages
Recommendations
Select Multiple columns in the Page Format settings. Select the appropriate page size and orientation in the Scanner settings if you are scanning. Draw zones manually or modify automatically created zones if auto zoning does not successfully create zones around all page areas you want to process. See “Customizing Zones” on page 65 for more information.
What type of document are you processing?
Legal documents
Recommendations
Select Multiple columns in the Page Format settings if text appears in two or more columns. Select Single column in the Page Format settings if the document has one page-wide text column. Select the appropriate page size and orientation in the Scanner settings if you are scanning. Draw zones manually or modify automatically created zones to omit unnecessary parts of the page.
What is the quality of the original document?
Poor or not sure
Degraded copies, colored or shaded backgrounds or text, run-together or broken text characters
Recommendations for scanning
Select Grayscale with 3D OCR in the Accuracy settings if you have a grayscale scanner and your page contains grayscale graphics, colored background, or colored text.
How much original formatting do you want to keep?
Minimal
Keep one font and one font size only
Recommendations
Select Remove formatting in the Page Format settings. Click Font Mapping... in the Page Format settings and select one font and one font size to be used for all text. Select ANSI in the Save As dialog box if you want to be able to open the document in any application.
How much original formatting do you want to keep?
As much as possible
Keep font characteristics, paragraph formatting, column formatting and graphic
Recommendations
Select True Page in the Page Format settings to retain the original appearance of a page using frames. The formatting will be more precise but positioning will be more difficult to edit.
Do you want to retain graphics in your document?
Yes
Keep graphics such as logos and photos during OCR processing
Recommendations for scanning
Select Grayscale with 3D OCR in the Scanner settings if you are scanning with a grayscale scanner or loading a grayscale image file and you want to retain grayscale graphics. Select Black and white in the Scanner settings if you are scanning line-art drawings.
How many languages are in your document?
One language
Recommendations
If your document contains a language that is not installed in OmniPage Pro, you can add languages to OmniPage Pro by uninstalling and then reinstalling it. Select the document language in the Language settings. For faster processing and more accurate results, select only the language that appears in your document in the Language settings.
More than one language
Are you processing a multi-page document?
Yes
Recommendations if you have an automatic document feeder (ADF)
Select Scan until empty in the Scanner settings to scan a stack of pages at once. Otherwise, you must click the Image button to scan each subsequent page. Select Double-sided pages to scan pages with print on both sides. You will be prompted to turn the stack over. Insert blank pages to separate more than one job within a stack of pages.
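The effect of blank separator pages can be pictured as splitting one stack into several jobs. The following Python sketch shows the idea under simple assumptions: each scanned page is represented by its text, and a page counts as blank when that text is empty. The split_jobs function and the is_blank test are hypothetical and are not OmniPage Pro’s own blank-page detection.

```python
# Sketch of separating jobs at blank pages. Names and the blank-page test
# are assumptions made for this example.
def split_jobs(pages, is_blank=lambda page: page.strip() == ""):
    """Split a scanned stack into jobs, using blank pages as separators."""
    jobs = []
    current = []
    for page in pages:
        if is_blank(page):
            # A blank page ends the current job; the blank itself is dropped.
            if current:
                jobs.append(current)
                current = []
        else:
            current.append(page)
    if current:
        jobs.append(current)
    return jobs


stack = ["letter p1", "letter p2", "", "invoice p1", "", "memo p1", "memo p2"]
for number, job in enumerate(split_jobs(stack), start=1):
    print(number, job)
```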
Chapter 5
Customizing OCR
OmniPage Pro has many features that allow you to customize the way your documents are handled during OCR. This chapter describes how to use these features.
Adjusting Page Images Before OCR
You can rotate and straighten page images in OmniPage Pro’s image viewer before zoning and OCR take place. This is recommended to improve OCR accuracy on pages that are not oriented correctly. If you need to rotate or straighten a page, be sure to do so before you create zones because all zones are deleted during these operations.
To rotate a page image:
1 Click on the page image to make the image viewer active.
Customizing Zones
Zones are borders created around areas of a page image to identify what will be recognized as text or retained as a graphic during OCR. Zones play a big part in determining OCR results. You can create zones automatically, manually, or with a template.
Drawing Zones Manually
You can draw zones manually on a page image using buttons in the Zone toolbar. Rectangular zones are the most common, but you can also draw irregular-shaped zones.
To draw rectangular zones:
1 Click the Zone Properties button and select the zone type and content for the zone you are about to draw. See “Changing Zone Properties” on page 71 for more information.
2 Click the Draw Rectangular Zones button.
5 Drag the drawing tool to form the first side of your zone.
6 Click the mouse button when you have drawn the desired line length.
7 Draw a perpendicular line in either direction to form the next side of the zone.
8 Repeat steps 6 and 7 to finish drawing each side of your zone. You will not be allowed to draw a line if it constitutes a restricted shape.
To reorder zones:
1 Click the Reorder Zones button. The numbers in the zones disappear.
2 Click within the zone you want recognized first. The number 1 appears in the zone.
3 Click within the zone you want recognized next. The number 2 appears in the zone.
4 Repeat step 3 until all the zones are appropriately ordered. If you do not number all the zones, they are automatically numbered for you when you start OCR.
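The reordering steps above can be sketched in Python as follows. This manual does not say how the remaining zones are numbered automatically, so the sketch assumes they simply follow the zones you clicked, in the order they had before. The final_order function and the zone names are hypothetical.

```python
# Sketch of zone reordering. The rule for unnumbered zones is an assumption:
# they are placed after the clicked zones, keeping their earlier order.
def final_order(zones, clicked):
    """Return the recognition order after the user clicks some zones.

    zones   -- all zones on the page, in their order before reordering
    clicked -- the zones the user clicked, in the order they were clicked
    """
    remaining = [zone for zone in zones if zone not in clicked]
    return list(clicked) + remaining


page_zones_in_order = ["headline", "sidebar", "column 1", "column 2"]
# The user numbers only two zones and then starts OCR.
print(final_order(page_zones_in_order, ["column 1", "column 2"]))
```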
4 Release the mouse button when you are finished extending the zone. The zone border changes to display the modified zone area.
[Figure caption: The left area of this zone has been extended downward.]
To subtract an area of a zone:
1 Click the Subtract from Zone button. The mouse pointer in the image viewer becomes a drawing tool with a minus sign.
2 Position the drawing tool at the point where you want to start subtracting from the zone.
To connect two or more zones:
1 Click the Add to Zone button. The mouse pointer in the image viewer becomes a drawing tool with a plus sign.
2 Hold the mouse button down and drag the drawing tool over the area where you want the zones to be connected.
3 Release the mouse button when you are done. The zone border changes to display the modified zone area.
To divide a zone:
1 Click the Subtract from Zone button.
Changing Zone Properties
You can set certain properties for zones to customize how each zone will be treated during OCR. The Zone Properties dialog box contains settings for zone type and zone content.
[Figure: the Zone Properties dialog box, with a callout for the Close button]
Zone Type
Every zone on a page has a zone type setting.
To change the properties of a zone:
1 Select the zone you want to modify by clicking it. You can Shift-click to select multiple zones. Selected zones are shaded.
2 Click the Zone Properties button to open the Zone Properties dialog box. The settings in this dialog box will be blank if multiple zones with different settings are selected at once.
[Figure: the Zone Properties dialog box, with a callout for the Close button]
3 Select a zone type for the selected zones.
4 Select a zone content for the selected zones.
To create zones with a template:
1 Select the zone template that you want to use in the Zone button drop-down list.
2 Click the Zone button or choose Template in the Process menu. OmniPage Pro creates zones on the page image using the zone template.
Specifying Fonts
You can retain the font characteristics in your document during OCR if you select an Output Format option other than Remove formatting in the Page Format section of the Options dialog box.
Training OCR for Special Characters
3 Click Font Mapping... to open the Font Mapping dialog box. The selected fonts are applied to text when their corresponding font types are detected during OCR.
4 Select the font you want mapped to each font type. The fonts available in the drop-down lists depend on the TrueType fonts installed on your system.
5 Click OK when you are done.
[Figure callouts: “Original character images” and “OmniPage Pro’s interpretation of the images”]
5 Double-click a character you want to train. Or select it and click Specify. Most characters do not need to be trained. Look for uncommon characters such as the copyright symbol ©. The Specify Character dialog box shows how the selected character appeared in the original page image.
Training files are saved in the GDWD folder in your installation folder. You can select them in the Accuracy section of the Options dialog box.
To edit a training file:
1 Choose Edit Training File... in the Tools menu. A dialog box appears listing all your training files.
2 Double-click the training file you want to edit, or select it and click Edit. The Train Character dialog box displays characters in the selected file.
3 Edit the characters as desired.
Creating User Dictionaries
A user dictionary is used when you perform OCR and check for errors afterward. You can select a user dictionary in the Language section of the Options dialog box.
To customize a user dictionary:
1 Choose Edit User Dictionary... in the Tools menu. A dialog box lists all user dictionary files.
[Figure callouts: Microsoft Word's user dictionary, which you can use with OmniPage Pro; OmniPage Pro's default user dictionary]
2 Do one of the following:
Saving Settings Files
You can save OmniPage Pro settings to a file. A settings file is useful for quickly loading particular settings that you need for certain documents. The settings you select in OmniPage Pro can greatly affect OCR results. For help in selecting settings for different kinds of documents, see “Settings Guidelines” on page 54.
To save settings to a file:
1 Choose Options... in the Tools menu.
2 Select the desired settings in the Options dialog box.
To load a settings file:
1 Choose Options... in the Tools menu to open the Options dialog box.
2 Click Load Settings... to open the Load Settings dialog box.
3 Select the folder location of the settings file you want to load.
4 Select the name of the settings file you want to load and click OK. The settings change according to the selected file.
5 Click OK to close the Options dialog box.
Scheduling OCR
Scheduling Individual Documents
You can schedule individual documents from different folders. Scheduled documents are recognized at the specified time and then saved in the designated output folder.
To schedule individual documents:
1 Choose Schedule OCR... in the Process menu. The Schedule OCR dialog box appears.
[Figure callouts: Click Add... to add documents to the processing queue. All scheduled documents are displayed in this processing queue.]
5 Select the time that you want OmniPage Pro to process the scheduled documents. Select Finish now if you want OmniPage Pro to process all scheduled documents as soon as you close the dialog box.
6 Click OK in the Schedule OCR dialog box to save your settings as specified. All scheduled files are processed, in order, at the scheduled time.
Scheduling Documents from an Input Folder
You can set up OmniPage Pro to automatically schedule documents from a specified input folder.
2 Click the Options... button to open the Schedule OCR Options dialog box.
[Figure callouts: The selected output options are used for all newly scheduled documents. Select this to schedule documents in your scanner's ADF. Select this to automatically schedule documents in the specified folder.]
3 Select Auto add new jobs from folder and select the desired input folder.
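As a rough illustration of what automatically adding new jobs from a folder involves, the Python sketch below scans an input folder and queues any file it has not seen before. It is not how OmniPage Pro is implemented. The function name add_new_jobs and the queue and seen arguments are hypothetical.

```python
from pathlib import Path


def add_new_jobs(input_folder: Path, queue: list, seen: set) -> list:
    """Queue files in input_folder that were not queued before; return the new ones.

    Files are visited in sorted name order so the queue order is predictable.
    """
    added = []
    for path in sorted(input_folder.iterdir()):
        if path.is_file() and path not in seen:
            seen.add(path)      # remember the file so it is queued only once
            queue.append(path)  # new jobs go to the end of the processing queue
            added.append(path)
    return added
```

Calling the function repeatedly, for example on a timer, picks up only the files that appeared since the previous call.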
Modifying Output Options for Documents
All newly scheduled documents have the same default output folder and file format assigned to them. The default output file name uses the original file name and the extension of the output file format. You can modify all of these output options for any scheduled document. Click the Options... button in the Schedule OCR dialog box to change the default options used for all newly scheduled documents.
3 Select the desired options for the document.
4 Click OK to accept the selected options. The Schedule OCR dialog box reappears.
5 Click OK to close the Schedule OCR dialog box.
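The default output file name rule described under Modifying Output Options for Documents (original file name plus the extension of the output file format) can be sketched in Python as follows. The function name and the sample path are hypothetical, and the sketch assumes Windows-style paths.

```python
from pathlib import PureWindowsPath


def default_output_name(input_file: str, output_extension: str) -> str:
    """Combine the original file name (without its extension) with the output format's extension."""
    stem = PureWindowsPath(input_file).stem  # "letter" for r"C:\scans\letter.tif"
    return f"{stem}.{output_extension.lstrip('.')}"
```

For example, default_output_name(r"C:\scans\letter.tif", "doc") returns "letter.doc".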
Chapter 6 Technical Information This chapter provides troubleshooting and other technical information about using OmniPage Pro. Please also read the Release Notes and Scanner Setup Notes that came in your OmniPage Pro package. These contain the latest information on OmniPage Pro and its supported scanners.
General Troubleshooting Solutions
Although OmniPage Pro is designed to be easy to use, problems sometimes occur. Many of the onscreen error messages contain self-explanatory descriptions of what to do: check connections, close other applications to free up memory, and so on. Sometimes that is all the troubleshooting help you need. Please see your Windows documentation for information on optimizing your system and application performance.
Testing OmniPage Pro
Restarting Windows 95 in safe mode or Windows NT in VGA mode allows you to test OmniPage Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if OmniPage Pro has stopped running altogether. See Windows online help for more information. Your scanner will not run with OmniPage Pro in safe mode or VGA mode, so do not test scanner problems in this configuration.
Low Memory Problems
OmniPage Pro may run poorly under low memory conditions. Signs of low memory include various error messages, slow operation, and frequent hard drive access. Try these solutions for low memory conditions:
• Restart your computer.
• Close other open applications to free up memory.
• Close unnecessary OmniPage Pro windows.
• Defragment your hard disk to free up contiguous blocks of disk space. See Windows online help for instructions.
Using Visioneer Scanners with OmniPage Pro
During installation, OmniPage Pro automatically integrates with your Visioneer PaperPort software. However, you cannot scan directly into OmniPage Pro if you use a Visioneer scanner or if your scanner is set up to work with PaperPort software (such as the HP ScanJet 5s). Instead, scan pages into PaperPort and then drag the page images onto the OmniPage Pro icon at the bottom of the PaperPort Desktop.
Supported File Formats OmniPage Pro can save recognized text to these file formats: Ami Professional FrameMaker Text Only 2.0, 3.0, 3.1 ANSI HTML Ventura Publisher (MS Word) ANSI Standard Lotus 123 Windows Write 3.x ANSI Stripped Microsoft PowerPoint Word for DOS 5.0, 5.5 ( ASCII UWI) Microsoft Publisher Word for Windows 2.0, 6.0, 7.0, 97 ASCII Standard OmniPage Document ( ASCII Stripped PHW) PageMaker (MS Word) Wordpad WordPerfect 5.0, 5.1, 6.0, 6.
Scanner Setup Issues
This section contains information on scanner setup and solutions for scanning problems you may encounter. For more detailed scanner information, please read the Scanner Setup Notes included in the OmniPage Pro package.
Scanner Drivers Supplied by Caere
OmniPage Pro is shipped with special scanner drivers that allow it to communicate with supported scanners. These scanner driver files are installed on your computer when you install the Caere Scan Manager. These drivers often work in conjunction with the drivers from your scanner manufacturer. In order to use your scanner with OmniPage Pro, you must select the appropriate scanner in the Caere Scan Manager.
Missing Scan Image Command
The Scan Image command does not appear in the Image button’s drop-down list in the following cases:
• You did not install the Caere Scan Manager or select an appropriate scanner. See “Setting Up Your Scanner with OmniPage Pro” on page 16 for instructions.
• Your scanner is not connected to your computer or is not functioning properly. See “Scanner Setup Issues” on page 91.
Scanner Not Listed in Supported Scanners List Box
Try these solutions if your scanner is not listed in the Scan Manager Supported Scanners list box:
• Check Caere Corporation’s web site (www.caere.com) for Scan Manager updates.
• Select TWAIN scanner as your current scanner in the Supported Scanners list box.
Scanning Tips
OCR results will be poor if an image is not scanned properly.
OCR Problems
This section contains information and solutions for possible OCR problems. Topics in this section include:
• System Crash During OCR
• Text Does Not Get Recognized Properly
• Problems With Fax Recognition
System Crash During OCR
Try these solutions if a crash occurs during OCR or if processing takes a very long time:
• Resolve low memory problems. See “Low Memory Problems” on page 88 for more information.
• Resolve low disk space problems.
Text Does Not Get Recognized Properly
Try these solutions if any part of the original document is not converted to text properly during OCR:
• Look at the original page image and make sure that all text areas are enclosed by text zones. If an area is not enclosed by a zone, it is ignored during OCR. See “Creating Zones for OCR” on page 26 for more information.
• Make sure text zones are identified correctly. Alphanumeric text zones are marked by an A. Graphic zones are marked by a G.
• Ask senders to select Fine or Best mode when they send you a fax. This produces a resolution of 200x200 dpi.
• Ask senders to transmit files directly to your computer via fax modem if you both have one. You can save fax images as image files and then load them into OmniPage Pro. See “Supported File Formats” on page 89 for more information.
• Ask senders to use clean, original documents if possible.
Uninstalling the Software
To uninstall the Caere Scan Manager:
1 Close OmniPage Pro.
2 Click Start in the Windows taskbar and choose Settings, then Control Panel, then Add/Remove Programs.
3 Select Caere Scan Manager 3.0 and click Add/Remove.
4 Click OK to confirm that you want to remove the Caere Scan Manager.
5 Restart your computer.
Some icons and program files may remain on your system if they have been renamed, modified, or moved to different locations.
Glossary Terms 3D OCR® A technology developed by Caere that uses grayscale information to increase accuracy when recognizing scanned text characters. ADF See automatic document feeder. AnyPage A technology developed and licensed by Caere that improves the combined performance of grayscale scanners and OmniPage Pro. AnyPage uses the quality of grayscale images to improve the recognition of scanned pages. It is especially useful for text printed on shaded backgrounds.
frame A formatting box containing text or graphics that is used to design page layout. For example, columns in a document may be contained within a separate frame.
HP AccuPage® A technology developed and licensed by Hewlett-Packard that improves the combined performance of HP scanners and OmniPage Pro.
reject character The character that represents unrecognizable characters in a recognized document. A tilde (~) is the default reject character. For example, if OmniPage Pro could not recognize the J in REJECT, and ~ is the reject character, the string RE~ECT would appear in your document.
text viewer The area on the OmniPage Pro desktop that displays recognized text and any graphics.
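To illustrate the reject character entry above, the short Python sketch below finds where reject characters occur in a piece of recognized text, so those spots can be reviewed. It assumes the default tilde (~) unless told otherwise. The function name is hypothetical and is not part of OmniPage Pro.

```python
def reject_positions(text: str, reject_char: str = "~") -> list:
    """Return the zero-based index of every reject character in the recognized text."""
    return [i for i, ch in enumerate(text) if ch == reject_char]
```

For the glossary example, reject_positions("RE~ECT") returns [2], the position where the J could not be recognized.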