User’s Guide
LEGAL NOTICES Copyright © 2009 Nuance Communications, Inc. All rights reserved. No part of this publication may be transmitted, transcribed, reproduced, stored in any retrieval system or translated into any language or computer language in any form or by any means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise, without prior written consent from Nuance Communications, Inc., 1 Wayside Road, Burlington, Massachusetts 01803-4609.
CONTENTS
WELCOME
  New features in OmniPage 17
INSTALLATION AND SETUP
  System requirements
  Installing OmniPage
  Setting up your scanner with OmniPage
  How to start the program
  Registering your software
  Activating OmniPage
  Uninstalling the software
USING OMNIPAGE
  OmniPage Documents
  The OmniPage Desktop and Views
PROCESSING DOCUMENTS
  Processing methods
  Defining the source of page images
  Describing the layout of the document
  Preprocessing Images
  Zones and backgrounds
PROOFING AND EDITING
  User dictionaries
  Languages
  Training
  Text and image editing
  On-the-fly editing
  Marking and redacting
  Reading text aloud
  Creating and editing forms
SAVING AND EXPORTING
  Saving and Exporting
  Saving original images
  Saving recognition results
  Sending pages by mail
  Sending to Kindle
  Other export targets
WORKFLOWS
  Workflow Assistant
  Batch Manager
  Creating new jobs
  Watched folders
  Watched mailboxes
  Barcode processing
  File-it Assistant
TECHNICAL
Welcome Welcome to this OmniPage® 17 text recognition program, and thank you for choosing our software! The following documentation has been provided to help you get started and give you an overview of the program. This User’s Guide This guide introduces you to using OmniPage 17. It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information.
How-to-Guides The How-to-Guides can be accessed from the Help menu. They are a series of mini-guides that help you get started easily by providing concise overviews of key program areas, such as getting input, image improvement, zoning, recognition, editing, proofreading, new features, and the like. Electronic Help OmniPage Help contains information on features, settings, and procedures. It also has a comprehensive glossary, with its own alphabetical index and a table of contents.
Tech Notes
The web site at www.nuance.com contains Tech Notes on commonly reported issues using OmniPage 17. Web pages may also offer assistance on the installation process and troubleshooting.
New features in OmniPage 17
If you are upgrading from version 16, you benefit from the following innovations. Click the links for more information.
not only fast file loading but also 'one-click' total processing: load > recognize > save. See “Input via Easy Loader” on page 36. • Expanded ECM support: New links are available to Hummingbird from OpenText and iManage from Interwoven. When using SharePoint, the server, login and password information must be provided only once per session, and is offered in each subsequent session.
• Other improvements: Advances to image pre-processing provide better layout retention and overall accuracy – particularly in XPS files and document-to-document conversions. HD photo (JPEG XR) image loading is now supported. Integration with Microsoft Word, Excel and PowerPoint is enhanced. Linearized PDF files can be created, so they are optimized for faster web viewing. Form layout description is now available in Quick Convert View.
• Customizable shortcut menus in Windows Explorer: send image files or PDFs directly to major Windows programs, process them with your own workflows, or use the Convert Now Wizard for easy conversion control. • General improvements: these include faster processing, better quality output page layout (font matching, table detection, etc.); and a new, intuitive Workflow Assistant. Key features unique to OmniPage Professional.
Installation and setup
This chapter provides information on installing and starting OmniPage.
System requirements
The minimum requirements to install and run OmniPage 17 are:
• A computer with an Intel® Pentium® III processor or equivalent. Dual-core or Quad-core support recommended.
• Windows® XP 32-bit (from Service Pack 3) with a 400 MHz processor, Windows® Vista™ 32-bit (SP1) or 64-bit (SP1) with a 1 GHz processor, or Windows 7.
• 256MB of memory (RAM), 1GB recommended for advanced performance.
• A CD-ROM drive for installation or web access suitable for download. • A Windows compatible pointing device. • 2 megapixel digital camera or higher for digital camera text capture. See Help for details. • A compatible scanner with its own scanner driver software, if you plan to scan documents. See the Scanner Guide at Nuance’s web site (www.nuance.com) for a list of supported scanners.
• If you own a previous version of OmniPage, or if you are upgrading from demonstration software or an OmniPage Special Edition, you must uninstall that product first. To install OmniPage: 1. Download the program file and choose Run when the download is completed, or insert the OmniPage CD-ROM in your CD-ROM drive. The installation program should start automatically. If it does not start, locate your CD-ROM drive in Windows Explorer and double-click the Autorun.exe program at the top-level of the CD-ROM.
Setting up your scanner with OmniPage All files needed for scanner setup and support are copied automatically during the program’s installation, but no scanner setup occurs at installation time. Before using OmniPage 17 for scanning, your scanner should be installed with its own scanner driver software and tested for correct functionality. Scanner driver software is not included with OmniPage. Scanner setup is done through the Scanner Setup Wizard. You can start this yourself, as described below.
• The wizard reports whether the chosen scanner model already has settings in the scanner database. If it does, you do not need to test it. If it does not, you should test it. Click on Next. • If you chose not to test, click Finish. If you chose testing, click Next to have the scanner connection tested. If the connection is in order, you see a menu of further tests. Choose which testing steps you want to run. The Basic test scan is recommended.
To change the scanner settings at a later time, or to set up or remove a scanner, reopen the Scanner Setup Wizard from the Windows Start menu or from the Scanner panel of the Options dialog box. To test and repair an improperly functioning scanner, open the wizard and select ‘Test the current scanner or digital camera’ in the second panel, then work through the procedure described above, using any advice received from Technical Support.
On opening, OmniPage’s title screen is displayed and then a view selection panel. OmniPage has three basic view types. For details, see The OmniPage Desktop and Views in the next chapter. It provides an introduction to the program’s main working areas. There are several ways of running the program with a limited interface: • Use the Batch Manager program. Click Start in the Windows taskbar and choose All Programs > Nuance > OmniPage 17 > OmniPage Batch Manager. See the Workflows chapter.
installation, you will be periodically invited to register later. You can go to www.nuance.com to register online. Click on Support and from the main support screen choose Register in the left-hand column. For a statement on the use of your registration data, please see Nuance’s Privacy Policy. Activating OmniPage You will be invited to activate the product at the end of installation. Please ensure that web access is available.
To uninstall or reinstall OmniPage: • Close OmniPage. • Click Start in the Windows taskbar and choose the Control Panel and then Uninstall a program (in earlier Windows versions: Add/Remove Programs). • Select OmniPage and click Uninstall (in earlier Windows versions: Remove). • Click Yes in the dialog box that appears to confirm removal. • Select Yes to restart your computer immediately, or No if you plan to restart later. • Follow instructions until the process is finished.
Using OmniPage OmniPage 17 uses optical character recognition (OCR) technology to transform text from scanned pages or image files into editable text for use in your favorite computer applications. In addition to text recognition, OmniPage can retain the following elements and attributes of a document through the OCR process.
more portable. To embed a file, open the relevant dialog box from the Tools menu, select the desired file and click Embed. Use the Extract button to get a local copy of an embedded file inside an OPD you have received. When you open an OmniPage Document, its settings are applied, replacing those existing in the program. The OmniPage Desktop and Views OmniPage comes with three different views to suit your task. • Classic View - This view has a similar look and feel to previous versions of OmniPage.
position, double-click its title bar. To dock it to a new location, drag it to an edge. A purple rectangle shows the docking position; release the mouse button to dock it. To move a floating panel without docking it, hold down CTRL while dragging. To see all possible docking positions one after the other (tiles and tabs), drag the panel over the OmniPage main window, holding down the left mouse button and pressing the spacebar repeatedly.
OmniPage toolbox: This Toolbox lets you drive the processing. Thumbnails panel: This displays page thumbnails. Document Manager: This provides an overview of your document with a table. Each row represents one page. Columns present statistical or status information for each page, and (where appropriate) document totals. Page Image: This displays the image of the current page with its zones. When a page is displayed, the Image toolbar is available.
Suggested scenarios: Maximizing workspace (single screen) Load a document. Open the panels you want to use. Grab them by their captions one by one, and drag them so that they dock behind the active one as tabs. You can also dock Help to avoid handling two separate windows. Working with recognition results (single screen) Load a document and have it recognized. Close all panels except the Document Manager and the Text Editor.
Verifying (dual-screen) Place the Page Image on one screen and the Text Editor on the other. This gives you more space for editing and proofing. The Page Image is always available for verifying recognition and for performing on-the-fly zoning and editing. The scenarios presented above are only examples to give you an idea of what you can do in Flexible View. Quick Convert View Use the Quick Convert View for fast recognition and saving.
The Easy Loader is by default on a tab that toggles with the Quick Convert Options panel. A Help panel can be added, but further panels are not available in this view. You can change tabs to separate panels and minimize them, as in other views. After loading a file, you should convert it before loading the next file. When an image conversion is finished, you do not need to explicitly close the image; just load a new file. The Easy Loader in Quick View provides an additional feature: ‘one-click’ processing.
used. The Help topic on display remains unchanged regardless of view. Easy Loader retains its file location regardless of view and the Workflow Status continues to display information on the last workflow run. On program restart, Help displays the Welcome topic, Easy Loader the default folder location and Workflow status is empty. The Toolbars The program has eleven main toolbars. Use the View menu to show, hide or customize them.
The Form toolbars and the Mark Text toolbar (for details see Chapter 4) appear only in OmniPage Professional 17. Basic Processing Steps There are three ways of handling documents: with automatic, manual or workflow processing. The basic steps for all processing methods are broadly the same: 1. Bring a set of images into OmniPage. You can scan a paper document with or without an Automatic Document Feeder (ADF) or load one or more image files. 2. Perform OCR to generate editable text.
How to use OmniPage with PaperPort The PaperPort® program is a paper management software product from Nuance. It lets you link pages with suitable applications. Pages can contain pictures, text or both. If PaperPort exists on a computer with OmniPage, its OCR services become available and amplify the power of PaperPort. You can choose an OCR program by right-clicking on a text application’s PaperPort link, selecting Preferences and then selecting OmniPage 17 as the OCR package.
Processing documents This tutorial chapter describes different ways you can process a document and also provides information on key parts of this processing. Processing methods Using OmniPage, you can choose from the following processing methods: Automatic A fast and easy way to process documents is to let OmniPage do it automatically for you. Select settings in the Options dialog box and in the OmniPage Toolbox drop-down lists and then click Start.
1. Use button one to get a set of images. 2. Manually zone pages where you want to process only part of the page or if you want to give precise zoning instructions. Use ignore backgrounds or zones to exclude areas from processing. Use process backgrounds or zones to specify areas to be autozoned. 3. Use button two to have the pages recognized. 4. Do proofing and editing as desired. 5. Use button three to save your results.
Its shortcut menu lists your workflows. Click a workflow to launch OmniPage and have it run. Let the Workflow Assistant guide you in creating new workflows. It provides a choice of steps and the settings they need. Click Next after each step to add another one. You can use the Assistant just to get more guidance when doing automatic processing. See “Workflow Assistant” in Chapter 6.
Nuance OCR tab, or in an OmniPage toolbar open the door to OCR facilities. How to set up Direct OCR Start the application you want connected to OmniPage. Start OmniPage, open the Options dialog box at the General panel and select Enable Direct OCR. In the target application, use the Acquire Text Settings button in the OmniPage toolbar (in Office 2007 go to the Nuance OCR tab). Select options in the following panels: • OCR: languages, dictionaries, layout, fonts.
3. Use the OmniPage toolbar button Acquire Text or the same item in the File menu (use the Nuance OCR tab in Office 2007) to acquire images from the specified source. 4. If you selected Draw zones automatically in the Direct OCR panel of the Options dialog box, under Acquire Text Settings, recognition proceeds immediately. 5. If Draw zones automatically is not selected, each page image will be presented to you, allowing you to draw zones manually.
button or use the Process menu. The lower part of the dialog box provides advanced settings, and can be shown or hidden. The minimum width or height for an image file is 16 by 16 pixels; the maximum is 8400 pixels (71cm or 28 inches at the resolution 201 to 600 dpi). See Help for pixel limits. You can govern how PDF files are opened under Tools / Options / Process: open with the text layer or as image, import tag information to assist layout retention and whether to use PDF fonts or the mapped system fonts.
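The pixel limits above translate into physical page size through the scan resolution: length in inches is the pixel count divided by the dpi. The following is a minimal sketch of that arithmetic, not part of OmniPage. The function names are hypothetical, and 300 dpi is chosen only as an example value inside the range the guide quotes.

```python
# Illustrative arithmetic only; these helpers are not an OmniPage API.
MIN_SIDE_PX = 16     # minimum width or height stated in this guide
MAX_SIDE_PX = 8400   # maximum width or height stated in this guide


def pixels_to_inches(pixels: int, dpi: int) -> float:
    """Physical length in inches of `pixels` scanned at `dpi` dots per inch."""
    return pixels / dpi


def pixels_to_cm(pixels: int, dpi: int) -> float:
    """Physical length in centimeters (1 inch = 2.54 cm)."""
    return pixels_to_inches(pixels, dpi) * 2.54


def within_limits(width_px: int, height_px: int) -> bool:
    """True if both sides fall inside the 16 to 8400 pixel range."""
    return all(MIN_SIDE_PX <= side <= MAX_SIDE_PX for side in (width_px, height_px))


if __name__ == "__main__":
    # 8400 pixels at an assumed 300 dpi
    print(pixels_to_inches(8400, 300), "inches")
    print(round(pixels_to_cm(8400, 300), 2), "cm")
    # A US Letter page scanned at 300 dpi is 2550 x 3300 pixels
    print(within_limits(2550, 3300))
```

At the assumed 300 dpi, 8400 pixels works out to 28 inches (about 71 cm), which matches the figures quoted above. At a higher resolution the same pixel limit covers a proportionally smaller page.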
Input via Easy Loader This provides the Windows Explorer interface in an OmniPage window. In Flexible and Quick Views it appears by default. Choose Easy Loader in the Window menu to add it to Classic View or to show or hide it in other views. It lets you browse your whole file system and efficiently select files to be loaded into OmniPage. Choose Process / Easy Loader / Folder to view files as Lists, Thumbnails, Tiles, Icons (arranged as desired) or Details, as you do in Explorer.
Easy Loader is available as a panel in Quick Convert View. The Process menu has two commands unique to Quick View.
• Get and Convert offers 'one-button' processing: files are loaded, passed through recognition and saved to files using existing settings. Multiple file selection is allowed in Quick View only in this case; the result is one output document for each input file. Before starting, you should choose ‘Same as the source file name’ under Output file name.
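The naming rule described for Get and Convert (one output document per input file, named after its source) can be pictured with a short sketch. This is only an illustration of the rule, not OmniPage code. The function name `output_names` and the choice of `.rtf` as the output type are assumptions made for the example.

```python
from pathlib import PurePath


def output_names(sources, extension):
    """Return one output file name per source, keeping the source's base name.

    `extension` is the suffix of the chosen output type, for example ".rtf".
    """
    return [PurePath(source).with_suffix(extension).name for source in sources]


if __name__ == "__main__":
    selected = ["scans/invoice.tif", "scans/letter.pdf"]
    for source, target in zip(selected, output_names(selected, ".rtf")):
        print(source, "->", target)
```

Because each output inherits the base name of its input, two selected files with the same base name but different types would map to the same output name, so it is worth checking a multiple selection for such clashes first.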
Input from scanner
You must have a functioning, supported scanner correctly installed with OmniPage 17. You have a choice of scanning modes. In making your choice, there are two main considerations:
• Which type of output do you want in your export document?
• Which mode will yield best OCR accuracy?
Scan black and white
Select this to scan in black-and-white. Black-and-white images can be scanned and handled more quickly than others and occupy less disk space.
the page. If your scanning results are still not satisfactory, open the scanned image in the Image Enhancement window to edit it using a range of different tools. Scanning with an ADF The best way to scan multi-page documents is with an Automatic Document Feeder (ADF). Simply load pages in the correct order into the ADF. You can scan double-sided documents with an ADF. A duplex scanner will manage this automatically.
Document-to-document conversion In OmniPage Professional 17 you can open not only image files, but also documents created in wordprocessing and similar applications. Supported file types include .doc, .xls, .ppt, .rtf, .wpd and others. Click the Load Files button in the OmniPage Toolbox or select the Load Files command under Get Page, in the File menu. In the Load Files dialog box, choose Documents. When you are finished, you can choose from a wide variety of document file types for saving.
Multiple columns, no table
Choose this if some of your pages contain text in columns and you want this decolumnized or kept in separate columns, similar to the original layout.
Single column with table
Choose this if your page contains only one column of text and a table.
Spreadsheet
Choose this if your whole page consists of a table which you want to export to a spreadsheet program, or have treated as a single table.
Template Choose a zone template file if you wish to have its background value, zones and properties applied to all acquired pages from now on. The template zones are also applied to the current page, replacing any existing zones. If auto-zoning yielded unexpected recognition results, use manual processing to rezone individual pages and re-recognize them. Preprocessing Images To improve OCR results, you can enhance your images before zoning and recognition using the Image Enhancement tools.
appearance. If characters are thick and touching, lighten the brightness. If characters are thin and broken, darken it. Use the OCR Brightness tool to optimize the image.
[Figure: brightness scale with samples labeled Unsuitable, Tolerable, Good, Best, Good, Tolerable, Unsuitable]
Image Enhancement Tools
The Image Enhancement tools can also be used to edit images to save and use them as image files. Note that some of these tools work on the Primary image, others on the one used for OCR (OCR image).
The following tools are accessible on the toolbar; their usage is detailed as follows:
P - affects Primary image only.
O - affects OCR image only.
PO - can be applied to either the Primary or OCR image (or both).
P+O - a single action is applied to both the Primary and OCR image.
P/O - affects both images.
WH - applies to whole images only.
AR - can be applied to selected image areas.
Synchronize Views - click this tool to zoom and scroll the inactive view to the same zoom value and scroll position as the active view. To make the inactive view dynamically follow the focus of the active one, click View then choose the Keep Synchronized command. PO. WH. The following SET tools allow you to modify image contents: Brightness and Contrast - click this tool to adjust the brightness and contrast of your primary image or a selected part of it.
Resolution - use this tool to decrease the resolution of your primary image by a percentage. Note that you cannot set a resolution higher than that of the original image. P. WH.
Deskew - sometimes pages are scanned crookedly. To straighten the lines of text manually, use the Deskew tool. (Auto-deskew is also available in the Process panel of Options.) P+O. WH.
3D Deskew - use this tool to remove perspective distortion from digital camera images.
but they are not done until you click the Apply button next to the History list. Modifications not added to the History by clicking the Add button will not be applied. Any time you want to see what output a certain step resulted in, double-click it in the History list. To discard changes you have made with a given tool before applying them, select the step in the list, then click the Reset button.
Apply enhancement template - an already saved enhancement template will be applied automatically to the image while being processed by the workflow. Apply enhancement template and display - the workflow will apply the selected image enhancement template, and will also display the image so that you can make further edits to it. Zones and backgrounds Zones define areas on the page to be processed or ignored. Zones are rectangular or irregular, with vertical and horizontal sides.
Process background tool (shown) to set a process background. Draw ignore zones over parts of the page you do not need. After recognition the page will return with an ignore background and new zones round all elements found on the background. Auto-zoning vertical text If you set Japanese, Korean or Chinese as the recognition language, auto-zoning will find text blocks and detect the text direction. Vertical Asian text appears horizontally in the Text Editor, but can be exported as vertical - see Chapter 5.
properties. Select multiple zones with Shift+clicks to change their properties in one move. The Image toolbar provides zone drawing tools, one for each type. Process zone Use this to draw a process zone, to define a page area where auto-zoning will run. After recognition, this zone will be replaced by one or more zones with automatically determined zone types. Ignore zone Use this to draw an ignore zone, to define a page area you do not want transferred to the Text Editor.
Table zone Use this to have the zone contents treated as a table. Table grids can be automatically detected, or placed manually. Table zones should be rectangular. Vertical texts in tables cannot be zoned manually – they can be auto-detected in gridded tables. Graphic zone Use this to enclose a picture, diagram, drawing, signature or anything you want transferred to the Text Editor as an embedded image, and not as recognized text.
To resize a zone, select it by clicking in it, move the cursor to a side or corner, catch a handle and move it to the desired location. It cannot overlap another zone. To make an irregular zone by addition, draw a partially overlapping zone of the same type. To join two zones of the same type, draw an overlapping zone of the same type (drawn zones on the left, resulting zone on the right). To make an irregular zone by subtraction, draw an overlapping zone of the same type as the background.
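The zone operations above all depend on whether two rectangles overlap. As a rough geometric sketch of that idea (an illustration only, not how OmniPage stores or merges zones), rectangles can be modeled by their edges. The `Rect` class and helper names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """A rectangular area in pixel coordinates; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int

    def area(self) -> int:
        return max(0, self.right - self.left) * max(0, self.bottom - self.top)


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the shared rectangle of a and b, or None if they do not overlap."""
    shared = Rect(max(a.left, b.left), max(a.top, b.top),
                  min(a.right, b.right), min(a.bottom, b.bottom))
    if shared.left < shared.right and shared.top < shared.bottom:
        return shared
    return None


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share any area (touching edges do not count)."""
    return intersection(a, b) is not None


def joined_area(a: Rect, b: Rect) -> int:
    """Area covered by the two rectangles together, counting the overlap once."""
    shared = intersection(a, b)
    return a.area() + b.area() - (shared.area() if shared else 0)


if __name__ == "__main__":
    first = Rect(0, 0, 100, 50)
    second = Rect(80, 20, 150, 90)
    print(overlaps(first, second))     # the rectangles share a 20 x 30 corner
    print(joined_area(first, second))  # area of the resulting irregular shape
```

In this model, drawing a partially overlapping zone of the same type corresponds to the case where `overlaps` is true, and the resulting irregular zone covers `joined_area` pixels.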
Table grids in the image After automatic processing you may see table zones placed on a page. They are denoted with a table zone icon in the top left corner of the zone. To change a rectangular zone to or from a table zone, use its shortcut menu. You can also draw table type zones, but they must remain rectangular. You draw or move table dividers to determine where gridlines will appear when the table is placed in the Text Editor.
With manual processing the template zones in the first two cases can be viewed and modified before recognition. With automatic processing the template zones can be viewed and modified only after recognition. With workflow processing, use the zone images step. This combines two steps: load templates and manual zoning. To use a zone template, click the Add button in the appropriate panel of the Workflow Assistant, and select the zone template file to use.
How to unload a template Select a non-template setting in the Layout Description drop-down list. The template zones are not removed from the current or existing pages, but template zones will no longer be used for future processing. You can also open the Zone Template Files dialog box, select [none] and click the Set As Current button. In this case, the layout description setting returns to Automatic.
Proofing and editing Recognition results are placed in the Text Editor. These can be recognized texts, tables, forms and embedded graphics. This WYSIWYG (What You See Is What You Get) editor is detailed in this chapter. Asian text handling is in some respects different from other languages. See “Asian language recognition” on page 61. The editor display and formatting levels The Text Editor displays recognized texts and can mark words that were suspected during recognition with red, wavy underlines.
Plain Text This displays plain decolumnized left-aligned text in a single font and font size, with the same line breaks as in the original document. Formatted Text This displays decolumnized text with font and paragraph styling. True Page True Page® tries to conserve as much of the formatting of the original document as possible. Character and paragraph styling is retained. Reading order can be displayed by arrows.
Change All to implement the change and move to the next suspect word. Click Add to add the changed word to the current user dictionary and move to the next suspect word. 5. Color markers are removed from words in the Text Editor as they are proofread. You can switch to the Text Editor during proofing to make corrections there. Use the Resume button to restart proofing. Click Page Ready to skip to the next page and Document Ready or Close to stop proofreading before the end of the document is reached. 6.
To turn the Verifier on, click the Verifier tool or press F9. To turn it off, click the Verifier tool again, press F9 again, or press Esc. A full list of verifier keyboard shortcuts is available in Help. The Character Map The Character Map is a dockable tool giving you aid in proofing. It is used for essentially two purposes: • to insert characters during proofing and editing that are not or not easily accessible from your keyboard. In this respect, it is very similar to the system Character Map.
User dictionaries The program has built-in dictionaries for many languages. These assist during recognition and may offer suggestions during proofing. They can be supplemented by user dictionaries. You can save any number of user dictionaries, but only one can be loaded at a time. A dictionary called Custom is the default user dictionary for Microsoft Word.
Languages The program can read over 120 languages with multiple alphabets: Latin, Greek, Cyrillic, Chinese, Japanese and Korean. See the full language list in the OCR panel of the Options dialog box. It shows which languages have dictionary support. A listing is also provided on the Nuance web site. In addition to user dictionaries, specialized dictionaries are available for certain professions (currently medical, legal and financial) for some languages.
in different orientations. The program can handle these; in the output they appear right-rotated. Beside the language list the option Verify language choices invokes automatic language detection that warns of differences between a detected language and the language setting. It works at page-level and identifies four categories: Japanese, Chinese, Korean and non-Asian. It cannot distinguish between Traditional and Simplified Chinese or between non-Asian languages.
under Options/OCR, a default font is automatically applied, typically Arial Unicode MS. Other Asian-capable fonts on your system can be chosen in the Text Editor. Editor support allows text viewing and verifying; Formatted Text is recommended as the formatting level. Large-scale editing and spell-checking are better done in the target application. Proofing, training and dictionary support are not available for Asian texts.
Manual training To do manual training, place the insertion point in front of the character you want to train, or select a group of characters (up to one word) and choose Train Character... from the Tools menu or the shortcut menu. You will see an enlarged view of the character(s) to be trained, along with the current OCR solution. Change this to the desired solution and click OK. The program takes this training and examines the rest of the page.
Saving training to file, loading, editing and unloading training files are all done in the Training Files dialog box. Unsaved training can be edited in the Edit Training dialog box; an asterisk is displayed in the title bar in place of a training file name. Save it in the Training Files dialog box. A training file can also be edited; its name appears in the title bar. If it has unsaved training added to it, an asterisk appears after its name.
Editing character attributes In all formatting levels except Plain Text, you can change the font type, size and attributes (bold, italic, underlined) for selected text. Editing paragraph attributes In all formatting levels except Plain Text, you can change the alignment of selected paragraphs and apply bulleting to paragraphs. Paragraph styles Paragraph styles are auto-detected during recognition. A list of styles is built up and presented in a selection box on the left of the Formatting toolbar.
Editing in True Page Page elements are contained in text boxes, table boxes and picture boxes. These usually correspond to text, table and graphic zones in the image. Click inside an element to see the box border; they have the same coloring as the corresponding zones. The Help topic True Page provides details on the operations summarized here. Frames have gray borders and enclose one or more boxes. They are placed when a visible border is detected in an image.
Two linked tools on the Image toolbar control on-the-fly zoning. One of these tools is always active whenever no recognition is in progress. Click this to activate on-the-fly editing. The red signal shows there are no stored zoning changes. Click this to turn on-the-fly editing off. Your zoning changes are stored; the on-the-fly tool displays a green signal to show there are stored changes. To activate these changes, do one of the following: Click the on-the-fly tool with a green signal.
copy, both the copy and the original remain open in OmniPage, ready to be saved. WARNING: If you redact the original document, you cannot retrieve the information you have blacked out. To find and redact text by searching, select Find and Mark Text from the Edit menu to display the Find, Replace and Mark Text dialog box. Search for text to be marked for redaction. Step through all occurrences and decide for each case whether to redact immediately or mark for redaction.
Previous line: Up arrow
Current sentence: Ctrl + Numpad 2
From insertion point to end of sentence: Ctrl + Numpad 6
From start of sentence to insertion point: Ctrl + Numpad 4
Current page: Ctrl + Numpad 3
From top of current page to insertion point: Ctrl + Home
From insertion point to end of current page: Ctrl + End
Previous, next or any page: Ctrl + PgUp, PgDown or navigation buttons
Typed characters: Each typed character is pronounced separately.
All speech systems will be installed with OmniPage 17 if you choose a complete installation. If you perform a custom installation, you can choose the languages you need. Creating and editing forms You can bring paper or static electronic forms (distributed mainly as PDF in an office environment) into OmniPage Professional 17, recognize them and edit their content, layout or both - in True Page.
Graphic: Use this tool to select areas of your form that are to be treated as graphics. Fill text: Click this tool to create fillable text fields. These are fields where you want people to enter text. Comb: Use this tool to create a text field consisting of boxes. This is typically used for information such as ZIP codes. Checkbox: Click this tool and draw Checkboxes - typically for Yes/No questions and marking one or more choices.
Editing Form object properties To edit a form object directly select it then right-click the given element to display its shortcut menu. You can edit the appearance or the properties of any form element here. Use the following commands: Form Object Appearance - use the tabs Borders, Shading and Shadow to design the look of your form elements in a similar way as you would do in a text-editing application.
Saving and exporting Once you have acquired at least one image for a document, you can export the image to file. Once you have recognized at least one page, you can export recognition results. After further recognition you can save a single page, selected pages or the whole document by saving to file, copying to Clipboard or sending to a mailing application. Saving as an OmniPage Document is always possible.
click the Export Results button to begin export. You can also perform exporting through the Process menu. Saving original images You can save original images to disk in a wide variety of file types with or without image enhancement (using the Image Enhancement Tools). 1. Choose Save to File in the Export Results drop-down list. In the dialog box that appears, select Image under Save as. 2. Choose a folder location and a file type. Type in a file name. 3.
Saving recognition results You can save recognized pages to disk in a wide variety of file types. 1. Choose Export Results... in the File menu, or click the Export Results button in the OmniPage Toolbox with Save to File selected in the drop-down list. 2. The Save to File dialog box appears. Select Text under Save as. 3. Select a folder location and a file type for your document. Select a page range, file options, naming options and a formatting level for the document.
The formatting levels are: Plain Text This exports plain decolumnized left-aligned text in a single font and font size. When exporting to Text or Unicode file types, graphics and tables are not supported. You can export plain text to nearly all file types and target applications; in these cases graphics, tables and bullets can be retained. Formatted Text This exports decolumnized text with font and paragraph styling, along with graphics and tables. This is available for nearly all file types.
When exporting to Microsoft Excel, 'Spreadsheet' is good for saving whole-page tables. Prefer 'Formatted Text' if your document contains smaller tables: each table will be placed on a separate worksheet, with non-table parts placed in an index worksheet that has hyperlinks to each relevant worksheet.
Selecting converter options Click the Options... button in a saving dialog box to have precise control over the export.
followed by all image converters. Checkmark the desired ones. Optionally specify sub-folder paths for each file type. You can save pages with different formatting levels or file options to the different file types, as defined in their simple converters. A few saving operations cannot be done with multiple converters. These are: Saving OmniPage Documents Use a workflow with two saving steps, or perform two separate saves.
including True Page. The PDF file can be viewed, searched and edited. PDF Searchable Image (formerly PDF Image on Text): The PDF file is viewable only and cannot be modified in a PDF editor. The original images are exported, but there is a linked text file behind each image, so the text can be searched. A found word is highlighted in the image.
PDF MRC Use this high compression technology for good quality and smaller file size. Available for color and grayscale PDF Images or PDF Searchable Images. Linearized PDF Choose this to create PDF files optimized for fast loading and display when embedded in web pages. Password protection In OmniPage Professional you can set a type and level of encryption and then define an Open password and/or a Permissions password for PDF files.
Sending pages by mail You can send page images or recognized pages as one or more files attached to a mail message if you have installed a MAPI-compliant mail application, such as Microsoft Outlook. To send pages by e-mail: • With automatic processing, select Send in Mail as the setting in the Export Results drop-down list on the OmniPage Toolbox. The Export Options dialog box appears as soon as the last available page in the document is recognized or proofed.
2. Choose Kindle Assistant in the Tools menu. 3. Type in a name for the new workflow. 4. Choose a document source: Scan, Load files or Load digital camera files. With file input, you will be prompted to choose input files when the workflow starts running. 5. Enter the e-mail address linked to your Kindle reader. 6. Provide a name for the output file. All recognition results enter a single file. 7.
Please note that at the moment (May 2009) this Kindle service is available from Amazon only in the United States of America. Therefore the Kindle Assistant appears only if English is set as the program interface language. Other export targets Turn recognized text into an audio wave file for later listening, using Nuance RealSpeak. A multiple converter is useful for this, allowing you to save the document to file and generate the wave file in one saving step.
Workflows A workflow contains a series of processing steps and their settings. It can be saved for repeated use whenever you have a task needing the same processing. Workflows usually begin with a scanning or loading step, but they can also start from the document currently open in OmniPage. After that, they do not have to conform to the traditional 1-2-3 processing pattern. Usually a workflow will include a recognition step, but this is not compulsory.
Sample workflows Sample workflows are provided with OmniPage 17 to offer you typical work processes. Choose one in the Workflow drop-down list at the left side of the OmniPage Toolbox. Click the Workflow Assistant button in the Standard toolbar to see its steps and settings.
Running workflows Here is how to run a sample workflow or one you have created: 1. If your workflow takes input from scanner, place your document in its ADF or its first page on the scanner bed. 2.
Document Ready button on the Toolbox. Any pages without zones will be auto-zoned. 8. The After Completion menu under Process / Workflows gives you three options to end a workflow. You can choose to close the document, close OmniPage, or shut down your computer. These settings are typically applied if the workflow runs unattended; in that case, remember to include a saving step. You can also run workflows from an OmniPage Agent icon on the Windows taskbar.
Workflow Assistant This allows you to create and modify workflows. The Job Wizard also uses this to create or modify workflows that jobs execute - see the next section. The Assistant offers one or more steps, each with a drop-down list. This left panel of the Workflow Assistant dialog box lets you build your workflow. This shows the steps you have chosen. This drop-down list shows the possible steps at any given workflow position. Use this to add a new step to your workflow.
At any moment in the process, the Assistant drop-down menu offers all steps that are logically possible at that point. In OmniPage Professional 17, additional steps are available: Extract Form Data and Mark Text. Creating workflows Select New Workflow... in the Workflow drop-down list, or from the Process menu. Or click the Workflow Assistant button in the Standard toolbar when no workflow is selected.
Modifying workflows Select the workflow you want to modify in the Workflow drop-down list and click the Workflow Assistant button in the Standard toolbar. Or choose Workflows... in the Tools menu, select the desired workflow and click Modify... The first panel of the Workflow Assistant appears with the workflow loaded. Click the icon in the workflow diagram that represents the step you want to modify. Click the downward pointing arrow under the icon to replace this step with another one.
workflow according to the job settings. Jobs are created in the Job Wizard. In OmniPage Professional 17 you have the following additional Batch Manager capabilities:
• Setting job timing and recurrence
• Folder watching for incoming image files
• E-mail inbox watching for incoming attachments (Outlook and Lotus Notes)
• E-mail notification of job completion to specified recipients
• Driving workflows with barcodes.
Normal job: Set starting time and specify or create the Workflow to be run. If you select ‘Do not start now’ use the Activate button in the Batch Manager to start it. Job types available in OmniPage Professional 17 only: Barcode cover page job: This is a special type of folder watching job (see below). It monitors a folder for incoming barcode pages, then processes subsequently incoming images with the workflow identified by the barcode. For details, see Barcode processing later in this chapter.
From the next panel onwards, you can construct your job (except for barcode cover page jobs) as you normally do with Workflows. Set your starting point (Fresh Start or Existing Workflows) and proceed as described in the Workflows topic. The Options dialog box in the Batch Manager is in the Tools menu. Its General panel has an option Enable OmniPage Agent on system tray at system startup. By default it is on. It must remain selected for jobs to run at their scheduled time.
Managing and running jobs This is done with the Batch Manager. It has two panels. The left panel lists each job, its next run, status and history. The status is one of the following:
Waiting: Scheduled but job start time is in the future.
Running: Processing is currently underway.
Watching: Watching is in progress but there is no processing.
Inactive: Created with timing instruction: Do not start now; or any deactivated jobs.
Expired: Scheduled job but start time is in the past.
Stop Job in the File menu stops a job with status Starting, Running, or Paused. Pause Job is available for jobs with status Running or Starting. To modify such a job’s timing instructions you must stop it. Resume Job lets the job continue from its state when it was paused. Delete Job in the Edit menu serves to delete the currently selected job. Only Inactive jobs can be deleted. Rename Job serves to modify the name of any job. Use the Edit menu to send a copy of a job’s status report to Clipboard.
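The rules above for which command applies to which job status can be summarized in a small lookup table. The following Python sketch is only an illustration of those rules, not part of OmniPage; the names JobStatus, ALLOWED and can_apply are hypothetical, and the rule that Resume Job applies only to paused jobs is an assumption inferred from the description.

```python
from enum import Enum


class JobStatus(Enum):
    """Job statuses named in this section of the guide."""
    WAITING = "Waiting"
    STARTING = "Starting"
    RUNNING = "Running"
    WATCHING = "Watching"
    PAUSED = "Paused"
    INACTIVE = "Inactive"
    EXPIRED = "Expired"


# Statuses accepted by each Batch Manager command, as described in the text.
ALLOWED = {
    "stop": {JobStatus.STARTING, JobStatus.RUNNING, JobStatus.PAUSED},
    "pause": {JobStatus.RUNNING, JobStatus.STARTING},
    "resume": {JobStatus.PAUSED},    # assumption: only paused jobs can resume
    "delete": {JobStatus.INACTIVE},  # only Inactive jobs can be deleted
    "rename": set(JobStatus),        # Rename Job works for any job
}


def can_apply(command: str, status: JobStatus) -> bool:
    """Return True if the command is available for a job with this status."""
    return status in ALLOWED[command]
```

For example, can_apply("delete", JobStatus.RUNNING) is False: a running job must first be stopped or deactivated before it can be deleted.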
Watched folders In OmniPage Professional 17, you can specify watched folders and e-mail inboxes (Outlook and Lotus Notes) as job input. These allow processing to be started automatically whenever image files are placed in pre-defined folders or arrive into inboxes as e-mail attachments. This is useful to have sets of files with predictable content arriving from remote locations processed automatically on arrival, even if no-one is in attendance.
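To make the idea of a watched folder concrete, here is a minimal Python sketch of detecting newly arrived image files by polling. It is not how OmniPage implements folder watching; the function name find_new_files, the IMAGE_SUFFIXES set and the polling approach are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical set of watched file types; not OmniPage's actual list.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".png", ".bmp", ".pdf"}


def find_new_files(folder, seen, include_subfolders=False):
    """Return image files in `folder` that are not yet in `seen`.

    Each returned file is added to `seen`, so a later call reports
    only files that arrived in the meantime.
    """
    pattern = "**/*" if include_subfolders else "*"
    new_files = []
    for path in sorted(Path(folder).glob(pattern)):
        is_image = path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        if is_image and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files
```

A watcher built on this sketch would call find_new_files at regular intervals during the watching period and hand each batch of new files to the workflow.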
repeatedly, once for each type. Add a checkmark to watch subfolders of the selected folder as well. When you reach the next panel of the Job Wizard, you set the timing instructions: a starting time and an end time for the watching to occur. You can specify recurrences, for instance to have the folder(s) watched only during your lunch hour (Start 12.15, End 13.
Barcode processing In OmniPage Professional 17, you can run workflows (sets of steps and their settings) using barcode cover pages that define which workflow should run. A barcode cover page identifies a workflow (with workflow identifier, workflow name and workflow steps) and contains information on workflow creation (name of the creator, date of creation, etc.). Note that barcode processing cannot be recurrent.
3. Select “Barcode cover page workflow” as Scanner button default action on the Scanner tab of Options. You can also set it to Prompt for workflow. In this case, a dialog box appears with the available choices: Scanning, Barcode cover page workflow, and all scanning workflows. All available pages will be processed by the specified workflow, or until a new barcode page is encountered. The result will be saved as specified by the workflow. For image input you must create a barcode cover page job.
4. The workflow will be completed at the specified end time of the job, or each time a new barcode cover page is detected. You can copy the barcode cover page image and the image files into the watched barcode folder yourself, or direct others to do this. You can also place just a barcode cover page image file in the watched folder, then have a network scanner make and send image files there. File-it Assistant The File-it Assistant lets you create scanning workflows for repeated document conversion tasks.
Use the workflow: 1. Place the printed barcode cover page on top of a document in your scanner. 2. Push the OmniPage-associated scanner button. The document will be converted using steps and settings from the referenced workflow and sent to the location you defined. It is possible to use barcode cover pages stored as image files to drive jobs from watched folders. Such jobs permit interactive steps like manual zoning and proofing that are not available via the File-it Assistant.
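The way barcode cover pages divide a stream of pages, with each batch processed by the workflow named on the cover page until the next cover page arrives, can be sketched in Python as follows. This is an illustration only, not OmniPage code; split_by_cover_pages and the read_workflow_id callback are hypothetical names, and skipping pages that precede the first cover page is an assumption.

```python
def split_by_cover_pages(pages, read_workflow_id):
    """Group pages into (workflow_id, [pages]) batches.

    `read_workflow_id(page)` is a hypothetical callback that returns the
    workflow identifier if the page is a barcode cover page, else None.
    """
    batches = []
    current = None
    for page in pages:
        workflow_id = read_workflow_id(page)
        if workflow_id is not None:
            # A new cover page closes the previous batch and opens a new one.
            current = (workflow_id, [])
            batches.append(current)
        elif current is not None:
            current[1].append(page)
        # Assumption: pages before the first cover page are skipped.
    return batches
```

Each resulting batch pairs a workflow identifier with the pages that follow its cover page, which mirrors how a new barcode page ends one workflow and starts the next.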
Technical information This chapter provides troubleshooting and other technical information about using OmniPage 17. Please also read the Readme file and other help topics, or visit the Nuance web pages. Troubleshooting Although OmniPage is designed to be easy to use, problems sometimes occur. Many of the error messages contain self-explanatory descriptions of what to do – check connections, close other applications to free up memory, and so on.
• Use the software that came with your scanner to verify that the scanner works properly before using it with OmniPage.
• Make sure you have the correct drivers for your scanner, printer, and video card.
• Visit Nuance’s web page through the Help menu and consult its scanner section for more information.
• Defragment your hard disk. See Windows online Help for more information.
• If OmniPage runs in safe mode, then a device driver on your system may be interfering with OmniPage operation. Troubleshoot the problem by restarting Windows in Step-by-Step Confirmation mode. See Windows online Help for more information. Text does not get recognized properly Try these solutions if any part of the original document is not converted to text properly during OCR: • Look at the page image and ensure that all text areas are enclosed by text zones.
• Make sure the correct document languages are selected in the OCR panel of the Options dialog box. Only languages included in the document should be selected. In particular, setting an Asian language for non-Asian texts (and vice versa) is likely to produce unusable results.
• Recognition results in Japanese, Korean and Chinese can be viewed and saved only if your system has East Asian language support. See “Asian language recognition” on page 61.
• Ask senders to transmit files directly to your computer via fax modem if you both have one. You can save fax images as image files and then load them into OmniPage. See “Input from image files” in the Processing documents Chapter. System or performance problems during OCR Try these solutions if a crash occurs during OCR or if processing takes a very long time: • Check image quality. Consult your scanner documentation on ways to improve the quality of scanned images.
OmniPage Documents PDF (Normal), Edited, with image on text, with image substitutes RTF Word 6.
THIRD PARTY LICENSES/NOTICES The word verification, spelling and hyphenation portions of this product are based in part on Proximity Linguistic Technology. The Proximity Hyphenation System © Copyright 1988. All Rights Reserved. Franklin Electronic Publishers, Inc. The Proximity/Merriam-Webster American English Linguibases. © Copyright 1982, 1983, 1987, 1988 Merriam-Webster Inc. © Copyright 1982, 1983, 1987, 1988 Franklin Electronic Publishers, Inc.
The Independent JPEG Group's software, copyright © 1991-1995, Thomas G. Lane. Portions of this software are copyright © 2006 The FreeType Project. All rights reserved. FreeType 2.3.1, Turner, Wilhelm, Lemberg. Zlib copyright © 1995-1998 Jean-loup Gailly and Mark Adler. This product was developed using Kakadu software. Export Options dialog controls from Allan Nielsen, Supergrid control, copyright © 1999. This product includes software developed by the OpenSSL project