LEGAL NOTICES Copyright © 2002 by ScanSoft, Inc. All rights reserved. No part of this publication may be transmitted, transcribed, reproduced, stored in any retrieval system or translated into any language or computer language in any form or by any means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise, without prior written consent from the Legal Department at ScanSoft, Inc., 9 Centennial Drive, Peabody, Massachusetts 01960, USA.
CONTENTS
WELCOME vii
Using this Guide viii
Getting online help ix
Online HTML Help ix
Context-Sensitive Help ix
Tech Notes x
Glossary x
1 INSTALLATION AND SETUP 11
System requirements 12
Installing TextBridge Pro 13
Setting up your scanner with TextBridge Pro 14
How to start the program 16
Registering your software 17
New features in TextBridge Pro 11 18
2 INTRODUCTION 19
What is optical character recognition 20
TextBridge Pro’s OCR capabilities 20
Documents in T
The Formatting toolbar 24
The TextBridge Toolbox 25
Managing documents 26
Thumbnail view 26
Detail view 27
Customizing columns in Detail view 28
Deleting pages from a document 28
Printing a document 28
Closing a document 29
TextBridge Documents 29
Why save to TXD 29
How to save to TXD 30
Settings 30
3 TUTORIAL: PROCESSING DOCUMENTS 33
Quick Start Guide 34
Loading and recognizing sample image files 34
Scanning and recognizing a single page 34
Processing documents using
Input from image files 48
Input from scanner 49
Scanning with an ADF 50
Scanning long documents without an ADF 51
Describing the layout of the document 51
Manual zoning 53
Working with zones 53
Zone properties 54
Table grids in the image 56
Using zone templates 57
4 PROOFING AND EDITING 59
Proofreading OCR results 60
Checking recognized text against original 62
User dictionaries 63
The editor display and views 63
Text and image editing 64
Page outline 66
5 SAVING AND E
Testing TextBridge Pro 77
Low memory problems 78
Low disk space problems 78
Supported file types 79
File types for opening and saving images 79
File types for saving recognition results 80
OCR problems 81
Text does not get recognized properly 81
Problems with fax recognition 82
System or performance problems during OCR 82
Uninstalling the software 83
Welcome
Welcome to TextBridge Pro 11™, and thank you for using our software! The following documentation has been provided to help you get started and give you an overview of the program.
This User’s Guide
This guide introduces you to using TextBridge Pro 11. It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information.
USING THIS GUIDE
This guide is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows documentation if you have questions about how to use dialog boxes, menu commands, scroll bars, drag and drop functionality, shortcut menus, and so on. We also assume you are familiar with your scanner and its supporting software, and that the scanner is installed and working correctly before it is set up with TextBridge Pro 11.
GETTING ONLINE HELP In addition to using this guide, you can use TextBridge Pro’s online Help to learn about features, settings, and procedures. Online Help is available after you install TextBridge Pro. Online HTML Help Open TextBridge Pro’s online Help at its top level by choosing TextBridge Pro Help Topics at the top of the Help menu. This allows you to see topics arranged in a Table of Contents, search an alphabetical list of keywords or make full-text searches through the topics.
Tech Notes ScanSoft’s web site at www.scansoft.com contains Tech Notes on commonly reported issues using TextBridge Pro 11. Web pages may also offer assistance on the installation process and troubleshooting. Glossary This guide does not include a glossary. The online Help has a comprehensive glossary, with its own alphabetical index and a table of contents. Please consult it if you want to find the meaning of a term used in this guide or in the program.
1 Installation and setup This chapter provides information on installing and starting TextBridge Pro 11.
SYSTEM REQUIREMENTS
You need the following minimum system requirements to install and run TextBridge Pro 11:
• A computer with a Pentium or higher processor
• Microsoft Windows 95, Windows 98, Windows Me, Windows 2000, or Windows NT 4.0
• 32MB of memory (RAM), 64MB recommended
• 75MB of free hard disk space for the application files plus 10MB working space during installation
• 9MB for Microsoft Installer (MSI) if not present and 44MB for Internet Explorer if not present.
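As a rough worked example of the disk-space figures listed above, the following Python sketch adds them up for a given system. The constant and function names are illustrative only and are not part of TextBridge Pro.

```python
# Disk-space figures quoted in the system requirements (in MB).
APP_FILES_MB = 75
INSTALL_WORKSPACE_MB = 10
MSI_MB = 9
INTERNET_EXPLORER_MB = 44


def required_disk_mb(has_msi: bool, has_internet_explorer: bool) -> int:
    """Return the free disk space needed during installation, in MB."""
    total = APP_FILES_MB + INSTALL_WORKSPACE_MB
    if not has_msi:
        total += MSI_MB
    if not has_internet_explorer:
        total += INTERNET_EXPLORER_MB
    return total


print(required_disk_mb(has_msi=True, has_internet_explorer=True))    # 85
print(required_disk_mb(has_msi=False, has_internet_explorer=False))  # 138
```

In the worst case, with neither Microsoft Installer nor Internet Explorer present, about 138MB of free disk space is needed during installation.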
INSTALLING TEXTBRIDGE PRO
TextBridge Pro 11’s installation program takes you through installation with instructions on every screen. Before installing TextBridge Pro:
• Make sure your scanner is connected, turned on, and compatible with your system.
• Close all other applications, especially anti-virus programs.
• Log into your computer with administrator privileges if you are installing on Windows 2000 or Windows NT.
To install TextBridge Pro:
1. Insert TextBridge Pro’s CD-ROM in the CD-ROM drive.
SETTING UP YOUR SCANNER WITH TEXTBRIDGE PRO
All files needed for scanner setup and support are copied automatically during the program’s installation. Before you use TextBridge Pro 11 for scanning, your scanner should be correctly installed and tested. Scanner installation and setup are done through the Scanner Wizard. You can start this yourself, as described below. Otherwise, the Scanner Wizard appears when you first try to perform scanning from TextBridge Pro 11.
• The page sizes that the Scanner Wizard believes your scanner supports are listed in the window. To make any changes to the page sizes, click on Advanced, make the changes and then click on Next.
• Insert a page with text but no pictures into your scanner. Click on Next to begin a scan in black and white mode.
• If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
HOW TO START THE PROGRAM
To start TextBridge Pro 11 do one of the following:
• Click Start in the Windows taskbar and choose Programs → ScanSoft TextBridge Pro 11.0 → TextBridge Pro 11.0.
• Double-click the TextBridge Pro icon in the program’s installation folder or on the Windows desktop if you placed it there.
• Double-click a TextBridge Document (TXD) icon or file name; the clicked document is loaded into the program. See TextBridge Documents in chapter 2.
When the form is filled in and you click Send, the program looks for an Internet connection and immediately performs the registration online. If you did not register the software during installation, you will be periodically invited to register later. You can go to www.scansoft.com to register online. Click on Support and from the main support screen choose Register in the left-hand column. For a statement on the use of your registration data, please see ScanSoft’s Privacy Policy.
• OCR Proofreader – Jump through the suspect words in a document and handle them one after the other in a dialog box. Previously, suspect words were only marked. See page 60.
• OCR Verifier – This lets you compare any recognized word with its appearance in the original image as you edit and reformat the document. See page 62.
• TextBridge Documents – These TXDs open the way to deferred or distributed processing. They store all images along with recognition results.
2 Introduction You probably use your computer for business correspondence, preparing reports, handling data and an ever-increasing number of other uses. The challenge is that, in spite of the digital revolution, certain sources of information still circulate in printed, paper form and cannot be used immediately in a computer. For example, if you want to incorporate information from a magazine article in a report you are preparing, you somehow have to get the text from the article into your computer.
WHAT IS OPTICAL CHARACTER RECOGNITION Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form character shapes. These present a picture of the text on a page. During OCR, TextBridge Pro 11 analyzes the character shapes in an image and defines solutions to produce editable text.
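To make the idea of pixels forming character shapes more concrete, here is a deliberately tiny Python sketch. It compares a small bitmap against three stored shapes and picks the closest one. This is an illustration of the general principle only; it is not the recognition method used by TextBridge Pro, and the bitmaps and function names are invented for the example.

```python
# Toy 3x5 bitmaps: '#' is a dark pixel, '.' is a light pixel.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def pixel_distance(a, b):
    """Count the pixel positions at which two equally sized bitmaps differ."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def recognize(bitmap):
    """Return the template character whose shape is closest to the bitmap."""
    return min(TEMPLATES, key=lambda ch: pixel_distance(bitmap, TEMPLATES[ch]))


# A slightly damaged "T": one pixel of the stem is missing.
scanned = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(scanned))  # T
```

The damaged shape differs from the stored "T" in one pixel, from "I" in three and from "L" in nine, so "T" is chosen. Broken or touching characters make such comparisons harder, which is one reason image quality matters for OCR accuracy.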
Documents in TextBridge Pro TextBridge Pro 11 handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it. A document in TextBridge Pro consists of one image for each document page. After you perform OCR, the document will also contain recognized text, displayed in the Text Editor, possibly along with graphics and tables.
THE TEXTBRIDGE DESKTOP
The TextBridge desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Original Image area and the Text Editor. The Document Manager has two tabbed panels: Thumbnail view and Detail view. The Original Image area has an Image toolbar and the Text Editor has a Formatting toolbar.
[Figure: the TextBridge desktop, with callouts for the Formatting toolbar, the Standard toolbar and the TextBridge Toolbox. The current page has a pale border.]
The TextBridge Toolbox lets you control processing. It can have three states, depending on which of the three tab buttons on the left is clicked. In the picture, we display its appearance for Manual OCR. We show the program with a three-page document. Page one is the current page, which has been recognized and proofed. Page two has been recognized but not proofed yet. Page three has been acquired and manually zoned, but not recognized yet. The icons at the bottom right of the thumbnail images show page status.
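The three-page example above can be pictured as a small data model. The Python sketch below is only an aid to understanding: the class names, status names and the helper method are assumptions made for this illustration and do not describe how TextBridge Pro stores documents internally.

```python
from dataclasses import dataclass, field
from enum import Enum


class PageStatus(Enum):
    ACQUIRED = 1    # image loaded, nothing else done
    ZONED = 2       # zones drawn manually, not yet recognized
    RECOGNIZED = 3  # OCR performed, not yet proofed
    PROOFED = 4     # OCR performed and proofread


@dataclass
class Page:
    number: int
    status: PageStatus


@dataclass
class Document:
    pages: list = field(default_factory=list)
    current_page: int = 1

    def pages_needing_ocr(self):
        """Return the numbers of pages that have not been recognized yet."""
        return [p.number for p in self.pages
                if p.status in (PageStatus.ACQUIRED, PageStatus.ZONED)]


# The document shown in the picture: three pages in different states.
doc = Document(pages=[
    Page(1, PageStatus.PROOFED),
    Page(2, PageStatus.RECOGNIZED),
    Page(3, PageStatus.ZONED),
])
print(doc.pages_needing_ocr())  # [3]
```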
The Image toolbar
The Image toolbar contains buttons that allow you to zoom in or out on the current image or to rotate it. They also allow you to work with zones and table dividers. See chapter 3, Manual zoning and Table grids in the image. Here we summarize the purpose of the buttons. The Image toolbar can be floated (that is, undocked and moved anywhere on the desktop). It can be docked to any edge of the Original Image area. Draw rectangular zones. Draw irregular zones.
The TextBridge Toolbox This Toolbox lets you drive the processing. By default it is located along the top of the TextBridge desktop, just above the working areas. It can be floated and also be docked along the bottom of the desktop. It has three tabs on the left: AutoOCR™, Manual OCR and OCR Wizard. Click one to see its controls in the Toolbox. The picture at the beginning of this section showed the TextBridge desktop with the Manual OCR toolbar. The AutoOCR toolbar looks like this.
MANAGING DOCUMENTS The Document Manager is situated on the left of the TextBridge desktop. It has two tabbed panels: Thumbnail view and Detail view. Click a tab to see its view. Both views summarize the pages in the document and are synchronized: the current and selected pages remain the same when you switch views. Our pictures show the two views with the same four-page document. Pages 1 and 2 are selected and page 4 is the current page, that is, the one shown in the Original Image area.
Detail view This facility is new to TextBridge Pro 11. It provides an overview of your document with a table. Each row represents one page. Columns present statistical or status information for each page, and (where appropriate) document totals. The picture below shows the default columns on the left and four columns which a user has specified. Move the cursor onto the page’s status icon to see a thumbnail of the page. This shows the number of zones of each type on the page.
Customizing columns in Detail view You can specify which columns of information you want to see in Detail view. Click Customize Details... in the View menu for the following dialog box: This item is highlighted. Click a checkbox to select the item. Image sizes are expressed in pixels. Highlight an item and use these arrows to change the order of columns. Define a width for the highlighted item. Define which columns should appear, their widths, and column order.
Closing a document Choose Close in the File menu to close a document. You are prompted to save your document if you have not saved it or you have modified it since the last save. See the next section on saving the document as a TextBridge Document (*.TXD). TEXTBRIDGE DOCUMENTS The TextBridge Document is the program’s proprietary file type; it has the extension .TXD. It is one of the file types offered when saving a document to file.
You want to build up an archive of recognized documents whose original images remain accessible. The recognized texts allow searching by keywords and other document retrieval techniques. Note Recognition results should be saved from TXD files before installing any TextBridge Pro upgrade. These files may not be upward compatible with newer TXD file formats, or possibly only the images will be retained when the files are upgraded.
multi-page documents, with or without an Automatic Document Feeder (ADF). You can change scanner setup settings or install a new scanner or change the default scanner. Direct OCR This feature provides OCR services directly from your favorite word processor or similar application. Use this panel to register and unregister applications for Direct OCR and to enable or disable this service. You can also specify automatic or manual zoning and whether proofreading is desired or not.
Text Editor Use this to show or hide some features in the Text Editor, to define the unit of measurement to be used and to turn word wrapping on or off. Note Some settings have an effect only on future recognition. Examples are the recognition languages, a user dictionary and scanner brightness. These settings should be correctly adjusted before you start processing. To have changes in these settings applied to already recognized pages, you will have to rerecognize them.
3 Tutorial: Processing documents This chapter describes different ways you can process a document and also provides information on key parts of this processing.
QUICK START GUIDE This topic takes you step-by-step through the basic OCR process. Loading and recognizing sample image files You will find sample image files in the program folder, both single-page and multi-page files. First try reading these files using the procedure presented below, except for the references to a scanner. See Input from image files for more information on acquiring the images.
What you do | What happens
1. Set up your scanner using the Scanner Wizard, if this is not already done. | Configures TextBridge Pro 11 to work with your scanner.
2. Select Start → Programs → ScanSoft TextBridge Pro 11.0 → TextBridge Pro 11.0. | Opens TextBridge Pro 11 on your computer.
3. Place the document correctly in your scanner. |
4. Check the three tab buttons to the left of the TextBridge Toolbox. The AutoOCR button should be selected. If not, click on it. |
Here is an overview of the processing methods you can use. You will find step-by-step guidance for each of them in the following pages. Using the OCR Wizard The OCR Wizard guides you through the selection of settings and commands by asking you questions. It then launches automatic processing. This is a good way to get started if you are new to TextBridge Pro. Automatically The fastest and easiest way to process documents is to let TextBridge Pro do it automatically for you.
PROCESSING DOCUMENTS USING THE OCR WIZARD The OCR Wizard takes you through six settings panels, guiding you to make settings for your document and then launching automatic processing. Context-sensitive help is available for all Wizard panels. The OCR Wizard can run only when there is no document open in TextBridge Pro. Click the OCR Wizard tab in the TextBridge Toolbox and click the Wizard button to see the first wizard screen: 1. The first panel lets you define your document source: scanner or image file.
3. The third panel (shown below) lets you define recognition languages and choose the OCR method. Languages with dictionary support are marked with an icon. 4. The fourth panel lets you define the formatting level to be applied to your document for display and export. See chapter 4, The editor display and views, for more information. 5. The fifth panel asks if you want to proofread the text before export. If you choose Yes you can also edit the text before saving. 6.
manually or change other settings and then use manual processing to rerecognize single pages from the document. You can add pages with automatic or manual processing. Note The Wizard panels present settings as they were last set in the program. Also, TextBridge Pro will remember the settings you make in the OCR Wizard panels and apply them to future automatic or manual processing, until you change them.
PROCESSING DOCUMENTS AUTOMATICALLY Automatic processing provides an efficient way of handling documents, especially larger ones. First you select all settings needed, then you can use the AutoOCR™ toolbar in the TextBridge Toolbox to process a new document from start to finish or to restart and finish processing on an open document. 1. Click the AutoOCR tab in the TextBridge Toolbox to display the AutoOCR toolbar. 2. Select the desired Get Page command in the drop-down list.
6. Click Start or choose Start in the Process menu. Each page of the document is processed and finished one after the other. The program may perform tasks simultaneously; for instance, it may start loading and recognizing a new page as you proofread the previous page. Command buttons Start: This lets you begin automatic processing on a new document. Stop: This lets you interrupt automatic processing. You may do this if you find that some settings need to be changed. Then the Start button changes to Finish.
PROCESSING DOCUMENTS MANUALLY Manual processing gives you more precise control over the way your pages are handled. You can process the document page-by-page with different settings for each page. The program also stops between each step: acquiring images, performing recognition, exporting. This lets you, for instance, draw zones manually on each page. You start each step in the process by clicking the buttons on the Manual OCR toolbar. 1.
6. Select a value in the Perform OCR drop-down list to describe the layout of the incoming pages. This value has an influence if auto-zoning runs on any pages. You can also select a template to have its zones placed on the current page. For more detail see the sections Describing the layout of the document and Using zone templates. 7. Click the Perform OCR button to have the current page recognized.
PROCESSING A DOCUMENT AUTOMATICALLY AND FINISHING IT MANUALLY When you have a large document with only a few pages needing special attention, you do not have to manually process the whole document. You can process it automatically and view results in the Text Editor. You can determine which pages are in order, and which need different settings or some manual zoning. Then you can switch to manual processing to adjust settings and zones and rerecognize just those pages. 1.
PROCESSING FROM OTHER APPLICATIONS
You can use the Direct OCR™ feature to call on the recognition services of TextBridge Pro while you work in your usual word-processor or other application. First you must establish the direct connection with the application. Then, two items in its File menu open the door to OCR facilities.
How to set up Direct OCR
1. Start the application you want connected to TextBridge Pro.
6. If proofing was specified, this follows recognition. Then the recognized text is placed at the cursor position in your application, with the formatting level specified by Acquire Text Settings... . Note If TextBridge Pro is running when Direct OCR is called from a target application, a second instance of TextBridge Pro is launched. How to use TextBridge Pro 11 with your PaperPort software PaperPort® is a paper management software product from ScanSoft.
PROCESSING DOCUMENTS WITH SCHEDULE OCR You can schedule OCR jobs to be performed automatically at any time within the following 24 hours. Each job handles one document. The document pages can come from a scanner with an ADF or from image files. You do not have to be present at your computer at job start time, nor does TextBridge Pro have to be running. It does not matter if your computer is turned off after the job is set up, so long as it is running at job start time.
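The 24-hour window for scheduled jobs can be expressed as a simple check. The Python sketch below is an illustration under that one stated rule; the function name is hypothetical and the program performs its own validation when you set up a job.

```python
from datetime import datetime, timedelta


def is_valid_job_start(start: datetime, now: datetime) -> bool:
    """Return True if start lies within the 24 hours following now."""
    return now < start <= now + timedelta(hours=24)


now = datetime(2002, 3, 1, 17, 30)
print(is_valid_job_start(datetime(2002, 3, 2, 6, 0), now))  # True
print(is_valid_job_start(datetime(2002, 3, 3, 6, 0), now))  # False
```

A job set at 17:30 to start at 06:00 the next morning falls inside the window; one set for the morning after that does not.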
DEFINING THE SOURCE OF PAGE IMAGES
There are two possible image sources: image files and a scanner. There are two main types of scanner: flatbed and sheetfed. A scanner may have a built-in or added Automatic Document Feeder (ADF), which makes it easier to scan multi-page documents. The images from scanned documents can be input directly into TextBridge Pro or may be saved with the scanner’s own software to an image file, which TextBridge Pro can later open.
Normally the Add button places each file at the bottom of the file list. To place a file at a different location, highlight a file in the list. The new file will be added immediately below the lowest highlighted file. Input from scanner You must have a functioning, supported scanner correctly installed with TextBridge Pro. See chapter 1, Setting up your scanner with TextBridge Pro for more information. You have a choice of scanning modes.
Brightness and contrast Good brightness and contrast settings play an important role in OCR accuracy. Set these in the Scanner panel of the Options dialog box. The diagram illustrates an optimum brightness setting. After loading an image, check its appearance. If characters are thick and touching, lighten the brightness. If characters are thin and broken, darken it. Then rescan the page.
You can scan double-sided documents with an ADF. A duplex scanner will manage this automatically. For non-duplex scanners, select ‘Scan double-sided pages’ in the Scanner panel of the Options dialog box. Then you can scan the document in just a few passes, with even pages grouped together and odd pages also grouped. TextBridge Pro will merge the pages for you. Scanning long documents without an ADF You can scan multi-page documents efficiently from a flatbed scanner, even without an ADF.
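The merging of double-sided pages scanned in two passes on a non-duplex ADF, described above, amounts to interleaving the odd-page group with the even-page group. The Python sketch below shows the idea. The function name and the even_reversed option are assumptions for this illustration (a stack that is simply turned over delivers its even pages last page first); the guide does not state which order TextBridge Pro expects.

```python
def merge_double_sided(odd_pages, even_pages, even_reversed=False):
    """Interleave two scan passes into normal page order.

    odd_pages:     images of pages 1, 3, 5, ... in scan order
    even_pages:    images of pages 2, 4, 6, ... in scan order
    even_reversed: assumed option for an even pass scanned last page first
    """
    if even_reversed:
        even_pages = list(reversed(even_pages))
    if len(odd_pages) - len(even_pages) not in (0, 1):
        raise ValueError("page counts of the two passes do not match")
    merged = []
    for i, odd in enumerate(odd_pages):
        merged.append(odd)
        if i < len(even_pages):
            merged.append(even_pages[i])
    return merged


print(merge_double_sided(["p1", "p3", "p5"], ["p2", "p4"]))
# ['p1', 'p2', 'p3', 'p4', 'p5']
print(merge_double_sided(["p1", "p3"], ["p4", "p2"], even_reversed=True))
# ['p1', 'p2', 'p3', 'p4']
```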
Single column, no table Choose this setting if your pages contain only one column of text and no table. Business letters or pages from a book are normally like this. Choose it also for a page with words or numbers arranged in columns if you do not want these placed in a table or decolumnized or treated as separate columns. Graphics may be detected.
MANUAL ZONING
Zones define areas on the page to be processed. Zones are rectangular or irregular (with sides formed by vertical and horizontal lines). Zones cannot overlap. They have a zone number in the top left corner and a zone type icon top right. Click in a zone to select it. Use Shift+clicks for a multiple selection. Current and selected zones are shaded. Click outside a zone to remove the selection. Zones appear on an original image in the following cases:
• The page has been recognized.
Subtract from zone Click this to subtract irregular parts from an existing zone or split a zone into smaller ones. You cannot move or resize existing zones when this tool is active. You cannot use this with a table type zone. Reorder zones Click this for the zone reordering tool. Then click in zones in the desired reading order. For your order to be respected, choose ‘Use current zones only’ and avoid having multiple-column or auto-detect zone types on the page.
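The rule that zones cannot overlap, stated at the start of the manual zoning topic, can be illustrated for rectangular zones with a short Python sketch. The class and function names are hypothetical, coordinates are assumed to be in image pixels, and irregular zones are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RectZone:
    left: int
    top: int
    right: int   # exclusive right edge
    bottom: int  # exclusive bottom edge


def zones_overlap(a: RectZone, b: RectZone) -> bool:
    """Return True if two rectangular zones share any area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def find_overlaps(zones):
    """Return index pairs (i, j) of zones that overlap each other."""
    return [(i, j)
            for i in range(len(zones))
            for j in range(i + 1, len(zones))
            if zones_overlap(zones[i], zones[j])]


zones = [
    RectZone(0, 0, 100, 50),     # zone 0
    RectZone(0, 50, 100, 120),   # zone 1, touches zone 0 without overlapping
    RectZone(80, 40, 150, 60),   # zone 2, cuts into both
]
print(find_overlaps(zones))  # [(0, 2), (1, 2)]
```

Zones that merely share an edge, like zones 0 and 1 here, are not treated as overlapping in this sketch.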
Table zone Use this to have the zone contents treated as a table. Table grids can be automatically detected, or placed manually as described in the next section. Table zones must be rectangular. The Text Editor displays the table in an editable grid. You can choose whether to export tables in grids or in columns separated by tabs. Auto-detect zone Use this to let the program decide the zone type. To do this, auto-zoning runs, which may also result in changed zone order on the page.
TABLE GRIDS IN THE IMAGE After automatic processing you may see table zones placed on a page. They are denoted with a table zone icon in the top right corner of the zone. To change a zone to or from a table zone, use its shortcut menu. You can also draw a table type zone. If there is already a table zone on the page, select it, then draw the new rectangular zone. It will inherit the table type. Otherwise draw a rectangular zone and use its shortcut menu to change it to a table type.
Remove/replace all dividers Click this tool and click inside a table zone. Its dividers will all disappear. Click again to have dividers automatically (re)detected. Divider placement usually occurs during recognition; clicking twice with this tool lets you see and edit the dividers before recognition. USING ZONE TEMPLATES A template is a set of zones, their properties and reading order, stored in a file. A zone template file can be loaded to have template zones used during recognition.
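To picture what "a set of zones, their properties and reading order, stored in a file" might look like, the Python sketch below writes and reads a small template as JSON. This is purely illustrative: the real zone template file format is not described in this guide, and the field names, zone type labels and file name used here are invented.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical template: zones are listed in reading order.
template = {
    "name": "invoice-layout",
    "zones": [
        {"type": "text", "rect": [40, 30, 560, 120]},
        {"type": "table", "rect": [40, 140, 560, 600]},
        {"type": "graphic", "rect": [420, 620, 560, 760]},
    ],
}


def save_template(data, path):
    """Write the template to a file as JSON text."""
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_template(path):
    """Read a template back from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "invoice-layout.json"
    save_template(template, path)
    reloaded = load_template(path)

print([zone["type"] for zone in reloaded["zones"]])
# ['text', 'table', 'graphic']
```

Because the zones are kept in a list, their reading order survives the round trip to disk, which is the property a template needs in order to be reused on later pages.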
How to unload a template Select a non-template setting for layout description in the Perform OCR drop-down list. The template zones are not removed from the current or existing pages, but template zones will no longer be used for future processing. You can also open the Zone Template Files dialog box, select [none] and click the Set As Current button. In this case, the layout description setting returns to Automatic.
4 Proofing and editing Recognition results are placed in the Text Editor.
The Text Editor offers four views for displaying its pages. You can switch freely from one view to another. These provide different levels of formatting. The views are: No Formatting view This displays plain decolumnized text in a single font and font size. Retain Fonts and Paragraphs view This displays decolumnized text with font and paragraph styling. True Page view This view tries to conserve as much of the formatting of the original document as possible. Character and paragraph styling is retained.
This is what TextBridge Pro thought the word was. This tells why the word is suspected. The image of the suspect word is highlighted. This window shows the relevant part of the original image. Click inside it to enlarge or reduce the display. Drag a corner or the bottom of the dialog box to resize it. 3. If the recognized word is correct, click Ignore or Ignore All to move to the next suspect word. Click Add to add it to the current user dictionary and move to the next suspect word. 4.
CHECKING RECOGNIZED TEXT AGAINST ORIGINAL After performing OCR, you can compare any part of the recognized text against the corresponding part of the original image, to verify that the text was recognized correctly. Work as follows: 1. Double-click any word in the Text Editor or select a word and choose Verify Text in the Tools menu. The Verify Text window opens and shows a picture of the original word and its surrounding area. Modify the word in the Text Editor as necessary. 2.
USER DICTIONARIES The program has built-in dictionaries for many languages. These assist during recognition and may offer suggestions during proofing. They can be supplemented by user dictionaries. You can save any number of user dictionaries, but only one can be loaded at a time. Your user dictionaries from Microsoft Word are also available; a dictionary called Custom is the default user dictionary for Microsoft Word.
measurement for the program and a word wrap setting for use in all Text Editor views except No Formatting view. Here are the main differences between the views: No Formatting view This displays plain decolumnized left-aligned text in a single font and font size, with the same line breaks as in the original document. Most formatting buttons and dialog boxes are disabled. Rulers are not displayed. You may find this view convenient for verifying and editing the text.
Formatting toolbar or the Font dialog box from the Format menu. The latter also offers subscripts, superscripts and colored text or backgrounds. In No Formatting view you can use the Formatting toolbar to specify one font type and size to be applied to the whole document. This is not transferred to other views; their previous settings are restored. Open the Font Matching dialog box from the OCR panel of the Options dialog box to specify which fonts to use for texts entering the Text Editor.
of text in table cells with the alignment buttons in the Formatting toolbar and the tab controls in the ruler. When saving the document to file, you can choose whether to have the tables exported in grids or as tab separated columns. PAGE OUTLINE The Page outline window lets you change the order of areas on a page or of paragraphs inside areas. It also lets you define how text should flow if you export with Retain Flowing Columns view. Open the page outline window from the View menu.
5 Saving and exporting Once you have acquired at least one image for a document, you can export the image(s) to file. Once you have recognized at least one page, you can export recognition results to a target application by: 1. Saving to file 2. Copying a document to the Clipboard 3. Sending a document as a mail attachment. The document remains in TextBridge Pro after export.
PREPARING RECOGNITION RESULTS FOR EXPORT Text is exported to file, Clipboard or mail with the formatting level defined by the view set in the Text Editor at export time, if that is possible. However, some export file types and target applications cannot support all formatting elements. You may be warned if there is a mismatch and offered the highest permissible view. You can accept that, or cancel export, set a different view and restart the export.
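The fallback to "the highest permissible view" can be sketched as picking the lower of two formatting levels. In the Python example below, the ranking of the four Text Editor views from least to most formatting is an assumption made for illustration, as is the function name; the program decides for itself what each export file type supports.

```python
# Text Editor views, assumed here to be ranked from least to most formatting.
VIEWS = [
    "No Formatting",
    "Retain Fonts and Paragraphs",
    "Retain Flowing Columns",
    "True Page",
]


def export_view(requested: str, highest_supported: str) -> str:
    """Return the view used for export: the requested one if the target
    supports it, otherwise the highest view the target does support."""
    level = min(VIEWS.index(requested), VIEWS.index(highest_supported))
    return VIEWS[level]


print(export_view("True Page", "Retain Fonts and Paragraphs"))
# Retain Fonts and Paragraphs
print(export_view("No Formatting", "True Page"))
# No Formatting
```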
SAVING TO FILE You can save recognized pages and original images to disk in a wide variety of file types. For tables of file types, see chapter 6, File types for opening and saving images and File types for saving recognition results. Saving original images 1. Choose Save Image... in the File menu. In the dialog box that appears, select a folder location and a file type for your images. Type in a file name. 2. Select to save the current image only or all images in the document.
Saving recognition results 1. Choose Save As... in the File menu, or click the Export Results button in the Manual OCR toolbar with Save as File selected in the drop-down list. 2. The Save As dialog box appears, as shown in its expanded form. Click Advanced to open the lower panel and Basic to close it. Select this to automatically open the saved file in its target application. Choose from: Create one file for all pages Create one file per page Create a new file at each blank page.
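The three file options listed above differ only in how pages are grouped into output files. The Python sketch below illustrates the grouping. The mode names and the function are hypothetical, and the sketch assumes that blank pages act purely as separators and are not themselves exported; the guide does not say what the program does with them.

```python
def group_pages(pages, mode, is_blank=lambda page: page == ""):
    """Group pages into output files according to the chosen option."""
    if mode == "one_file":
        return [list(pages)]
    if mode == "file_per_page":
        return [[page] for page in pages]
    if mode == "split_at_blank":
        files, current = [], []
        for page in pages:
            if is_blank(page):
                if current:          # close the file in progress
                    files.append(current)
                    current = []
            else:
                current.append(page)
        if current:
            files.append(current)
        return files
    raise ValueError(f"unknown mode: {mode}")


pages = ["a", "b", "", "c", "", "", "d"]
print(group_pages(pages, "split_at_blank"))  # [['a', 'b'], ['c'], ['d']]
print(len(group_pages(pages, "file_per_page")))  # 7
```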
Note Graphics and formatting are saved in the document only if the selected file type supports them. The formatting level for export is the Editor view set at saving time. You will be warned if the formatting level is not supported by the export file type. Note If more than one export file is created, TextBridge Pro will append a numerical suffix to your file name to create unique file names.
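As an illustration of the numerical suffixes mentioned in the note above, the following Python sketch derives unique names from one file name. The exact suffix format that TextBridge Pro uses is not stated in this guide, so the plain 1, 2, 3 numbering and the function name are assumptions.

```python
from pathlib import PurePath


def numbered_names(file_name: str, count: int):
    """Return count unique file names built from file_name.

    A single file keeps its name; several files get a numerical suffix
    inserted before the extension (assumed format).
    """
    if count <= 1:
        return [file_name]
    path = PurePath(file_name)
    return [str(path.with_name(f"{path.stem}{i}{path.suffix}"))
            for i in range(1, count + 1)]


print(numbered_names("memo.txt", 3))  # ['memo1.txt', 'memo2.txt', 'memo3.txt']
print(numbered_names("memo.txt", 1))  # ['memo.txt']
```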
If you first save the document as a TextBridge Document (for instance as memo.txd), then modify it and later save it to a text file (for instance as memo.txt), then modify it again and click Save, the recent changes are saved to the memo.txt file, not to the TXD. When you close the document or exit the program, you will be prompted to save the document if it has not been saved as a TextBridge Document, or there are changes since the last TXD save.
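The memo.txd and memo.txt scenario above can be followed step by step with a small model. The Python sketch below tracks two things: the file that Save writes to, and whether the TextBridge Document copy is up to date. The class and method names are hypothetical and serve only to restate the rules in this section.

```python
class SaveState:
    """Tracks where Save writes to and whether the TXD copy is current."""

    def __init__(self):
        self.last_target = None      # file name used by the most recent save
        self.txd_up_to_date = False  # no TextBridge Document saved yet

    def modify(self):
        self.txd_up_to_date = False

    def save_as(self, file_name):
        self.last_target = file_name
        if file_name.lower().endswith(".txd"):
            self.txd_up_to_date = True

    def save(self):
        """Save again to the file used by the most recent save."""
        if self.last_target is None:
            raise ValueError("nothing saved yet; use save_as first")
        self.save_as(self.last_target)
        return self.last_target

    def prompts_on_close(self):
        return not self.txd_up_to_date


doc = SaveState()
doc.save_as("memo.txd")
doc.modify()
doc.save_as("memo.txt")
doc.modify()
print(doc.save())              # memo.txt
print(doc.prompts_on_close())  # True
```

The final Save goes to memo.txt, and because the TXD no longer holds the latest changes, closing the document leads to a prompt.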
SENDING A DOCUMENT AS A MAIL ATTACHMENT
You can send recognition results as one or more files attached to a mail message if you have installed a MAPI-compliant mail application, such as Microsoft Outlook.
To send a document by e-mail:
• With automatic processing, select Send as Mail as the command in the Export Results drop-down list on the AutoOCR toolbar. The Send Mail dialog box appears as soon as the last available page in the document is recognized or proofed.
3. Your mail application appears with the attachment(s) in a new empty message. Attachments take the name used for the last save of the document in TextBridge Pro, or ‘Untitled from TextBridge’ if the document has not been saved. The appropriate file extension is added, along with numerical suffixes if there are multiple attachments.
4. Address your mail message, add message text as desired and click the Send button.
6 Technical information

This chapter provides troubleshooting and other technical information about using TextBridge Pro 11. Please also read the online Readme file and other help topics, or visit the ScanSoft web pages. The Scanner Information web page contains detailed and regularly updated information about scanner setup and support. The Readme file contains last-minute information relating to TextBridge Pro. Access to the Readme file and to ScanSoft’s web pages is provided in the Help menu.
TROUBLESHOOTING

Although TextBridge Pro is designed to be easy to use, problems sometimes occur. Many of the error messages contain self-explanatory descriptions of what to do – check connections, close other applications to free up memory, and so on. Sometimes that is all the troubleshooting help you need. Please see your Windows documentation for information on optimizing your system and application performance.
Testing TextBridge Pro

Restarting Windows 95, 98, 2000 or Me in safe mode or Windows NT in VGA mode allows you to test TextBridge Pro on a simplified system. This is recommended when you cannot resolve crashing problems or if TextBridge Pro has stopped running altogether. See Windows online Help for more information.

Note: Your scanner will not run with TextBridge Pro in safe mode or VGA mode, so do not test scanner problems in this configuration.
5. Launch TextBridge Pro and try performing OCR on an image. Use a known image file such as one of the supplied sample files.

Note: You can also run TextBridge Pro 11 from a command line in its own safe mode. Choose Start > Run, browse for the file TextBridge.exe and add the command line option /safe. This starts the program, but ignores previously stored settings and does not try to recover a document from an abnormal termination.
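If you need to start the program's own safe mode repeatedly, for example while troubleshooting, the same command line can be issued from a script. The Python sketch below is a minimal illustration only: the file name TextBridge.exe and the /safe option come from the note above, but the installation folder shown is an assumption and must be replaced with the actual location on your system.

```python
import subprocess
from pathlib import Path

# Assumption: an example install location. This guide does not state the
# actual path, so adjust it to where TextBridge.exe resides on your system.
ASSUMED_EXE = Path(r"C:\Program Files\ScanSoft\TextBridge Pro 11\TextBridge.exe")


def safe_mode_command(exe_path):
    """Return the argument list that starts TextBridge Pro with /safe."""
    return [str(exe_path), "/safe"]


def launch_safe_mode(exe_path=ASSUMED_EXE):
    """Start the program in its own safe mode without waiting for it to exit."""
    exe = Path(exe_path)
    if not exe.is_file():
        raise FileNotFoundError(f"TextBridge.exe not found at {exe}")
    return subprocess.Popen(safe_mode_command(exe))
```

Calling launch_safe_mode() has the same effect as typing the path followed by /safe in the Run dialog box.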
• Remove Windows applications that you do not use.
• Defragment your hard disk. See Windows online Help for instructions.
• Clear the cache for your web browser and limit its size.

SUPPORTED FILE TYPES

The program supports a wide range of file types, as detailed below.

File types for opening and saving images

File type      Extension   Multipage   Open / Save      B/W, Grayscale, Color
BMP, Bitmap    *.bmp       No          Open and Save    All
DCX            *.dcx       Yes         Open and Save    All
JPEG           *.
File types for saving recognition results

File type                       Extension     Format levels (Text Editor views)   Supports graphics
ASCII text¹                     *.txt/.csv    No Formatting view (NFV)            No
Excel (3.0 to 7.0, 97, 2000)    *.xls         NFV, RFP (Spreadsheet)              Yes
FrameMaker (5.5.3)              *.mif         All                                 Yes
Freelance Graphics              *.txt         No Formatting view (NFV)            No
Harvard Graphics                *.txt         No Formatting view (NFV)            No
                                *.htm         All                                 Yes²
PowerPoint 97                   *.rtf         All                                 Yes
Microsoft Publisher 98          *.rtf         All                                 Yes
Word for Windows (6.
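The table above determines when the program warns you that an export file type does not support the current formatting level or cannot hold graphics. As a rough illustration of how such a lookup works, the Python sketch below encodes only the complete, named rows of the table. The function name, the data layout and the warning texts are hypothetical; they do not reproduce the program's actual messages.

```python
# Rows copied from the table above. ALL stands for the table's "All" entry.
ALL = "ALL"
EXPORT_TYPES = {
    "ASCII text": {"views": {"NFV"}, "graphics": False},
    "Excel": {"views": {"NFV", "RFP"}, "graphics": True},
    "FrameMaker": {"views": ALL, "graphics": True},
    "Freelance Graphics": {"views": {"NFV"}, "graphics": False},
    "Harvard Graphics": {"views": {"NFV"}, "graphics": False},
    "PowerPoint 97": {"views": ALL, "graphics": True},
    "Microsoft Publisher 98": {"views": ALL, "graphics": True},
}


def export_warnings(file_type, editor_view, has_graphics):
    """Return a list of hypothetical warnings for the chosen export settings."""
    entry = EXPORT_TYPES[file_type]
    warnings = []
    # Formatting level: the Text Editor view set at saving time must be supported.
    if entry["views"] != ALL and editor_view not in entry["views"]:
        warnings.append(f"{file_type} does not support the {editor_view} formatting level")
    # Graphics are saved only if the file type supports them.
    if has_graphics and not entry["graphics"]:
        warnings.append(f"{file_type} cannot store graphics")
    return warnings


print(export_warnings("ASCII text", "RFP", has_graphics=True))
print(export_warnings("FrameMaker", "RFP", has_graphics=True))
```

The first call reports both a formatting-level mismatch and dropped graphics; the second returns an empty list because FrameMaker accepts all views and supports graphics.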
OCR PROBLEMS

This section contains information and solutions for possible OCR problems. It first offers suggestions for improving recognition accuracy, then for getting good results from fax input, and finally for resolving system or performance problems that arise during OCR.

Text does not get recognized properly

Try these solutions if any part of the original document is not converted to text properly during OCR:

• Look at the original page image and ensure that all text areas are enclosed by text zones.
• Check the glass, mirrors, and lenses on your scanner for dust, smudges, or scratches. Clean if necessary.

Note: TextBridge Pro only recognizes machine-printed text characters such as typewritten or laser-printed text. It can handle dot-matrix characters, though accuracy may be lower on draft-quality texts. It cannot read handprint or handwriting. However, it can retain signatures or other handwritten text as a graphic.
• Restart Windows 95, 98, Me or 2000 in safe mode, or Windows NT in VGA mode, and test TextBridge Pro by performing OCR on the included sample image files. See the section Testing TextBridge Pro.

If you are performing multiple tasks at once, such as recognizing and printing, OCR may take longer.

UNINSTALLING THE SOFTWARE

Sometimes uninstalling and then reinstalling TextBridge Pro will solve a problem.