TextBridge PRO User's Guide 8.
COPYRIGHT INFORMATION Copyright © 1996-1999 by ScanSoft Inc. All rights reserved. No part of this publication may be transmitted, transcribed, reproduced, stored in any retrieval system or translated into any language or computer language in any form or by any means, mechanical, electronic, magnetic, optical, chemical, manual, or otherwise, without the prior written consent of ScanSoft Inc., 9 Centennial Drive, Peabody, Massachusetts 01960. Printed in the United States.
CONTENTS
PREFACE
  About This Manual
    Organization of this manual
    Documentation conventions
  Other Reading Material
  Customer Support
1 INTRODUCTION
  TextBridge Pro Features and Benefits
    Productivity features unique to TextBridge Pro
    Other TextBridge Pro features
3 TEXTBRIDGE PRO TOOLS
  Main Window
    Main toolbar
    Preferences panel
    View area
  Toolbars
    Main toolbar
    Preview toolbar
    Training toolbar
  Preferences Panel
TUTORIALS
  TextBridge Pro Interface
  Sample Documents
  Tutorial Session 1: Automatic Operation
  Tutorial Session 2: Capturing Parts of a Document
  Tutorial Session 3: Interactive Training
  Tutorial Session 4: Instant Access OCR
  Tutorial Session 5: Document Recomposition
    Notes About Document Recomposition
  Where to Go From Here
A TROUBLESHOOTING AND ERROR CORRECTION
  What To Do if You Have a Problem
  Error messages and possible solutions
  Installation Problems
    Basic Troubleshooting for Installation Problems
  Scan Problems
    Basic Troubleshooting for Scan Problems
    Problems using a TWAIN driver
    Assorted scanner driver problems
C APPLESCRIPT INTERFACE
  Writing TextBridge Scripts
    Record a script
    Sample scripts
  TextBridge Pro Objects
    application
    docFormatter
    imageSource
    Recognizer
PREFACE ScanSoft Inc. welcomes you to TextBridge® Pro 8.5 for Macintosh™. TextBridge Pro incorporates powerful optical character recognition (OCR) technology and an easy-to-use interface so you can quickly convert paper documents into fully-editable text files, complete with the original layouts. Files produced by TextBridge Pro are compatible with a variety of word processing, desktop publishing, data base, and spreadsheet applications.
Organization of this manual This manual is designed both as a training tool and a reference tool. It includes practical tips and techniques, troubleshooting and error correction, sample documents, and AppleScript information. It is organized as follows: ◆ Chapter 1, “Introduction,” discusses TextBridge Pro features and benefits, lists the supported scanners, lists the supported output text formats, and discusses the AppleGuide online Help system.
◆ Chapter 5, “Tutorials,” walks you through several practice sessions designed to provide a firm basis on which to learn and use the important features of TextBridge Pro. ◆ Chapter 6, “Tips and Techniques,” provides practical suggestions for getting the best performance from TextBridge Pro. ◆ Appendix A, “Troubleshooting and Error Correction,” lists the error messages that can be generated during TextBridge Pro operation and suggests ways for correcting the errors.
Documentation conventions As described in Table P–1, TextBridge Pro documentation uses certain graphical elements and formatting to emphasize information and denote meaning in text. Table P–1. Documentation Conventions bold Introduces a new term, or the first use of an important term in a chapter; also sometimes used to denote strong in-line emphasis. italic Denotes titles of other manuals or books.
OTHER READING MATERIAL TextBridge Pro provides a comprehensive set of documents designed to help you learn and operate the product. In addition to this User’s Guide, refer to the following documentation for more information: ◆ ReadMe: After you install TextBridge Pro, please read the online ReadMe document, which automatically appears in the TextBridge Pro Folder. Simply double-click the ReadMe icon to view important, up-to-date information that is not in the standard documentation set.
CUSTOMER SUPPORT If you should experience problems with TextBridge Pro that you cannot resolve, consult Appendix A, "Troubleshooting and Error Correction," for a list of error messages and ways to correct them. If you cannot resolve a problem on your own using the documentation and software, refer to the following Web site: www.scansoft.com The ScanSoft web site provides a link to TextBridge pages, including Frequently Asked Questions, and technical information bulletins.
1 INTRODUCTION Welcome to ScanSoft’s TextBridge® Professional Edition, the premier OCR software for Macintosh®. OCR stands for optical character recognition, the capability to recognize paper documents and output formatted, fully-editable data (text and graphics) to a word processor, spreadsheet, or web browser format. OCR can also recognize online page images from fax modems, scanners, and other sources.
TEXTBRIDGE PRO FEATURES AND BENEFITS Using ScanSoft’s latest document recognition technology, TextBridge Pro is the first and only OCR software that can produce a fully-editable electronic document that retains the original document layout, complete with text and pictures (Figure 1–2). [Figure 1–2. Panels: Original document; Recomposed document in word processor]
Productivity features unique to TextBridge Pro TextBridge Professional Edition is the first and only desktop document recognition software product to offer these major features: ◆ Instant Access OCR™. You can run TextBridge Pro from within virtually any Macintosh text application. It then automatically pastes recognized document data (text and pictures) directly into the host application’s open document. ◆ Dynamic Training.
◆ Image processing. TextBridge Pro provides the widest support of images from a variety of sources. Specifically, the program imports and recognizes on-line document images in TIFF and PICT formats originating from fax modems and other sources. ◆ Deferred processing. TextBridge Pro enables you to scan all pages of a document to TIFF or PICT image files, then later queue up the image files for document recognition. ◆ AppleScript interface.
◆ Zone Templates (re-usable). After you create a set of zones in the preview window, TextBridge Pro enables saving and reloading of these zone templates for subsequent jobs. ◆ Dynamic Training Data (re-usable). After you interactively train TextBridge Pro during OCR, you can save the training data. Later, you can reload this training file for documents of the same type to assure the highest recognition accuracy without having to repeat the training. ◆ Custom Dictionaries.
Documents TextBridge Pro can recognize TextBridge Pro includes a number of advances developed by ScanSoft and by the famed Xerox Palo Alto Research Center (PARC) where modern computer interfaces were born.
SUPPORTED TEXT FORMATS TextBridge Pro can convert recognized text to a number of word processing and other formats for both Macintosh and PC platforms:
◆ Ami Pro
◆ dBase
◆ DisplayWrite (DCA-RFT)
◆ Formatted ASCII
◆ FrameMaker
◆ HTML
◆ Interleaf (ILF)
◆ Lotus 1-2-3
◆ MacWrite 4.x, 5.0
◆ MacWrite II
◆ Microsoft Excel
◆ Microsoft Word (RTF)
◆ MultiMate
◆ PCL/PostScript
◆ WordPerfect 1.0
◆ WordPerfect 3.1
◆ WordPerfect DOS 5.
To maintain the “what-you-see-is-what-you-get” characteristics of the document, use a fixed-width font such as Courier. This format is most useful for documents that you do not intend to edit, or for tables and numeric data. TextBridge Pro also includes a markup format called XDOC. XDOC can be used for conversion to third-party formats.
ON-LINE HELP FOR TEXTBRIDGE PRO TextBridge Pro is designed to be easy to learn and use. However, if you need assistance, the program provides a complete Apple Guide on-line Help system as well as Balloon Help. While running TextBridge Pro, you can access the TextBridge Pro Guide by selecting it from the Help menu: On the TextBridge Pro Guide window, click Topics to display a list of general categories (Figure 1–3); click Index to see a list of keywords; click Look For to search for help. Figure 1–3.
Once a Guide window is displayed for a particular topic (Figure 1–4), you can do the following: ◆ Read the text or do the step described in the Guide window, then click the right arrow at the bottom of the window to go to the next step. (To see the previous window, click the left arrow.) ◆ You can move the TextBridge Pro Help window if it covers what you want to see. ◆ To shrink the Help window, click the box at the upper-right corner of the window. Click the box again to expand the window.
WHERE TO GO FROM HERE To install TextBridge Pro, go to Chapter 2. If you want to study TextBridge Pro in more detail, Chapter 3 provides a complete reference to the user interface including window areas, menus, commands, and tools. If you are ready to use TextBridge Pro, see Chapter 4, which provides step-by-step procedures to complete the many tasks you can perform with the program.
2 INSTALLATION This chapter describes the TextBridge Professional Edition software installation procedures. Specifically, it covers these topics: ◆ System configuration and performance ◆ Installing and testing your scanner ◆ Installing TextBridge Pro Software ◆ De-installing TextBridge Pro Software It is recommended that you read through the first two sections before proceeding with software installation. However, if you are ready to begin software installation, please turn to page 2–3.
If you regularly intend to scan multiple-column or landscape pages of text, pages with complex layouts, or large image files, you should configure your Macintosh with 12 to 16 MB of RAM. Note If you plan to run TextBridge Pro in Instant Access mode, you will need enough memory to run both TextBridge Pro and your word processor or spreadsheet application at the same time. INSTALLING AND TESTING YOUR SCANNER TextBridge Pro works with many popular desktop scanners.
Basic scanner installation steps The basic steps for installing a scanner are to: 1. Hook up the scanner to the SCSI port or USB port (for USB Macintoshes) with the correct cable, and power up the scanner and the Macintosh. Refer to your scanner documentation for complete instructions. 2. Install the scanner driver on your Macintosh hard disk, as directed by the scanner documentation. 3. Test the scanner using software tools provided by the manufacturer.
Run the TextBridge Pro Installer The TextBridge Pro Installer copies TextBridge Pro software to your hard disk, placing most files in the folder of your choice, and some selected files in the System Folder. Note The TextBridge Pro Installer will alert you to restart your Macintosh after completing an installation. To install TextBridge software, use the following procedure: 1.
Figure 2–1. The TextBridge Pro folder containing the TextBridge Pro files and the Installer icon 3. Double-click the TextBridge Pro Installer icon: The TextBridge Professional Splash Screen (Figure 2–2) displays. Click Continue to proceed with installation Figure 2–2.
4. Press Continue on the Splash Screen. The next screen to appear shows the online release notes (Figure 2–3). Figure 2–3. The Installer’s Display of Online Release Notes 5. Read, save, or print the release notes for the latest information, then press Continue. 6. Choose an installation option.
☞ Use Custom Install to save disk space, or if you have already installed TextBridge Pro 8.5 software and you want to add options such as language packs or scanner drivers. A full installation (Easy Install) requires approximately 20,400K of disk space. To use Easy Install, go to Step 7; for Custom Install, go to Step 8. 7. Perform an Easy Install. With Easy Install selected, click on Install as shown in Figure 2–4 below. Go directly to Step 9.
Figure 2–5. TextBridge Professional selection box (callout: Click on a box to add all related options to your hard drive) Figure 2–6. TextBridge Professional, and an information dialog box displayed (callouts: Click a disclosure triangle to display all related options; Click on a box to select that option; Click on "I" to display information about any option; Click "OK" to hide the information dialog box) 9. Specify the location and name of the folder where you want to install TextBridge Pro, then click Install.
When the installation is complete, the Installer displays a message asking you to restart your system (Figure 2–7). Figure 2–7. Installation Complete dialog box 10. Click Restart. 11. If you are using a scanner, go on to select a scanner driver. See the next section, “Select a scanner driver.” If you plan to use TextBridge Pro to process on-line images only, you can skip the next section and begin using TextBridge Pro.
2. Start TextBridge Pro. Double-click the TextBridge Pro icon: If your version of TextBridge Pro has built-in electronic registration, TextBridge Pro displays an introductory screen followed by registration information. Follow the onscreen instructions to register. After registering your software, the TextBridge Pro Main window will appear (Figure 2–8). Note Unless you register your software, there will be a reminder to register the first three times you start up TextBridge.
3. Display the Select Source dialog box. Choose Select Source from the Scanner menu. TextBridge Pro displays the Select Source dialog box (Figure 2–9). Identify the type of scanner driver you want to select Select the appropriate source, plug-in, or ISIS driver Click to complete selection Figure 2–9. Select Source dialog box 4. Select the type of scanner driver. If you have installed the selected type of driver correctly, it will appear in the list box below the scanner driver types.
☞ If you are using a TWAIN source to drive your scanner, you may also choose whether or not to display the TWAIN user interface when scanning from TextBridge Pro. In most cases it is best to display the interface; however, scanner settings will be grayed out in the TextBridge Main window. 6. Click OK to close the Select Source dialog box. If TextBridge is not able to find your scanner, restart your system with the scanner turned on, and try selecting the scanner again. 7. Begin using TextBridge Pro.
UN-INSTALLING TEXTBRIDGE PRO To restore your Macintosh to the state it was in before you installed TextBridge Pro, use the Uninstall option in the TextBridge Pro Installer. 1. Insert the TextBridge Pro CD-ROM into your CD-ROM drive. 2. Double-click the TextBridge Pro Installer icon: 3. Press Continue on the Splash screen (Figure 2–2) and Release Notes screen (Figure 2–3) to display the Installer Screen (refer to Figure 2–4). 4. Select Uninstall from the installation menu.
TextBridge Pro Folder ReadMe TextBridge Professional AppleGuide Help ReadMe Support Sample AppleScripts Sample Docs TextBridge® Pouch Language packs Custom dictionaries Zone templates Text conversions Training data System Folder Fonts Preferences Xerox fonts TWAIN Apple Menu Items Instant Access OCR TextBridge® Pro Preferences Source Manager Figure 2–10.
3 TEXTBRIDGE PRO TOOLS This chapter provides a complete reference to TextBridge Professional Edition. Specifically, the following topics are presented: ◆ Main window ◆ Toolbars ◆ Preferences ◆ Menus and commands MAIN WINDOW The control center for TextBridge Pro operation is the main window. With the exception of several dialog boxes, all preparation and document recognition activity takes place in the main window.
The features of the main window are listed below, and are described in the subsections that follow: ◆ main toolbar ◆ preferences panel ◆ view area Main toolbar The main toolbar, which appears directly beneath the title bar, allows you to quickly set up the type of process you want to complete, and to begin the process. It also displays TextBridge Pro status. During a job, the status area provides messages to update you about the various stages of processing.
TOOLBARS For quick and easy setup and operation, TextBridge Pro provides several toolbars. Requiring only a few mouse clicks, toolbars enable you to control the document recognition process almost completely from the main window. Two types of buttons reside on TextBridge Pro toolbars. Command buttons, when pressed, immediately perform an action. These buttons behave as if they are “spring-loaded.
The following subsections provide a closer look at the TextBridge Pro toolbars, specifically the: ◆ Main toolbar ◆ Preview toolbar ◆ Training toolbar Main toolbar The main toolbar (Figure 3–2) is central to all TextBridge Pro operations. You use it to define the image source (scanner or file), the mode of operation (states), and to start, continue, or cancel part or all of the process (commands).
Table 3–1. Main Toolbar Buttons (cont.) The Input From File button instructs TextBridge Pro to obtain page images from on-line image files. When you click the Go button, the Image Queue dialog box is displayed, and you can identify one or more image files to process. The Save Page Images – Defer OCR button enables you to scan all pages of a document to image files for later processing.
Table 3–1. Main Toolbar Buttons (cont.) The Cancel Page button cancels processing of, and discards data from, the current page. If there is a next page, TextBridge Pro continues processing. The Stop button cancels processing of the current job. If you have already processed at least one full page of a document, TextBridge Pro asks if you want to save the recognized data, or discard it, or continue processing. The Go button starts processing, and when you are working in preview mode, continues processing.
When TextBridge Pro acquires a page, it displays the page image in the view area of the main window, and adds the preview toolbar. Table 3–2 describes the preview toolbar buttons in more detail. Table 3–2. Preview Toolbar Buttons Press in the Zoom In button to change the mouse pointer to a zoom icon when you place it in the view window. Point to any area of the page image and click once to zoom in to this area. Keep clicking to continue zooming in.
Table 3–2. Preview Toolbar Buttons (cont.) Use the Create Text Zone button to change the mouse cursor to a cross-hair. Place the cross-hair at the corner of a text area you want to capture from the displayed page image, click and drag the mouse diagonally to create the text zone. Release the mouse when you are done. Use the Create Image Zone button to change the mouse cursor to the image zone cross-hair.
Training toolbar The training toolbar (Figure 3–4) appears in the main window when you start a job in interactive training mode. To access interactive training mode, you press in the Train OCR button (left) on the main toolbar, then start processing a document. Zoom Out This box shows suspect words Zoom In Press to accept the suspect word. Correct it first if necessary. Training Level Figure 3–4.
Table 3–3. Training Toolbar Buttons (cont.) Press the Zoom Out button to change the mouse pointer to the Zoom Out icon when you place it in the view window. Click once to zoom out. Keep clicking to continue zooming out. To zoom all the way out so that the entire page is visible, hold down the option key while clicking. Training Level options control the sensitivity of the training process, that is, how frequently suspect words will be displayed for your input. Some Words is the default.
PREFERENCES PANEL TextBridge Pro is designed so that you can process many documents with little or no setup. However, to get the best recognition for some documents, you can fine-tune TextBridge Pro by setting preferences. Some preferences, such as recognition language, you may rarely need to change. Other preferences, such as scanner brightness, you may need to adjust frequently from job to job.
MENUS AND COMMANDS The TextBridge Pro menu bar provides six pull-down menus that provide access to all the commands available for starting and completing an OCR job. This section provides information about the menus and the commands they hold. It covers the following topics: ◆ File menu ◆ Edit menu ◆ View menu ◆ Process menu ◆ Recognize menu ◆ Scanner menu File menu The File menu holds four commands.
Input From Scanner The Input From Scanner command is equivalent to the Input from Scanner button on the main toolbar. The Input From Scanner command, when it has a check mark next to it, instructs TextBridge Pro to use the attached scanner as the source of pages to be recognized. Input From File The Input From File command is equivalent to the Input from File button on the main toolbar.
Enter the base name for image files Click to begin scanning Figure 3–7. Save dialog box Each scanned image uses the base name plus a three-digit identifying number. For example: base001 base002 base003 . . . TextBridge Pro allows you to save page images in PICT or TIFF (Uncompressed, CCITT Group 3, CCITT Group 4, or Packbits).
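The naming scheme described above (base name plus a three-digit identifying number) can be sketched in a few lines of Python. This is only an illustration of the pattern, not TextBridge Pro code: the function name `page_image_names` and its parameters are hypothetical, and the manual does not say what happens beyond page 999 or which file extension, if any, is appended.

```python
def page_image_names(base: str, page_count: int, start: int = 1) -> list[str]:
    """Build image file names: the base name plus a three-digit page number.

    Hypothetical helper for illustration only; it mirrors the
    base001, base002, base003 pattern shown in the manual.
    """
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    # {n:03d} pads the page number with leading zeros to three digits.
    return [f"{base}{n:03d}" for n in range(start, start + page_count)]


if __name__ == "__main__":
    for name in page_image_names("base", 3):
        print(name)
```

For a three-page document with the base name `base`, the sketch yields `base001`, `base002`, and `base003`, matching the example in the text.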
Quit The Quit command quits TextBridge Pro. If you have processed at least one page of a document when you select the Quit command, TextBridge Pro will display a dialog box asking if you want to end the document, discard it, or continue processing (Figure 3–8). Figure 3–8. Discard, End, or Continue dialog box Edit menu The Edit menu provides eight tools that are useful when you are working in preview mode or entering text in a dialog box.
Undo The Undo command performs a variety of undo tasks, depending on which stage of the job you are in. For example, if you are previewing a document, and you have moved a zone, the Undo command changes to Undo Edit Zone. Cut The Cut command is active only when you are editing a text string. This command deletes the current selection and stores it in the Clipboard. Copy The Copy command enables you to copy text from a text box onto the Clipboard. The Copy command is dimmed unless you are editing text.
Select All The Select All command is active only when you are editing a text string. This command selects all text in the active text box. Clear All Zones The Clear All Zones command is active only when you are in preview mode and at least one zone has been defined. Clear All Zones deletes all defined zones. Move To Front The Move To Front command is active only when you are in preview mode, and you have a zone selected. This command moves the selected zone in front of all other zones in the view area.
View menu The View menu holds four commands that control the page image in the view area. View commands are available to zoom the view area in both preview and interactive training modes. The Invert and Deskew commands are only available in preview mode. The View commands, listed below, are described in more detail in the following subsections.
Invert may be most useful for processing documents received from a fax modem or TWAIN source. These types of documents sometimes have white text on a black background. Such documents must be inverted before TextBridge can perform OCR. Deskew The Deskew command is active only when TextBridge Pro is in preview mode. This command straightens the current page image if it is incorrectly aligned. Deskew is only available once per page.
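To make the effect of the Invert command concrete, the following Python sketch flips a small bilevel page image so that white-on-black text becomes black-on-white. It is a minimal illustration under stated assumptions: the page is modeled as a list of rows with 0 for white and 1 for black, and the `looks_inverted` check is an invented heuristic. In TextBridge Pro you decide when to apply Invert yourself, and the manual does not describe how the program performs the operation internally.

```python
def invert_bitmap(rows):
    """Flip every pixel of a bilevel image (assumed 0 = white, 1 = black)."""
    return [[1 - pixel for pixel in row] for row in rows]


def looks_inverted(rows):
    """Illustrative heuristic (an assumption, not from the manual):
    a page that is more than half black is probably white text on black."""
    total = sum(len(row) for row in rows)
    black = sum(sum(row) for row in rows)
    return total > 0 and black > total / 2


if __name__ == "__main__":
    # A tiny "fax" page: a white stroke on a black background.
    page = [
        [1, 1, 1, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
    ]
    if looks_inverted(page):
        page = invert_bitmap(page)
    for row in page:
        print(row)
```

Applying `invert_bitmap` twice returns the original image, which is why Invert can be used safely to try both polarities before recognition.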
Process menu The Process menu contains five commands that enable you to turn on and off preview and interactive training modes and start and stop a job. The following subsections describe the commands in the Process menu, namely: ◆ Preview ◆ Train OCR ◆ Cancel Page ◆ Stop ◆ Go/Continue Preview The Preview command, when selected, places TextBridge Pro in preview mode. It is the same as pressing the Preview button on the main toolbar.
You can select the Train OCR command at the beginning of, or during, a job. The first (or next) page to be processed is displayed in the view area of the main window. TextBridge Pro displays the training toolbar with the first suspect word in the Word text box, then waits for your input. This enables you to interact with the OCR process to achieve the highest level of recognition accuracy, and to have TextBridge Pro learn from your input.
Go/Continue The Go command and the Go button in the main toolbar are equivalent. The Go command starts the TextBridge Pro process— either scanning a page or reading from an on-line image file. If OCR is already in progress, the Go command is dimmed. In preview mode, the Go command becomes the Continue command. So, after you view, zoom, and zone the page in preview, you can select Continue to start recognition of the page.
Input Layout The Input Layout submenu displays settings that inform TextBridge Pro about the column layout of, and whether there are pictures in, the original document. This submenu is equivalent to the Input Layout pop-up menu on the main window. Refer to Chapter 6 for more information about the Input Layout settings and when to use them. Output Layout The Output Layout submenu displays settings that tell TextBridge Pro how to compose the output document in your word processor.
Page Orientation The Page Orientation submenu provides settings that tell TextBridge Pro about the orientation of the page, or allow TextBridge Pro to determine the orientation automatically. This submenu is equivalent to the Page Orientation pop-up menu on the main window. Refer to Chapter 6 for more information about the Page Orientation settings and when to use them. Recognition Language The Recognition Language submenu lists the available TextBridge Pro language packs.
Zone Template The Zone Template submenu lists the sets of zones that you previously created and saved in a template file in the TextBridge® Pouch. This submenu is equivalent to the Zone Template pop-up menu on the main window. It is active only when TextBridge Pro is in ready mode (a job is not in progress), or in preview mode when a static page image is displayed in the view area. When you do not want to use a zone template, be sure to select “None.
Save Zone Template The Save Zone Template command is active only when you are in preview mode, and you have created at least one zone on the page image in the view area. When you select the command, it displays the Save Zone Template dialog box (Figure 3–9). Specify the name of the new template file Click to save Figure 3–9. Save Zone Template dialog box Here, you can save the currently displayed zone set in a template file.
Save Training Data The Save Training Data command enables you to save training data. This command displays a dialog box to let you save the training data to a named file (Figure 3–10). Specify the name of the new training file Click to save Figure 3–10. Save Training Data dialog box The Save Training Data command is active only when you are in preview mode and you have accepted or corrected any suspect words while training on an earlier page.
Scanner menu The Scanner menu provides five commands that let you fine-tune the scanning and document recognition process. From the Scanner menu, you can select a scanner and define the full set of scanner preferences available in TextBridge Pro.
Identify the type of scanner driver you want to select Select the appropriate source, plug-in, or ISIS driver Click to complete selection Figure 3–11. Select Source dialog box The selections are TWAIN, Adobe Photoshop Import Plug-in, or Chooser extension (ISIS) driver. Select the appropriate type, then select the appropriate driver from the list. If using TWAIN, you can also choose whether or not to display the TWAIN user interface. Then click OK to complete the process.
Page Size The Page Size submenu lists the available page sizes for your selected scanner. This submenu is equivalent to the Page Size pop-up menu on the main window. See Chapter 6 for more information about the Page Size setting. Resolution The Resolution submenu enables you to set the appropriate resolution for your scanner. This submenu is equivalent to the Resolution pop-up menu on the main window. Refer to Chapter 6 for more information about the Resolution setting.
WHERE TO GO FROM HERE With an understanding of TextBridge Pro programs and tools provided by this chapter, you are ready to use the application for your own documents. Chapter 4, “Using TextBridge Pro”, provides step-by-step procedures for the many tasks you can perform with the program. Chapter 5, “Tutorials”, provides step-by-step practice sessions to introduce you to some of the most important capabilities of TextBridge Pro.
4 USING TEXTBRIDGE PRO The previous chapters have been introductory or reference in nature.
When you change any of the defaults, the new preferences become the defaults until you change them again. TextBridge Pro assumes that these are your preferred settings. Two types of preferences are provided in TextBridge Pro—job preferences and scanner preferences. For your convenience, preferences appear on the preferences panel (Figure 4–1) in the main window. ☞ Initially, only the four most commonly used controls are displayed.
Setting job preferences For TextBridge Pro, the job preferences shown in Figure 4–1 help to define the features of a specific document and how you want it to be processed. Table 4–1 describes job preferences and how to use them. Table 4–1. Job Preferences Input Layout These settings inform TextBridge Pro about the column layout of the original document, and whether it contains pictures. Select Text: One column for simple one-column documents without pictures, cell tables, or spreadsheets.
Table 4–1. Job Preferences (cont.) Output Layout These settings tell TextBridge Pro how to compose the output document in your word processor format. Note that the capability of TextBridge Pro to reconstruct the original document layout is limited to the capabilities of your word processor or text application. Select Text: One column if you want the text of the document in simple, editable form in your word processor or other text application.
Table 4–1. Job Preferences (cont.) Original Quality Select Normal if the original documents are good quality. Select Fax if the page images are from fax modems, scanned hard-copy faxes, or any document scanned at 200 dots per inch or lower resolution. Select Dot Matrix if the documents are printed on a draft-quality dot-matrix printer. Characters from these printers are made up of disconnected dots, and could otherwise be difficult for an OCR program.
Table 4–1. Job Preferences (cont.) Page Orientation Click Portrait for most typical portrait-oriented office documents. Click Landscape for landscape documents that you would typically scan in sideways. TextBridge Pro rotates these pages in memory by 90-degrees before beginning recognition. Click Automatic to have TextBridge Pro automatically determine the orientation of the page before sending it to OCR. This option is useful if your document contains a mixture of page orientations.
Table 4–1. Job Preferences (cont.) Custom Dictionary This pop-up menu provides a list of the custom dictionaries available in the TextBridge® Pouch. A custom dictionary is a plain text (ASCII) file that you create by entering words that would not likely be found in a standard dictionary. Such words can be proper names, professional or technical terms, acronyms, and so on. Before you begin a job, you can load a custom dictionary to improve recognition of a particular document.
Table 4–1. Job Preferences (cont.) Zone Template This pop-up menu lists the zone templates available in the TextBridge® Pouch. It is active only when TextBridge Pro is in ready mode (a job is not in progress), or in preview mode when a static page image is displayed in the view area. Before you begin a job, you can load a set of zones previously created for a document with a similar layout. Make certain that the zone template is appropriate for the current document.
Setting scanning preferences Scanner preferences control your scanner and the images that it provides to TextBridge Pro for recognition. ☞ Scanner capabilities vary, thus some preferences may not be available for your scanner. If you choose to display the TWAIN user interface, or are using an Adobe Photoshop Import Plug-in, which always displays a scanning user interface, the scanner settings options on the TextBridge Pro main window will be grayed out.
Table 4–2. Scanner Preferences (cont.) Page Size This setting lets you control the size of the area the scanner will scan. Specify the smallest size that accommodates the size of your original pages: ◆ US Letter (8.5-by-11 inches or 21.59-by-27.94 centimeters) ◆ Legal (8.5-by-14 inches or 21.59-by-35.56 centimeters) ◆ A4 (8.27-by-11.69 inches or 21-by-29.
Table 4–2. Scanner Preferences (cont.) Sheet Feeder If your scanner has a sheet feeder, click this option on to scan pages from the sheet feeder. Click this option off if you want to scan from the flatbed. The Sheet Feeder option controls whether TextBridge Pro will automatically pull pages from the sheet feeder. Some scanners sense a page in the sheet feeder and will scan from there even if the sheet feeder option is off.
SCANNING AND CONVERTING A DOCUMENT One of the tasks that TextBridge Pro performs is scanning a hard copy document to an on-line text file. The document can comprise one page or many pages, and can be single- or double-sided.
3. On the main toolbar, identify your scanner as the input source by depressing the Input From Scanner button. Now click the Go button. The Save dialog box is displayed (Figure 4–2).
Figure 4–2. Save dialog box (callouts: type a new name or accept the default name; select the output format; click Continue to save)
4. Specify the name, location, and format of the text output file and click Continue.
☞ If you are driving your scanner with a TWAIN source (displaying the TWAIN user interface), or with an Adobe Photoshop Import Plug-in, the TWAIN or Plug-in user interface will appear, where you can change scanner settings and direct the scanner to scan. For best OCR results, select lineart and a resolution of 200, 300, or 400 dpi from the scanner manufacturer’s interface. When scanning is completed, TextBridge Pro displays the Add More Pages dialog box (Figure 4–3). Figure 4–3.
Scanning a double-sided document Many multiple-page documents are printed double-sided; that is, both the front (odd-numbered) and reverse (even-numbered) sides of pages contain print. If your scanner has a sheet feeder, you can use TextBridge Pro’s powerful auto-collation feature to scan double-sided documents. This feature enables you to process the front sides of pages first, then turn the stack over in the sheet feeder, and process the reverse sides.
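The idea behind auto-collation can be pictured with a short sketch. It is not TextBridge Pro's actual implementation; it only illustrates how front-side and reverse-side scans could be merged into reading order. The function name `collate_double_sided` is hypothetical, and the sketch assumes that turning the stack over makes the reverse sides arrive last sheet first.

```python
def collate_double_sided(fronts, backs):
    """Interleave front-side and reverse-side scans into reading order.

    fronts: front sides in the order scanned (pages 1, 3, 5, ...).
    backs:  reverse sides in the order scanned after the stack is turned
            over. Assumption: they arrive last sheet first (..., 6, 4, 2).
    """
    if len(fronts) != len(backs):
        raise ValueError("each sheet needs one front scan and one reverse scan")
    collated = []
    # Walk the reverse sides backwards so each front meets its own back.
    for front, back in zip(fronts, reversed(backs)):
        collated.append(front)
        collated.append(back)
    return collated


if __name__ == "__main__":
    fronts = ["page1", "page3", "page5"]
    backs = ["page6", "page4", "page2"]
    print(collate_double_sided(fronts, backs))
```

If your sheet feeder delivers the reverse sides in a different order, the `reversed` call is the only part of the sketch that would change.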
4. Specify the name, location, and format of the text output file and click Continue. TextBridge Pro automatically scans and processes the pages that you loaded into the scanner. ☞ If you are driving your scanner with a TWAIN source (displaying the TWAIN user interface), or with an Adobe Photoshop Import Plug-in, the TWAIN or Plug-in user interface will appear, where you can change scanner settings and direct the scanner to scan.
However, because recognition can be time-consuming, TextBridge Pro enables you to perform the two stages of document recognition separately. That is, you can scan all the pages of the document without OCR taking place. Then, later, you can queue up the page images of the document for OCR, and go home or perform other tasks while OCR is taking place. This is referred to as deferred processing.
4. On the main toolbar, depress the Save Page Image – Defer OCR button.
5. Click the Go button on the main toolbar. TextBridge Pro now displays the Save dialog box (Figure 4–4).
Figure 4–4. Save dialog box to save the image file (callouts: click to begin scanning; enter the base name for image files)
6. Define the base name, location, and format of the page image files to be saved. Each scanned image uses the base name plus a three-digit identifying number.
☞ Click the New folder button to create a document folder where you can save all the page images for the document. Later, when you want to OCR these pages, simply highlight the folder in the Image Queue dialog box to add the pages to the queue in alphanumeric or alphabetical order. TextBridge Pro allows you to save page images in PICT or TIFF (Uncompressed, CCITT Group 3, CCITT Group 4, or Packbits). 7. When you have specified the page image information in the Save dialog box, click Continue.
10. Proceed from Step 8 to continue or end the job. 11. To end the job, click End in the Add More Pages dialog box. At any time, you can go on to queue up the saved page images for document recognition. For information, refer to the next section, “Recognizing and Converting Image Files.” RECOGNIZING AND CONVERTING IMAGE FILES The second phase of deferred processing is to run document recognition on saved page image files.
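The naming scheme used during deferred scanning (base name plus a three-digit number) is what lets a folder of page images queue up in page order. The sketch below illustrates this under stated assumptions: the helper `page_image_name` is hypothetical, and the exact layout of the saved name (zero padding, no separator) is assumed rather than taken from TextBridge Pro.

```python
def page_image_name(base_name, page_number):
    """Build an image file name from a base name and a three-digit number.

    Assumption: the number is zero-padded and appended directly to the
    base name. The real layout of the saved name may differ.
    """
    if not 0 <= page_number <= 999:
        raise ValueError("page number must fit in three digits")
    return f"{base_name}{page_number:03d}"


if __name__ == "__main__":
    names = [page_image_name("report", n) for n in range(1, 13)]
    # With three-digit padding, alphanumeric order matches page order.
    print(sorted(names) == names)
    # Without padding, page 10 would sort ahead of page 2.
    print(sorted(["report2", "report10"]))
```

This is why the numbers in the file names are a reliable guide when you add files to the queue one at a time.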
To queue up and process on-line page images, use the following procedure: 1. On the TextBridge Pro main toolbar, depress the Input From File button. ☞ Make sure that the Save Page Image – Defer OCR button is no longer depressed. 2. Prepare the job. For complete information, refer to “Preparing the Job” earlier in this chapter. In particular, three job preferences can be important for processing on-line image files.
Figure 4–5. Image Queue dialog box (callouts: double-click a file on the list, or highlight a file and click Add; after you select the files, click Proceed; files you have added are listed here)
4. In the Image Queue dialog box, select the image files in the order in which you want them to be processed. In the area at the top of the Image Queue dialog box, select each image file you want to recognize, and click Add (or just double-click the file).
Order the image files in the queue using the numbers in the names as a guide. Note Files are processed in the order in which you add them to the queue. Unless you add a folder of files to the queue, files are not automatically added in alphanumeric or alphabetical order. 5. After you queue the image files in the correct processing order, click Proceed in the Image Queue dialog box. TextBridge Pro displays the Save dialog box (Figure 4–2). 6.
Note The following procedure assumes that if you are using a scanner, it is properly connected to your Macintosh, powered on and ready, and that the TextBridge Pro main application is active. 1. If you are scanning, load the page(s) into your scanner, then go to Step 2. Otherwise, start at Step 2. 2. On the TextBridge Pro main toolbar, define the image source by clicking either the Input From Scanner, or the Input From File button. Also, depress the Preview button. 3. Prepare the job.
TextBridge Pro acquires the page and displays the image in the view area of the main window. It also adds the Preview toolbar to the main window (Figure 4–6).
Figure 4–6. Main window in preview mode (callouts: Preview toolbar is added; page image is displayed)
Scroll bars in the view area let you shift the display horizontally and vertically.
6. Zoom the page if desired.
7. Create and edit text, image, and ignore zones, as appropriate. To create a text zone, depress the Text Zone button on the preview toolbar. To create an image zone, depress the Image Zone button on the preview toolbar. To create an ignore zone, depress the Ignore Zone button on the preview toolbar. Move the mouse pointer into the view area, and point to a corner of the area to be zoned. Click and hold the mouse button, and drag the mouse diagonally and downward.
To move the selected zone, click and hold on a border of the zone and drag the mouse. To delete the selected zone, pull down the Edit menu and choose the Clear command (or simply press the Delete key). To delete all zones, pull down the Edit menu, and choose the Clear All Zones command. To change the front-to-back order of the selected zone, pull down the Edit menu, and choose the Move to Front or Move to Back command, as appropriate.
TRAINING TEXTBRIDGE PRO DURING RECOGNITION TextBridge Pro enables you to interact with the OCR process to accept or correct its recognition decisions. This is referred to as interactive training mode. During this process, TextBridge Pro compiles information about the character shapes, styles, and sizes found in the document being recognized. With your help, the program continually fine-tunes this training data to improve recognition for the second and later pages of a document.
☞ If you are driving your scanner with a TWAIN source (displaying the TWAIN user interface), or with an Adobe Photoshop Import Plug-in, the TWAIN or Plug-in user interface will appear, where you can change scanner settings and direct the scanner to scan. For best OCR results, select lineart and a resolution of 200, 300, or 400 dpi from the scanner manufacturer’s interface.
6. If necessary, correct the suspect word in the Word text box. When the word is correct, click the Accept button. TextBridge Pro continues OCR, then displays the next suspect word in the Word text box. 7. Repeat Step 6 until you have trained TextBridge Pro on enough words. Usually, interactive training on one page of a multiple-page document is enough to train TextBridge Pro on the current document.
Sometimes, TextBridge Pro will land on a non-word (a mark on the page, a horizontal line, other noise). The text box may contain some characters, while the image area shows the non-word highlighted. In these cases, delete all the text in the Word text box, if any, then click Accept. TextBridge Pro will ignore the noise, and proceed to the next questionable word. The image in the view area is zoomed in to approximately the middle of the zoom range.
☞ You can save training data before the end of the job if you trained OCR on one or more words on an earlier page in the document. While TextBridge Pro is paused in Preview mode, pull down the Recognize menu and select the Save Training Data command. When you are done, you can go on to use the recognized text by editing the output file in your word processor or other text application.
To run TextBridge Pro from, and import recognized text directly into, the host application’s open document, use the following procedure: 1. Start the host application, and open a document into which you can import the recognized data. Set the insertion point where you want TextBridge Pro to paste text in the document. 2. Pull down the Apple menu and select the Instant Access OCR command. In a few moments, the TextBridge Pro main window appears in Instant Access mode (Figure 4–9).
Notes Although Save Page Image – Defer OCR is turned off, you can still select this option. If you do so, TextBridge Pro leaves Instant Access mode; when you start processing, TextBridge Pro displays the Save dialog box, where you must specify a filename and destination to continue processing. During normal processing, the Stop button is only enabled after you click Go to begin processing. In Instant Access mode, the Stop button is always enabled.
WHERE TO GO FROM HERE With the procedures in this chapter, you can run virtually all the capabilities of TextBridge Pro. For more advanced information, see Chapter 6, “Tips and Techniques.” That chapter takes a closer look at ways to get the highest recognition accuracy and the most efficient performance from TextBridge Pro.
5 TUTORIALS This chapter provides step-by-step tutorials designed to introduce you to some of the most important capabilities of TextBridge Professional Edition.
Figure 5–1. TextBridge Pro main window (callouts: tool area; preferences view bar; view area for page images)
The main window follows Macintosh standards. The upper right corner provides the standard zoom box, which allows you to toggle the window size. Below the TextBridge Professional Edition title bar is the main window, which provides a main toolbar and a preferences panel. These tools let you set up, start, and control the document recognition process. ☞ Initially, only one row of preference pop-up menus is shown.
Figure 5–2. Preview and training toolbars (callouts: Preview toolbar; Training toolbar)
Below the toolbar area, the largest area of the main window is the view area. Here, depending on the processing stage (preview, training), different views of the page image appear. At the upper left of the main window, the status area displays messages that indicate the status of the job. Except for an occasional dialog box, all program activities take place in the main window.
To identify an on-line sample document for TextBridge Pro to use, follow these steps:
1. On the TextBridge Pro main toolbar, push in the Input From File button.
2. Click the Go button. TextBridge Pro now displays the Image Queue dialog box (Figure 5–3).
Figure 5–3. Image Queue dialog box (callouts: double-click a file on the list, or highlight a file and click Add; after you select the files, click Proceed; files you have added are listed here)
3.
4. Click a file name to select it. 5. Click the Add button to add the selected file to the list on the lower left side of the Image Queue dialog box. 6. Click Proceed to begin the TextBridge Pro process. Please now proceed to the tutorial sessions to work with TextBridge Pro and familiarize yourself with its capabilities. TUTORIAL SESSION 1: AUTOMATIC OPERATION TextBridge Pro provides a range of features designed to be very easy to use.
Figure 5–4. Save dialog box (callouts: type a new name or accept the default name; select the output format; click Continue to save)
4. In the Save dialog box, define the output file. In the Save Output As text box, type a file name. In the Text pop-up menu, select the output format for your word processor, spreadsheet, or web browser application. Click Continue to start processing. TextBridge Pro reads the on-line image, which appears in the view area of the main window, and automatically performs OCR on it.
5. Open the file with your word processor or other application. Compare the recognized document in your word processor with the picture of the sample document, markplan, in Appendix B of this document. With a word processor such as Word or WordPerfect, the recognized document should look virtually identical to the TIFF image. The difference is that you now have formatted, fully editable text.
3. Click the Go button on the main toolbar. 4. In the Image Queue dialog box, locate and select the sample document, zonepic, then click Proceed. TextBridge Pro displays the Save dialog box (Figure 5–4). 5. Define the output text file, then click Continue in the Save dialog box. TextBridge Pro reads the on-line image and in a few moments, displays it in the view area of the main window (Figure 5–5). Preview toolbar is added Page image is displayed Figure 5–5.
6. Zoom out on the page. Select the Zoom Out tool: Position the mouse inside the view area at the upper left corner of the page image, and click once to display more of this page area. 7. Create a text zone. Select the Text Zone tool: Position the mouse inside the view area at the upper left corner of the page image. Holding down the mouse button, drag the mouse diagonally downward until the text zone rectangle outlines a block of text to be recognized (Figure 5–6). Release the mouse button. 8.
Text zone identifies area to be recognized Figure 5–6. Text zone on previewed page 9. Now zoom in on the page. Select the Zoom In tool: Click once on the line art at the bottom right of the page image to magnify the area. 10. Create an image zone. Click the Image Zone tool: Position the mouse at the upper left of the line art on the previewed page.
Holding down the mouse button, drag the mouse diagonally downward until the image zone rectangle outlines the line art (Figure 5–7). Release the mouse button. Image zone identifies graphic to be captured Figure 5–7. Image zone on the previewed page 11. Click the Go button again to process the zoned text and image: When processing is complete, TextBridge Pro converts and saves the recognized data and returns to Ready status. 12. Open the file with your word processor or other text application.
TUTORIAL SESSION 3: INTERACTIVE TRAINING To ensure the highest possible accuracy, TextBridge Pro provides an interactive training capability. This feature enables you to participate in the OCR process, verifying correctly recognized words and correcting recognition errors. Interactive training is especially effective for degraded documents such as faxes and multi-generation photocopies.
4. In the Image Queue dialog box, locate and select the sample document, plexis, then click Proceed. TextBridge Pro now displays the Save dialog box (Figure 5–4). 5. Define the output text file, then click Continue in the Save dialog box. After beginning recognition, TextBridge Pro adds the training toolbar to the main window. When it finds the first suspect word, it displays the suspect word in the Word text box. The image of the word is highlighted immediately below in the view area (Figure 5–8).
6. Go on to train TextBridge Pro on the suspect words. If the word is correct, simply press the Enter key on your keyboard or click the Accept button on the Training toolbar: If the suspect word is incorrect, correct it in the text box, then accept it. Continue correcting and/or accepting words at your option. You can complete the entire page or only a portion.
8. Open the file with your word processor. Notice that even though the input document was a low-quality fax image, TextBridge Pro recognized it with a high degree of character recognition and formatting accuracy. TUTORIAL SESSION 4: INSTANT ACCESS OCR TextBridge Professional Edition is the only high-end document recognition program that can be launched from virtually any Macintosh text application and can automatically paste recognized data directly into that text application’s open document.
Indicates Instant-Access mode Figure 5–10. Main window in Instant Access mode 3. On the main toolbar, select the Input from File button. If the Training button is still selected from the last tutorial session, you should de-select it. 4. Click the Go button on the main toolbar. 5. In the Image Queue dialog box, locate and select the sample document resume, then click Proceed. TextBridge Pro reads the on-line image and automatically performs OCR on it.
6. You can now go on to edit or otherwise use the recognized text in your text application. 7. To quit TextBridge Pro, select the Quit command from the File menu. TUTORIAL SESSION 5: DOCUMENT RECOMPOSITION TextBridge Pro is the first OCR application capable of recomposing the layout of a document, including text and graphics, while maintaining full editability in the output file. Consider, for example, a three-column newsletter that includes a picture.
3. Click the Go button on the main toolbar. 4. In the Image Queue dialog box, locate and select the sample document 3col; then click Proceed. TextBridge Pro displays the Save dialog box (Figure 5–4). 5. Define the output text file, then click Continue in the Save dialog box. TextBridge Pro reads the on-line image, automatically performs OCR on it, and converts the recognized text. When it is finished, it returns to Ready status. 6. Open the file with your word processor.
TextBridge Pro provides two recomposition modes. In one (Recompose Text) only the text is recomposed and the images are discarded. This is useful, for example, when your document contains cell tables and you want them output as cell tables. Word and WordPerfect support tables (data in rows and columns, which are referred to as cell tables).
WHERE TO GO FROM HERE The tutorial sessions in this chapter were designed to give you a solid basis on which to use TextBridge Pro for your own documents. For complete information about TextBridge Pro, please refer to the User’s Guide or to the on-line TextBridge Pro Guide available from the Help menu.
6 TIPS AND TECHNIQUES This chapter describes how to maximize TextBridge Pro’s document recognition results. Specifically, this chapter covers the following topics: ◆ Getting the best document recognition ◆ Tips for efficient processing GETTING THE BEST DOCUMENT RECOGNITION TextBridge Pro achieves a consistently high level of character recognition accuracy over a wide range of documents.
Use and maintain your scanner properly How you use and maintain your scanner can make a difference in the quality of document recognition. Follow these tips: ◆ Know your scanner. Read and understand all documentation that came with your scanner. ◆ Maintain the scanner. Keep your scanner clean and dust-free. Keep your scanner’s glass platen (flatbed) free of dirt or marks that might be captured during scanning. ◆ Load the scanner correctly. Make sure your document is not scanned at an angle.
Adjust scanner brightness During scanning, one of the most important settings affecting successful character recognition is scanner brightness. As Figure 6–2 illustrates, the original documents you scan may vary considerably.
Figure 6–2 callouts:
◆ Smudged type: increase brightness
◆ Type on dark background: increase brightness
◆ Lighter, thinner type: decrease brightness
Figure 6–2.
Try Lighter Image if characters on your page appear too bold, are starting to fill in or are touching, or words are separated by very small spaces (as in some magazines). Recognition of documents with background noise, or with screened or colored backgrounds, can improve considerably by increasing the brightness setting. Try Darker Image if characters on the page appear faint, broken, or very thin.
Note TextBridge Pro supports many popular desktop scanners. Each device has a software driver that enables TextBridge Pro to run with it. Because scanner drivers differ from one another, TextBridge Pro’s brightness settings can be reversed. So, selecting Lighter for your scanner can actually produce a darker image; conversely, selecting Darker can produce a lighter image.
Adjust for colors All scanners have one or more colors that they do not read. These are called drop-out colors.
Low-resolution fax images can be difficult to recognize Figure 6–5. Fax image To recognize fax images, TextBridge Pro provides a Fax setting on the Preferences Panel and in the Recognize menu. This document quality filter initiates a pre-processing step that enhances the fax image before OCR begins. The Fax switch works on fax images stored in image files and on scanned hard copy faxes. TextBridge Pro automatically uses the fax filter when the image is less than 225 dpi.
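The rule in the preceding paragraph can be stated as a small decision function. This is only a sketch of the stated behavior, not TextBridge Pro's internal logic; the function name `uses_fax_filter` and the string values for the quality setting are assumptions made for illustration. The 225 dpi threshold comes from the text.

```python
FAX_FILTER_DPI_THRESHOLD = 225  # from the text: filter is used below 225 dpi


def uses_fax_filter(resolution_dpi, original_quality="Normal"):
    """Return True if the fax pre-processing filter would apply.

    The filter applies when the Fax setting is selected, or automatically
    when the image resolution is less than 225 dpi.
    """
    return original_quality == "Fax" or resolution_dpi < FAX_FILTER_DPI_THRESHOLD


if __name__ == "__main__":
    for dpi in (200, 225, 300):
        print(dpi, uses_fax_filter(dpi))
    print(300, "Fax", uses_fax_filter(300, "Fax"))
```

In other words, a 200 dpi fax image is filtered even if you leave the setting at Normal, while a 300 dpi scan of a hard-copy fax is filtered only if you select Fax yourself.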
However, you can compromise this learning capability by processing pages of different documents to the same output file. TextBridge Pro expects the second and successive pages of a document to use the same fonts it recognized on the first page. If the second page is a totally different document, with different typefaces and point sizes, the knowledge that TextBridge Pro gained for the first page becomes invalid. TextBridge Pro must begin the learning process over again for the second (and successive) pages.
Create a custom dictionary For each language that it supports, TextBridge Pro has a system dictionary that contains approximately 50,000 common words for that language. This helps TextBridge Pro with character recognition, as the program considers each character in the context of a word in the system dictionary. For a job that contains many special terms (for example, a legal document, technical manual, and so on), you may want to create a custom dictionary.
As you type in your list of words, make sure of the following: ◆ Words consist of characters from the International Organization for Standardization (ISO) character set, excluding the space character. ◆ Each word is entered in lower case, except for words that are typically capitalized or in all capital letters. When a word begins with a capital letter, TextBridge Pro tends to assume that it is a proper noun. ◆ Each word contains one or more characters. ◆ The entire file does not exceed 10,000 words.
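Before you load a long word list, you may want to check it against the rules above. The following sketch shows one way to do that outside TextBridge Pro. The function `check_custom_dictionary` is hypothetical, and the sketch assumes the file holds one word per line, which the rules above do not state. It checks only the space, empty-entry, and word-count rules; it does not verify the character set.

```python
MAX_WORDS = 10_000  # from the text: the file must not exceed 10,000 words


def check_custom_dictionary(words):
    """Return a list of problems found in a list of candidate entries.

    Assumption: each entry corresponds to one line of the dictionary file.
    """
    problems = []
    if len(words) > MAX_WORDS:
        problems.append(f"file has {len(words)} words; the limit is {MAX_WORDS}")
    for line_number, word in enumerate(words, start=1):
        if not word:
            problems.append(f"line {line_number}: empty entry")
        elif any(ch.isspace() for ch in word):
            problems.append(f"line {line_number}: {word!r} contains a space")
    return problems


if __name__ == "__main__":
    sample = ["halftone", "ScanSoft", "line art", ""]
    for problem in check_custom_dictionary(sample):
        print(problem)
```

To check a real file, you could pass the result of `open(path).read().splitlines()` to the function, where `path` is the location of your word list.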
Train TextBridge Pro for the highest accuracy
Figure 6–7. Interactive training
To attain the highest level of accuracy, use the Training Level pop-up menu on the Training toolbar. Select Most Words to train on virtually every suspect word. Generally, with a multiple-page document, train the program on one or two pages. End training by clicking the Train OCR icon on the main toolbar so it is no longer pressed in.
Use Smart Zones™ In general, TextBridge Pro does an excellent job of analyzing page images and separating the text areas from the graphic areas. This is particularly true when the graphics on a page image are halftones. Halftones are photographs and other artwork reproduced through screens. These screens render the image in dots of various sizes, so the image can be printed. With line art, however, TextBridge Pro has more difficulty.
TIPS FOR EFFICIENT PROCESSING When you first use TextBridge Pro, you may find it easiest to process documents without adjusting the default preferences. In many cases, using default preferences provides good results. However, if you want to get the best performance from TextBridge Pro, fine-tuning preferences and using other efficiency features are recommended.
For every document, you can specify one of three general types of input layout for the input document to be processed. You can also select one of four output layout modes for the output document. You can define a specific document quality and page orientation as well. Note These four pop-up menus represent the most commonly used and modified preferences in TextBridge Pro.
Output layout The Output Layout settings control the look of the output document. ◆ The Text: One column setting, the default, outputs text only, in straight galley (single-column) format. This setting works with all output formats available in the Save dialog box, and requires the least amount of processing time. ◆ The Text and pictures: One column setting outputs text in galley format.
Original quality Use the Original Quality setting to inform TextBridge Pro about the origin of the document, or to instruct the program to automatically determine it. ◆ Use the Normal setting for original documents that were laser-printed, typeset, typewritten, impact-printed, or ink-jet printed, or are photocopies of these document types. This setting takes the least amount of processing time.
Page orientation The Page orientation setting controls whether TextBridge Pro attempts to rotate the scanned or on-line page image before it processes it. ◆ For most common documents, you can specify the Portrait setting, the default. ◆ For some documents, such as spreadsheets, where print runs across the length of the page, specify the Landscape setting.
Zone to capture only the data you want Some documents display logos, graphics, running headers, and other matter that you do not need to capture, and which would slow down the recognition process. With the zoning tools in TextBridge Pro’s preview mode, you can specify which areas to ignore or you can specify only the text and images that you want to capture (Figure 6–10). Note Creating any text zones will disable document recomposition. Refer to Chapter 4 for information about using preview tools.
Save and load zone templates To process similar documents, TextBridge Pro provides the means to save and load zone templates. In preview mode, when you create a set of zones (text, image, ignore or all), you can save this zone set to a template file in the TextBridge® Pouch. Later, when you want to process the same type of document, you can load the template file, thus avoiding having to re-create the zones. After creating a set of zones, click the Save Zone Template button on the preview toolbar.
You can select the zone template before processing when TextBridge Pro is in ready mode, or you can specify preview mode in TextBridge Pro, then display each page of the document in the view area of the main window before choosing a template. Displaying the page in preview before selecting a template enables you to see the entire zone set on the previewed page. For complete information about working in preview mode, refer to Chapter 4.
The next time you want to process the same type of document, you can load the training file. This enables you to gain accuracy improvements without having to retrain TextBridge Pro. To load a training file, pull down the Training Data pop-up menu in the Preferences Panel on the main window and select the training file (Figure 6–14). Select the training file Figure 6–14.
During the scanning phase, you can be present to monitor and load pages into the scanner. After all pages have been scanned, you can queue up the image files in TextBridge Pro’s Image Queue dialog box (Figure 6–15) and attend to other business (or go home) while TextBridge Pro performs recognition. For complete information about deferred processing, refer to “Scanning pages for deferred processing” and “Recognizing and converting image files” in Chapter 4.
A TROUBLESHOOTING AND ERROR CORRECTION TextBridge Pro is designed to be easy to install and use, and under typical circumstances, you should rarely experience problems. However, should you encounter a problem during installation or use of TextBridge Pro, first consult this appendix to try to resolve the problem yourself. TextBridge Pro error messages appear in a standard Macintosh alert box, as shown in Figure A–1.
Figure A–1 (callouts: note the error message and the error number; click OK, then correct the error).
WHAT TO DO IF YOU HAVE A PROBLEM If you run into a problem, refer to the “Error Messages and Possible Solutions” section in this appendix to locate the error, then follow the recommended solution. When you get an error message, write down the text of the message and the error number. Also, note the sequence of steps you took to generate the message. This information will be useful later if you cannot solve the problem and must contact us.
ERROR MESSAGES AND POSSIBLE SOLUTIONS Occasionally, during TextBridge Pro operation, you can receive an error message. TextBridge Pro error messages are designed to be self-explanatory. Usually, you can simply correct the situation and proceed. However, if you require more detail about how to correct an error condition, consult this section. Each error message is listed by error number along with a description of the cause and a recommended course of action.
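If you keep your own log of problems, the error listing in this section can be treated as a lookup table keyed by error number. The sketch below shows the idea with a few entries copied from the listing. The names `ERROR_TABLE` and `describe_error` are hypothetical, and nothing here is part of TextBridge Pro itself.

```python
# A small excerpt of the error listing, keyed by error number.
ERROR_TABLE = {
    -45: {
        "message": "The output file is locked.",
        "action": "Use the Get Info command to unlock the file.",
    },
    # The listing for -46 points to -45 for the solution.
    -46: {"message": "The output file is on a locked volume.", "see": -45},
    -47: {"message": "The file is in use by someone else."},
    1005: {
        "message": "TextBridge Pro requires System 7.1 or later.",
        "action": "You must upgrade your system to run this version of TextBridge Pro.",
    },
}


def describe_error(code):
    """Return the message and suggested action for an error number."""
    entry = ERROR_TABLE.get(code)
    if entry is None:
        return f"Error {code}: not in this excerpt of the table."
    action = entry.get("action")
    if action is None and "see" in entry:
        # Follow a "See ..." cross-reference to the related entry.
        action = ERROR_TABLE[entry["see"]].get("action")
    text = f"Error {code}: {entry['message']}"
    return f"{text} {action}" if action else text


if __name__ == "__main__":
    for code in (-46, -47, 1005, 9999):
        print(describe_error(code))
```

Recording the number together with the message in this way gives you the two pieces of information that are most useful if you later need to contact support.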
–45 The output file is locked. Use the Get Info command to unlock the file. You are attempting to write a file to a locked disk or file name. Eject the disk and unlock it if necessary or use the Get Info command to unlock the file and try again.

–46 The output file is on a locked volume. See –45.

–47 The file is in use by someone else. This indicates that another application or another user on the network is attempting to access the same file.

–151 There is not enough system memory available. This message can also indicate that other applications are using memory or that more RAM is required. Quit all applications and restart your system. Then restart TextBridge Pro and try again.

–192 A required resource is missing. Either a resource critical to the operation of TextBridge Pro is missing, or there may be a conflict caused by an extension. To determine whether there is a conflict, restart your Mac while holding down the Shift key.

1004 The Thread Manager is not installed. TextBridge Pro requires this piece of system software to run. The Thread Manager was inadvertently deleted from the Extensions folder. Reinstall TextBridge Pro from the original TextBridge CD-ROM. Refer to Chapter 2 of this manual for information about installing TextBridge Pro.

1005 TextBridge Pro requires System 7.1 or later. You must upgrade your system to run this version of TextBridge Pro.
1008 TextBridge Pro is unable to create or open the TextBridge® Pro Preferences file. The file may be locked, or on a locked volume, or may be damaged. This error can occur if the existing preferences file is locked or if the system disk is locked. Use the Get Info command to unlock the file or volume.

1009 TextBridge Pro found an older TextBridge® Pro Preferences file. It may not have all the settings and preferences that you expect. Drag the TextBridge Pro Preferences file to the Trash and try again.
1021 Either no language packs have been installed, or the “TextBridge® Pouch” folder is missing or damaged. The TextBridge Pouch must be in the same folder as the TextBridge Pro application. Move the pouch to the correct location or reinstall TextBridge Pro from the CD-ROM. At least one language pack must be installed. If you have moved the TextBridge Pro application to a new location, such as the desktop, move the TextBridge® Pouch there as well.

1025 The selected Training data file contains invalid data. You have attempted to load a training file and something is wrong with the file. Try selecting another training data file or create a new one.

1026 The selected zone template file contains invalid data. You have attempted to load a zone template file and something is wrong with the file. Try selecting another zone template file.

1031 The user canceled TextBridge Pro processing from the user interface.

1035 TextBridge Pro cannot stop performing OCR because the process has not started. While running TextBridge Pro from AppleScript, the script tried to stop processing when no OCR job was in progress.

1036 TextBridge Pro cannot cancel a page because no page is being recognized currently. While running TextBridge Pro from AppleScript, the script tried to cancel the current page when no page was being recognized.

1037 TextBridge Pro does not have enough memory to display an image.

1051 TextBridge Pro cannot invert an image because TextBridge Pro is not in Preview mode. While running TextBridge Pro from AppleScript, the script tried to invert an image when TextBridge Pro was not in preview mode. This capability is only available in preview mode.

1052 TextBridge Pro cannot deskew an image because TextBridge Pro is not in Preview mode. While running TextBridge Pro from AppleScript, the script tried to deskew an image when TextBridge Pro was not in preview mode.
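A script can guard against the preview-only restriction behind errors 1051 and 1052. The following is a minimal sketch, not a confirmed recipe: it assumes that TextBridge Pro passes the error number shown in the alert box to the calling script through AppleScript's standard error handler, which this manual does not state. The DeskewImage command is described in Appendix C.

```applescript
set resultMessage to "Deskew completed."
tell application "TextBridge Professional"
	try
		-- DeskewImage is only available while the recognizer is in preview mode.
		DeskewImage recognizer 1
	on error errMsg number errNum
		-- Assumption: the script receives the same number listed in this appendix.
		if errNum is 1052 then
			set resultMessage to "Deskew is available only in preview mode."
		else
			set resultMessage to "TextBridge Pro error " & errNum & ": " & errMsg
		end if
	end try
end tell
display dialog resultMessage
```

If the error number does not reach the script as assumed, the else branch still reports whatever number and text the script did receive.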
1056 TextBridge Pro cannot Save Training because no training data is currently available. While running TextBridge Pro from AppleScript, the script tried to save training data when no interactive training had occurred. Save training is only available after training OCR for the current document.

1066 An error occurred while reading a TIFF file. A system error may have occurred while reading a TIFF file. Stop processing and try again.

TextBridge Pro can read the following TIFF variations:
• TIFF Uncompressed (Intel header)
• TIFF CCITT-3 (Intel header)
• TIFF CCITT-4 (Intel header)
• TIFF Uncompressed (Motorola header)
• TIFF CCITT-3 (Motorola header)
• TIFF CCITT-4 (Motorola header)
• TIFF Packbits (Motorola header)

1070 None of the files you are attempting to open for OCR are TIFF or PICT files. TextBridge Pro only accepts image files in TIFF or PICT format. TextBridge Pro accepts standard Version 1 and Version 2 PICT files.

1076 A fatal scanner error occurred. Quit and restart TextBridge Pro. Turn the scanner off and on again. Then, quit and restart TextBridge Pro.

1077 An error occurred while scanning. Check your scanner; try changing the resolution and try again. Also, refer to “Scan Problems” in this chapter for troubleshooting tips.

1078 There is no scanner driver installed or selected. Install and/or select a scanner driver for your scanner. TextBridge Pro cannot find a scanner driver.

If you are using a Chooser extension, make sure you successfully selected the driver in the Chooser. Refer to Chapter 2 of this manual for information about installing and selecting scanner drivers. Refer to “Scan Problems” in this chapter for troubleshooting tips.

1079 There is no scanner available because either none is selected or no scanner device is connected.
1082 TextBridge Pro could not open the scanner. Check to see that the scanner is on and ready and that sufficient memory is allocated to TextBridge Pro. Otherwise, the selected TWAIN source cannot drive the attached scanner(s). Make sure that the scanner is powered on and ready. Quit TextBridge Pro and use the Get Info command to make sure that enough memory is allocated to TextBridge Pro. Some TWAIN sources require more memory than others to acquire an image.

1083 The scanner driver is out of memory.

1087 The scanner's image settings are invalid. Check the resolution and page size settings. The scanner is preparing to transfer an invalid image. This message most likely occurs when the image resolution, image height or width is set to a value equal to or less than zero. Check that the scanner settings are valid.

1088 The scanner stopped because its out of memory buffers.

1089 The scanner has run out of memory.

1096 The selected TWAIN source does not support a required operation triplet. Refer to error 1093.

1097 The selected TWAIN source returned data value outside valid range. Refer to error 1093.

1098 The selected TWAIN source received a message out of the expected sequence. Refer to error 1093.

1099 A TWAIN operation was only partially successful. Refer to error 1093.

1100 The selected TWAIN source could not be enabled. Refer to error 1093.

1101 The selected TWAIN source could not be found.
1102 The selected TWAIN source is transferring an image that is not black and white. Optimum recognition accuracy requires a black and white image. This error indicates that the selected TWAIN source is configured to transfer an image that is not black and white. In the TWAIN dialog box make sure to specify a black and white image.

1103 The selected TWAIN source is transferring an image with an unsupported resolution. Optimum recognition accuracy requires an image resolution of 200, 300, or 400 DPI.

1108 There is not enough memory to perform image acquisition or OCR on this page. You have directed TextBridge Pro to read a large or high-resolution image. Reducing scanner resolution to create a lower resolution image will reduce memory requirements. Also, in some cases, creating manual zones in preview mode rather than using the Automatic Input Layout feature can reduce the OCR memory requirements.

1116 An unknown error occurred during document conversion. Quit TextBridge Pro; restart your Macintosh; then restart TextBridge Pro and try again. Try a different output format or different recomposition options.

1117 TextBridge Pro cannot locate the specified text conversion format. The text conversion file has been deleted or moved. Custom Install from the TextBridge Pro CD-ROM to install the text conversions. TextBridge Pro cannot find the selected conversion file.
INSTALLATION PROBLEMS

Basic Troubleshooting for Installation Problems

Try these basic troubleshooting tips when you have problems with installation.

Bad or unreadable CD-ROM

If you can read your CD-ROM but cannot install from it, refer to the ReadMe–Support file on the CD-ROM. If you cannot read the CD-ROM at all, contact one of the following to get a replacement CD-ROM:
• If you purchased TextBridge Pro from an authorized ScanSoft reseller, contact the reseller.
SCAN PROBLEMS

Basic Troubleshooting for Scan Problems

Try these basic troubleshooting tips when you have problems scanning.
• Make sure the scanner works with the manufacturer’s software before trying it with TextBridge Pro software.
• Make sure the scanner is powered on and ready.
• Make sure the scanner is connected to the computer’s SCSI interface.
• Make sure the scanner and all other connected SCSI devices are turned on.
• Turn the scanner power off, then turn it on again.
• Try a “cold” restart:
  a) Shut down the computer.
  b) Power down all SCSI devices and the computer.
  c) Power the SCSI devices up.
  d) Turn on the power to the computer.
• Try allocating more memory to TextBridge Pro:
  a) Quit TextBridge Pro.
  b) Select the TextBridge Pro application icon in the Finder.
  c) Select Get Info from the File menu.
  d) Increase the value in the Minimum Size field of the Memory Requirements section of the window.
  e) Click on the close box in the upper left corner to close the window and save the changes.
TWAIN Sources option disabled in Select Source dialog
• Make sure that the TWAIN folder is installed in the Preferences folder in the System Folder.
• Make sure that the Source Manager is installed in the TWAIN folder.
• Make sure there is an appropriate TWAIN data source in the TWAIN folder.

Desired TWAIN driver does not appear in the Select Source dialog
• Make sure there is an appropriate TWAIN data source in the TWAIN folder.

Assorted scanner driver problems

Scanning or scanner test software was supplied by the scanner manufacturer
• Try scanning using the software supplied by the scanner manufacturer.
• If you can scan using the software supplied by the scanner manufacturer, the problem probably is not with the SCSI cabling or hardware.

Scan problems after Clicking GO

See error messages 1076, 1077, 1081, 1083, 1106, 1107, 1108, and 1120 for explanations and possible solutions.

Scanning or scanner test software supplied by the scanner manufacturer
• Try scanning using the software supplied by the scanner manufacturer.
• If you can scan using that software, the problem probably is not with the SCSI cabling or hardware.

Scanner uses sheet feeder even when check box is not checked
• Power down scanner and disconnect ADF cable.

Quality of scanned image is poor
• Check settings; 300 dpi resolution usually works best with most documents. For 8 points or smaller, 400 dpi is recommended.

TextBridge Pro hangs or crashes when attempting to scan
• Refer to Basic Troubleshooting Steps for crashes in this chapter.

Doesn’t scan, but TextBridge Pro says it did

Ofoto is installed:
• Restart Mac and Scanner.
• Start Ofoto.
• Make the Finder the active application.
• Start TextBridge Pro.
• Select Source.
• Quit Ofoto.
• Allocate more memory to TextBridge Pro:
  a) Quit TextBridge Pro.
  b) Select the TextBridge Pro application icon in the Finder.
  c) Select Get Info from the File menu.
  d) Increase the value in the Minimum Size field of the Memory Requirements section of the window.
  e) Click on the close box in the upper left corner to close the window and save the changes.
  f) Start TextBridge Pro.

CRASH PROBLEMS

Basic Troubleshooting for Crashes

1. Shut down your Macintosh.
2.
TextBridge-Specific Crash Problems

New scanner configuration or setup or just changed or disconnected a scanner
• Drag the Scanner Settings file (if it exists) from the System Folder into the Trash.
• Drag the TextBridge Pro Preferences file from the Preferences folder (in the System Folder) into the Trash.
• Restart TextBridge Pro.

Using Photolook v2.07.2 displays message: “FL driver 1.25 is incorrectly installed. Please reinstall -43”
• Remove Photolook v2.07.

Instant Access OCR Problems

Error type -50 on attempt to run

Possible conflict with Object Support Library version
• Use the Get Info command on ObjectSupportLib; if it is installed it is located in the Extensions folder.
• Apple TECHNOTE 1095 says there are many problems with early versions of this library.
• Apple recommends version 1.2 for all users. It is supposed to fix all known crashing bugs.
• Install the update found at the Apple Web site at www.apple.com.

Using software with probable conflict
• Now Folder Menus
• Now Menus
• Now Start-up Manager
• Now WYSIWYG Menus

Using software with possible conflict
• Desktop Reset
• Dial Assist
• Kodak Precision CP
• Okey Dokey
• PaperPort
• PaperPort Extension
• Penworks
• Super Cache

Not sure if you are using software with a conflict
• Restart with non-system extensions by restarting the computer while holding down the shift key.

Recently installed new software
• Try removing new software.
• Check with third party support to see if there are known problems.

Using font management software such as Adobe Type Manager (ATM) or Adobe Type Reunion
• These have been known to cause problems; try turning the software off or removing any extensions associated with the software and rebooting the system.

Symptom of possible virus

Possible virus
• Run a virus scan application such as Disinfectant (free and effective).

Symptom of possible hardware problem
• SCSI problem: Check for the problem in “Scan problems” in this chapter.
• Other (Sad Mac error codes, strange chords at startup, etc.): Run MacCheck, Disk First Aid, Apple HD SC Setup or other diagnostic software. (Some of these come bundled with the Macintosh; others may come bundled with scanners or other SCSI devices.
B SAMPLE DOCUMENTS

The tutorial sessions in Chapter 5 of this manual refer to on-line sample documents. These sample documents are designed to be used with the tutorial sessions, and to highlight some of the more important features of TextBridge Professional Edition. When you install TextBridge Pro (refer to Chapter 2), the default installation folder is the TextBridge Pro Folder. Within this folder are a number of others including the Sample Documents folder.
MARKPLAN

Tutorial Session 1 in Chapter 5 uses a typical one-column office document named:

markplan

The document uses serif fonts in several different sizes and styles, and includes bullet (•) characters, all of which TextBridge Pro can recognize and output. Figure B–1 shows a scaled-down picture of markplan.

Figure B–1.

ZONEPIC

Tutorial Session 2 in Chapter 5 uses a multiple-column newsletter-style document:

zonepic

The document is designed to illustrate TextBridge Pro’s manual zoning features, with which you can identify specific areas (text and graphics) of pages to capture. Figure B–2 shows a scaled-down picture of zonepic.

Figure B–2.

PLEXIS

Tutorial Session 3 in Chapter 5 uses a fax-quality document named:

plexis

The degraded image quality of fax documents is ideal to illustrate the interactive training feature of TextBridge Pro. Figure B–3 shows a scaled-down picture of plexis.

Figure B–3.

RESUME

Tutorial Session 4 in Chapter 5 uses a fictitious resumé named:

resume

The document is typical of the type of data you might like to pour directly into your word processor for immediate editing purposes. It is designed to illustrate TextBridge Pro’s Instant Access OCR feature. Figure B–4 shows a scaled-down picture of resume.

Figure B–4.
3COL

Tutorial Session 5 in Chapter 5 uses a three-column document with a picture in the middle column. It is named:

3col

The document is designed to illustrate TextBridge Pro’s powerful recomposition capabilities. Figure B–5 shows a scaled-down picture of 3col.

Figure B–5.
C APPLESCRIPT INTERFACE

In addition to a graphical user interface, TextBridge Pro supports an AppleScript™ interface. AppleScript is Apple Computer’s system-software-level scripting system for the Macintosh. With AppleScript you can run TextBridge Pro from scripts without using the keyboard or mouse. Thus, you can automate repetitive tasks, such as detecting and recognizing fax files as they are received on your system.
WRITING TEXTBRIDGE SCRIPTS

If you can run TextBridge Pro from the Macintosh Finder, you can write a TextBridge Pro script. To create your own TextBridge Pro scripts, use the Script Editor provided as part of the AppleScript utilities.

Note: ScanSoft provides the AppleScript extension as part of the Standard and Basic installations; the Script Editor is not included, however, and must be acquired from Apple.
2. Run TextBridge Pro. If you have not already done so, double-click the TextBridge Professional icon to start the application. Load the scanner, if necessary, and then select preferences and click Go as you would normally. As TextBridge Pro processing occurs, the corresponding Apple events appear in the Script Editor window (Figure C–1).

Figure C–1. (Callouts: type a brief description; the Script Editor records corresponding Apple events while TextBridge Pro is running.)
3. Click Stop in the Script Editor window. The Script Editor stops recording and ends the script. From here, you can make changes to the script, check the syntax, or run the script from the editor. ☞ For detailed information about AppleScript and the Script Editor, refer to your Apple Computer, Inc. documentation. Refer to “Sample Scripts” as well as the information in “TextBridge Pro Objects” and “TextBridge Pro Commands.
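As an alternative to recording, you can type a script directly into the Script Editor. The following is a minimal sketch that uses only the activate command and the StartJob command listed in “TextBridge Pro Commands.” It assumes that the scanner is already loaded and that the preferences are set the way you want them; it is not one of the sample scripts supplied with TextBridge Pro.

```applescript
tell application "TextBridge Professional"
	-- Bring TextBridge Pro to the front.
	activate
	-- Start recognizing a document using the current preferences.
	StartJob recognizer 1
end tell
```

Use the Script Editor's syntax check before running the script, as you would with a recorded script.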
TEXTBRIDGE PRO OBJECTS

This section describes the TextBridge Pro object classes:
◆ application
◆ docFormatter
◆ imageSource
◆ recognizer
◆ TBDictionary
◆ TBLanguage
◆ TBTrainingData
◆ TBZones
◆ Zone

application

The application.
TBLanguage: by numeric index, by name, by ID
TBTrainingData: by numeric index, by name
TBZones: by numeric index, by name
TBDictionary: by numeric index, by name

Commands handled
Get, Set

Example
tell application "TextBridge Professional"
    activate
    set useFileInput of recognizer 1 to true
end tell

docFormatter

This is the conversion for text output from OCR to a document. Specify by name, ID or index.

Properties

name
  The name of the text output format.
Element classes
None

Commands handled
Get, Set

Example
set curFormat of recognizer 1 to docFormatter ¬
    "Microsoft Word (RTF)"

imageSource

The source of the image to be recognized by TextBridge Pro.

Properties

name
  The name of the image source. Only available for scanners.
  Object class: string
  Modifiable? No

IsScanner
  If true, specifies that the image source is a scanner; if false, the source is a file list.
  Object class: Boolean
  Modifiable? No

fileList
  List of image files to be recognized.

unattendedOperation
  Normally false; when true, suppresses scanner-related dialog boxes requesting additional pages. Automatically reset to false at the end of each job.
  Object class: Boolean
  Modifiable? Yes

hasSheetFeeder
  Indicates whether a sheet feeder is available.

Element classes
None

Commands handled
Recognizer

This is the controller for the recognition process. In version 3.0, only one recognizer exists.
useFileInput
  Instructs TextBridge Pro to obtain page images from on-line image files.
  Object class: Boolean
  Modifiable? Yes

doInstantAccess
  Instructs TextBridge Pro to place the recognized document on the clipboard instead of creating a file. Automatically set to false at the end of each job.
  Object class: Boolean
  Modifiable? Yes

clipboardLoaded
  Indicates that Instant Access OCR has loaded the clipboard in the current or latest job.

originalQuality
  Specifies the anticipated quality of input images:
    normalQuality
    faxQuality
    dotMatrixQuality
    automaticQuality
  Object class: Enumerated
  Modifiable? Yes

pageOrientation
  Specifies the anticipated orientation of input images:
    portrait
    landscape
    automaticOrient
  Object class: Enumerated
  Modifiable? Yes

inputLayout
  Specifies the anticipated layout of input images:
    oneColTextIn
    oneColTextAndPicturesIn
    automaticInputLayout
  Object class: Enumerated
  Modifiable? Yes

outputLayout
  Specifies the req

curSource
  The source of page images.
  Object class: imageSource
  Modifiable? Yes

curDocument
  The document to be created by the current or next job.
  Object class: File System Spec
  Modifiable? Yes

curFormat
  The format of the document to be created by the current (or next) job.
  Object class: docFormatter
  Modifiable? Yes

curLanguage
  The language of the document being recognized.
  Object class: TBLanguage
  Modifiable? Yes

curDictionary
  The custom dictionary to be used.

zoneList
  The list of Text, Image and Ignore zones to be applied during recognition.
  Object class: TBZones
  Modifiable? Yes

autoSaveZones
  Normally false. Indicates whether to display a dialog box asking if you want to save zones at end of job (if they have changed).
  Object class: Boolean
  Modifiable? Yes

autoSaveTraining
  Normally true. Indicates whether to display a dialog box asking if you want to save training data at end of job (if training has taken place).
verifyWordRect
  The rectangle giving location of verifyWord during Training.
  Object class: QuickDraw rectangle
  Modifiable? No

Element classes
Zone by numeric index.

Commands handled
Set, Synchronize, StartJob, ContinueJob, StopJob, CancelPage, SaveZones, SaveTraining, InvertImage, DeskewImage, Rescan

Example
Synchronize recognizer 1 desiredStatus ¬
    statusPreviewing

TBDictionary

A custom dictionary.

Properties

name
  The name of the dictionary. Refer to the dictionary names in the TextBridge® Pouch.
Element classes
None

Commands handled
Get, Set

Example
set curDictionary of recognizer 1 to TBDictionary ¬
    "Lancet Dictionary"

TBLanguage

The language of the document to be recognized. Specify by name, ID, or index.

Properties

name
  The name of the language. Refer to the language pack names in the TextBridge® Pouch.

Commands handled
Get, Set

Example
set curLanguage of recognizer 1 to TBLanguage ¬
    "English"

TBTrainingData

A record of training data. Specify by name or index.

Properties

name
  The name of a training data file. Refer to the training data file names in the TextBridge® Pouch.
  Object class: string
  Modifiable? No

Element classes
None

Commands handled

TBZones

A zone template. Specify by name or index.

Properties

name
  The name of the zone template. Refer to the TextBridge® Pouch for zone template names.
  Object class: string
  Modifiable? No

Element classes
None

Commands handled

Zone

A zone for OCR.

Properties

zoneOutputOrder
  The order in which the zone will be output to the document (1 first).
  Object class: integer
  Modifiable? Yes

zoneType
  The type of zone:
    TextZone
    ImageZone
    IgnoreZone
  Object class: Enumerated
  Modifiable? No

Element classes
None

Commands handled
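The individual Set examples above can be combined to prepare the recognizer before a job. The following is a minimal sketch that uses only property names and enumerated values listed in this section. It assumes that an English language pack is installed and that the "Microsoft Word (RTF)" format is present in your installation; substitute the names that appear in your own TextBridge® Pouch.

```applescript
tell application "TextBridge Professional"
	-- Tell the recognizer what kind of originals to expect.
	set originalQuality of recognizer 1 to faxQuality
	set pageOrientation of recognizer 1 to portrait
	-- Choose the output format and the recognition language by name.
	set curFormat of recognizer 1 to docFormatter "Microsoft Word (RTF)"
	set curLanguage of recognizer 1 to TBLanguage "English"
end tell
```

Only properties marked “Modifiable? Yes” accept the Set command, so a script like this cannot change read-only properties such as zoneType.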
TEXTBRIDGE PRO COMMANDS

This section describes the commands that the TextBridge Pro recognizer understands.

CancelPage

Cancel recognition of current page but continue current OCR job.

ContinueJob

Resume a job that has paused during preview or training.

Command syntax
ContinueJob recognizer

Parameters
None

Result
None

Example
ContinueJob recognizer 1

DeskewImage

Deskew the current page image. This command is only available during preview.

Result
None

Example
DeskewImage recognizer 1

InvertImage

Invert the current page image. This command is only available during preview.
Rescan

Rescan the current page.

Command syntax
Rescan recognizer

Parameters
None

Result
None

Example
Rescan recognizer 1

SaveTraining

Save current training data to file in TextBridge® Pouch.

Command syntax
SaveTraining recognizer fileName

Parameters
fileName
  Name of the new training data file.
Result
None

Example
SaveTraining recognizer 1 fileName "Caboose"

SaveZones

Save current zone list to zone template file in TextBridge® Pouch.

Command syntax
SaveZones recognizer fileName

Parameters
fileName
  Name of the new zone template file.
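A call to SaveZones can be sketched by following the command syntax above, in the same pattern as the SaveTraining example. The template name "Newsletter Zones" is hypothetical and is used here only for illustration.

```applescript
tell application "TextBridge Professional"
	-- "Newsletter Zones" is a hypothetical zone template name.
	SaveZones recognizer 1 fileName "Newsletter Zones"
end tell
```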
StartJob

Start recognizing a document.

Command syntax
StartJob recognizer

Parameters
None

Result
None

Example
StartJob recognizer 1

StopJob

Stop the current job.

Command syntax
StopJob recognizer [discardData Boolean]

Parameters
discardData
  Indicates whether to discard any pages already processed. The default is false.
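The optional discardData parameter can be sketched as follows, based on the command syntax above. The example assumes that a job is in progress; according to Appendix A, stopping when no OCR job is running produces error 1035.

```applescript
tell application "TextBridge Professional"
	-- Stop the current job and discard the pages already processed.
	StopJob recognizer 1 discardData true
end tell
```

Omitting discardData leaves it at its default of false.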
Synchronize

Synchronize with the recognizer at the indicated status. The event returns when the indicated status (or Ready or Quitting) is reached.
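The commands in this section can be combined into a short preview workflow. The following is a minimal sketch that uses the Synchronize form shown in the recognizer example (desiredStatus statusPreviewing). It assumes that the job is set up to pause in preview mode; how preview is switched on from a script is not covered here.

```applescript
tell application "TextBridge Professional"
	activate
	StartJob recognizer 1
	-- Wait until the recognizer reports the preview status.
	Synchronize recognizer 1 desiredStatus statusPreviewing
	-- Straighten the page image, then let the paused job continue.
	DeskewImage recognizer 1
	ContinueJob recognizer 1
end tell
```

Because Synchronize also returns when the Ready or Quitting status is reached, a job that never pauses for preview would reach DeskewImage outside preview mode, which corresponds to error 1052 in Appendix A.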
GLOSSARY OF TERMS

This glossary defines terms and concepts used in this manual. For readers who are new to scanning and character recognition concepts, this glossary may be useful not only as a reference, but also as a primer on optical character recognition technology. Some definitions provided here contain terms in bold letters. This means that these terms are also defined elsewhere in the glossary.
auto-segmentation—In TextBridge Pro, a capability to discern the layout of the page image, and to recognize and output text regions in the correct order. For example, in a newsletter, where columns are of uneven depths and widths, TextBridge Pro can discern the layout of the page and recognize and output text in the correct sequence. In TextBridge Pro preferences, you can specify auto-segmentation by selecting Automatic as the input layout.
confidence threshold—A numerical value built into the TextBridge Pro OCR engine. During recognition, TextBridge Pro assigns a confidence number, based on a number of factors, to each recognized word. If the assigned number falls below the confidence threshold, the word is flagged as a questionable, or suspect, word. If the number rises above the confidence threshold, then TextBridge Pro releases the word as correctly recognized and bases future recognition decisions on correctly recognized words.
F

fax image—The representation of a page as binary data (usually 200-by-100 or 200-by-200 dpi resolution) transmitted or received by a facsimile (fax) machine or fax modem. Computers with fax modems can receive a fax image and store it on-line as an image file. TextBridge Pro can open and recognize the text from an on-line fax image, provided that it is in TIFF format.

I

Instant Access OCR™—A capability that enables TextBridge Pro to be run from within virtually any word processor or text application. When you launch TextBridge Pro from the Apple menu, while the host application is active, the TextBridge Pro Main window is displayed, and you can access all TextBridge Pro features to process a document. When document processing is finished, TextBridge Pro automatically pastes the recognized text into the host application’s open document.

L

landscape orientation—Describes a page which is rotated 90 degrees, and on which lines of print usually flow across the wider dimension (or length) of the page. This is opposed to portrait pages, on which lines of print flow across the more narrow dimension (or width) of the page. For example, many office documents are printed on 8.5-by-11 inch paper. With landscape orientation, the page is rotated 90 degrees and print flows across the 11-inch dimension of the page.

P

page image—A binary (black and white) representation of a page stored in computer memory, or on disk in an image file. TextBridge Pro acquires a page image from a scanner or an image file (TIFF, PICT) and then begins optical character recognition.

page orientation—In TextBridge Pro preferences, a category of settings that inform the program about the placement—that is, the orientation—of print on a page. The available settings are: Auto; Portrait; and Landscape.

R

recognition—The TextBridge Pro process during which a scanned or on-line TIFF page image is analyzed, and characters and words are derived from the image and saved as a text data stream in memory or in an on-line temporary file. Sometimes referred to as optical character recognition (OCR).

S

Save Page Image – Defer OCR—A capability of TextBridge Pro to store a binary (black and white) picture of each page it scans for later document recognition. The page images are saved in TIFF or PICT format. No OCR is performed during the process. Later, you can queue up these images in the Image Queue dialog box and have TextBridge Pro run document recognition on them. Alternatively, you can use these page images in an image editing or document image management application, or for any other purpose.

T

text box—In Macintosh dialog boxes, a rectangular field into which you can type new text (for example, a file name) or make changes to ("edit") existing text. For example, in the TextBridge Pro training toolbar, suspect words appear in a text box for your acceptance or correction. It is a standard text box, in that you can select a point inside the text string to type new text; alternatively, you can highlight part or all of the text string and type over it. You can cut and paste text, and so on.
Z

zone—In the TextBridge Pro view area, a rectangular border that you can draw around a portion of the displayed page image to define the area of the page to process. TextBridge Pro enables you to create text zones, image zones, and ignore zones to define areas of the page image.

zone template—A set of zones created in preview mode and saved to a template file.
INDEX A A4 page size, 4–10 Accept button, 3–10, 4–30, 5–14 Add More Pages dialog box, 4–14, 4–16, 4–19, 4–27, 4–31 Adobe Photoshop Import Plug-in user interface, 3–14, 3–16, 3–19, 3–24, 3–29 Adobe Photoshop Import Plug-ins, 1–3, 1–8, 2–2 location of, 2–11 Apple events CancelPage, C–20 ContinueJob, C–21 DeskewImage, C–21 InvertImage, C–22 Rescan, C–23 SaveTraining, C–23 SaveZones, C–24 StartJob, C–25 StopJob, C–25 Synchronize, C–26 Apple Guide for TextBridge Pro, 1–9 Apple Menu Items folder Instant Access O
ASCII formats, 1–7, 5–16 Assorted scanner driver problems, A–26 Auto-brightness, 6–4 Automating repetitive tasks, C–1 B Basic Troubleshooting for Crashes, A–29 Basic Troubleshooting for Scan Problem, A–23 Brightness, 4–9 use preview mode to evaluate, 6–4 using to improve recognition accuracy, 6–3 when to increase or decrease, 6–4 Brightness command, 3–29 Brightness, auto, 6–4 Button behavior, 3–3 C Cancel Page button, 3–6, 21 Cancel Page command, 3–21 Cell tables, 5–19 Character recognition compensating
Commands (cont.
Commands (cont.
Dialog boxes (cont.
Drop-out color, 6–5 Duplex documents, 1–5 Dynamic training, 1–3 E Edit menu, 3–15 Edit Zone button, 3–7, 4–26 Electronic registation, 2–10 Error messages, A–1, A–3 Events, TextBridge Pro, C–20 Expected scan options are grayed out—(TWAIN or Adobe Photoshop Import Plug-in), A–26 F Fax documents, 1–6, 4–5 Fax image, 6–6, 4–21 synthesized, 6–6 Fax modem, 4–20, 2–3 Fax setting, 6–6, 6–15 when not to use, 6–6 File menu, 3–12 Final Check for Unresolved Problem, A–34 Folder, adding to image queue, 4–22 Formats,
Image formats supported by TextBridge Pro, 4–20 Image queue adding file to, 4–22 order of files in, 4–23 removing files from, 4–22 Image Queue dialog box, 4–21, 4–24, 4–29, 5–5, 5–8, 5–13, 5–16, 5–18, 6–21 Image source, specifying, 4–24, 4–28 Image Zone tool, 5–10 Image zone, 3–26 creating, 5–10 imagSource object, C–7 Input From File specifying with deferred processing, 4–17 Input From File button, 3–5, 3–13, 4–21, 4–24, 4–28, 5–4, 5–5, 5–7, 5–12, 5–16, 5–17 Input From File command, 3–13 Input From Scanner
Instant Access OCR™ (cont.
M Main toolbar, 3–2, 3–4, 5–2 Main window, 1–1, 3–1, 4–25 features of, 3–2 in interactive training mode, 5–14 in preview mode, 5–9 Instant Access mode 5–15 view area, 5–3, 13 Memory requirements, 2–2 Menu bar, 5–2 Menus Edit, 3–15 File, 3–12 Process, 3–20 Recognize, 3–22 Scanner, 3–28 View, 3–18 Menus and commands, 3–12 Microsoft Word (RTF) and other applications, 1–7 More command, 3–30 Move To Back command, 3–17, 4–27 Move To Front command, 3–17, 4–27 N New folder, 4–19 Noise, 4–31, 6–3 O OCR compensa
Optional characters entering in training mode, 4–30 Original Quality tips for using, 6–15 Original Quality command, 3–23 Original quality setting, 5–12 Original Quality, 4–5, 4–21 Output formats text, 1–7 Output Layout, 4–4 tips for using, 6–14 Output Layout command, 3–23 Output layout setting, 5–17 P Page image, 5–13 specifying format of, 4–19 zooming, 4–24 Page images, 3–13, 4–17, 5–3 naming files, 4–18 origin of, 4–20 performing recognition on, 4–20 resolutions of, 4–20 Page Orientation Automatic, 4–21
PICT, 1–4, 3–13, 4–20 Plug-ins, 1–8 user interface and scanner settings, 3–29 Pop-up menus Training Data, 6–20 Training Level, 4–30, 6–10 Zone Template, 6–18 Portrait orientation, 4–6 Preference panel adjusting scanner brightness, 6–3 Preferences, 3–22 displaying, 3–11 scanner settings, 6–3 specifying, 4–1 when to set, 4–2 Preferences panel, 3–2, 3–11, 5–2, 5–7, 5–12, 5–17 expanding, 4–2 expanding, 5–2 Preferences view bar, 3–11, 4–2, 5–2 Preview button, 3–5 Preview button, 4–24 Preview button, 5–7 Preview
Preview toolbar, 3–6, 4–25, 5–2, 3–3
Preview tools, 1–4, 5–7
Previewing pages, 4–23
Problems not limited to TextBridge Pro or Instant Access OCR, A–33
Problems using a TWAIN driver, A–24
Process menu, 3–20
Processing image files created by other applications, A–30
Q
Quit command, 3–15
R
ReadMe, 1–7
ReadMe–Support, 3–2
Recognition Language command, 3–24
Recognition language, 3–11, 6–8
  specifying, 4–6
Recognize menu,
Save dialog box, 3–14, 4–18, 5–5, 5–6, 5–8
Save Page Image – Defer OCR button, 3–5, 3–13, 4–18
  and Instant Access OCR, 4–34
Save Training Data command, 3–27, 4–32
Save Training Data dialog box, 3–27, 4–31, 5–14, 6–19
Save Zone Template button, 3–8, 6–18
Save Zone Template command, 3–26, 4–27
Save Zone Template dialog box, 3–26, 6–18
Scan problems after Clicking GO, A–27
Scan Problems and possible solutions, A–23
Scanner, 1–3
  drop-out color, 6–5
  loading pages in, 6–2
  maintenance and proper use, 6–2
Scanner b
Scanning preferences, specifying, 4–9
Script Editor, C–2
Scripts for automating repetitive tasks, 1–4, C–1
Select All command, 3–17
Select Source command, 3–28
Sheet Feeder command, 3–30
Sheet Feeder, 4–11
Smart Zones™, 1–3, 1–4, 4–26
  using to improve recognition, 6–11
  when to use, 6–14
Starting TextBridge Pro, 2–10
State buttons, 3–3
Stop button, 3–6, 3–21
  and Instant Access OCR, 4–34
Stop command, 3–21
Submenus
  Custom Dictionary, 3–24
  Input Layout, 3–23
  Original Quality, 3–23
  Output Layout, 3–23
  Page Orie
T
TBDictionary object, C–15
TBLanguage object, C–16
TBTrainingData object, C–17
TBZones object, C–18
Template file, 3–26
Text editor
  use to create custom dictionary, 6–8
Text formats, 1–4, 1–7
Text output format, selecting, 5–6, 5–8, 5–18
Text Zone tool, 5–9
Text zone, 4–26, 6–17
  creating, 5–9
TextBridge Instant Access OCR™
  host application limitations, 4–34
  problems, A–31
  procedure for using, 4–32
TextBridge Pro
  Apple events, C–20
  Apple Guide help, 1–9
  AppleScript, C–1
  ASCII forma
TextBridge Professional Edition
  capturing parts of a document with, 5–7
  de-installing, 2–13
  document recomposition, 5–17
  error messages, A–1, A–3
  fax images with, 2–3
  features, 1–1
  image file, procedure for selecting, 5–4
  installation, 2–1
  installing and testing software, 2–3
  Instant Access OCR™, 5–15
  interactive training mode, 5–14
  main toolbar, 5–2
  memory requirements, 2–2
  preferences panel, 5–2
  preview mode, 5–9
  preview toolbar, 5–2, 5–3
  program icon, 5–1
  running from within another application, 5–15
  sam
TextBridge-Specific Crash Problems, A–30
TIFF, 1–4, 3–13, 4–20, 5–3
Toolbars, 3–3
Train OCR button, 3–5, 4–28, 4–31, 5–12
Train OCR command, 3–20
Training Data command, 3–25
Training Data pop-up menu, 6–20
Training data, 1–5, 3–21, 4–28
  reusing, 5–12
Training file, 3–25
  tips for using, 6–19
Training Level options, 3–10
Training Level pop-up menu, 4–30
  adjusting, 6–10
Training toolbar buttons, table of, 3–9
Training toolbar, 3–9, 3–21, 4–29, 5–2, 5–3, 5–13
Troubleshooting, A–1
Tutorials, sample documents, 5–
W
What To Do if You Have a Problem, A–2
Word processor
  use to create custom dictionary, 6–8
Word processor format
  selecting conversion to, 5–6, 5–8, 5–18
X
XDOC format, 1–8
Xerox PARC, 1–6
Z
Zone
  creating, 4–26
  deleting, 4–27
  moving, 4–27
  resizing, 4–26
  selecting, 4–26
  to capture parts of a document, 6–17
Zone object, C–19
Zone template, 1–5
  specifying, 2–8
  tips for using, 6–18
Zone Template command, 3–25
Zone template file, 4–27
Zone Template pop-up menu, 6–18
Zones, 1–5, 3–20
  changing Front-to-back
Zoom
  page image in preview mode, 4–25
Zoom In button, 3–7, 3–9, 4–31
Zoom In command, 3–18
Zoom In tool, 5–10
Zoom Out button, 3–7, 4–31
Zoom Out command, 3–18
Zoom Out tool, 5–9
Zooming in and out on the page image, 4–23