Optical Character Recognition Program ABBYY FineReader ® Version 7.0 User’s Guide © 2003 ABBYY Software Ltd. All rights reserved.
Information in this document is subject to change without notice and does not bear any commitment on the part of ABBYY. The software described in this document is supplied under a license agreement. The software may only be used or copied in strict accordance with the terms of the agreement.
Contents
Welcome
Chapter 1 Installing and Starting ABBYY FineReader
    Software and hardware requirements
    Installing ABBYY FineReader
Chapter 5 Page Layout Analysis
    General information on page layout analysis
    Block types
    Automatic page layout analysis options
Chapter 9 Working with Batches
    General information on working with batches
    Creating a new batch
    Opening a batch
Welcome!
Thank you for choosing ABBYY FineReader! ABBYY FineReader is an Optical Character Recognition (OCR) system that helps convert printed and PDF documents into editable formats while retaining the original layout of the document. The program allows users to create a digital copy of any document in minutes without manually retyping it.
User’s Guide
The User’s Guide introduces you to the basics of using ABBYY FineReader. Each chapter starts with a short summary description and a list of the chapter’s contents.
Online Help
FineReader’s online Help contains basic and advanced information on program features, settings and dialogs. Online Help is provided in HTML format and has been designed for quick and easy information retrieval.
Chapter 1 Installing and Starting ABBYY FineReader
This chapter provides detailed instructions on installing ABBYY FineReader, outlines the system requirements of the program and offers instructions for installing the program on workstations and networks. ABBYY FineReader 7.0 includes a specialized installation program that automates the setup process. To ensure proper installation, always use the ABBYY FineReader CD-ROM for installation.
Software and Hardware Requirements
ABBYY FineReader 7.0 requires the following:
1. PC with Intel® Pentium®/Celeron®/Xeon™, AMD K6/Athlon™/Duron™ or compatible processor. The processor must be 200 MHz or higher.
2. Microsoft® Windows® XP, Microsoft® Windows® 2000, Windows® NT® 4.0 with Service Pack 6 or greater, Windows® ME/98 (for working with localized interfaces, corresponding language support is required).
3. 64 MB (Windows XP/2000/NT 4.
Installation options
During the installation, you will be asked to select one of the two installation options:
● Typical (recommended) – This option installs all components of the program, including all recognition languages. You will be prompted to choose a single interface language during installation.
Starting ABBYY FineReader
To start ABBYY FineReader:
● Select the ABBYY FineReader 7.0 Professional Edition (Corporate Edition) item in the Start/Programs menu.
Note: Make sure your scanner is connected to your computer, plugged in, and turned on before you start FineReader. To install a scanner after installing the program, please consult the user guide supplied with the scanner for installation instructions.
Step by Step Activation Instructions
The built-in Activation Wizard will quickly and efficiently activate the program. A friendly user interface collects and sends all necessary activation information directly to ABBYY. You will also use the Activation Wizard to enter the Activation Code (Professional Edition) or Activation File (Corporate Edition) that you receive from ABBYY during activation.
ABBYY’s Activation Privacy Policy
Activation may be required to access the full functionality of FineReader 7.0. This process verifies that you are installing a genuine ABBYY product. ABBYY guarantees that activation of the product does not entail the communication of personal information to ABBYY. In fact, activation may be completely anonymous, if desired.
Chapter 2 Quick Start
This chapter will teach you how to input a document in a few easy steps, even if you know nothing about how ABBYY FineReader works! If you already know how to use ABBYY FineReader, you may wish to skip this chapter and go to the section called “New Features of ABBYY FineReader 7.0”.
How to Input a Document in Less than a Minute
1. Turn on your scanner prior to starting ABBYY FineReader. (Many scanner models require the unit to be turned on before you start your PC.) Next, turn on the computer and start ABBYY FineReader (Start/Programs/ABBYY FineReader 7.0 Professional Edition or Corporate Edition). The main window of ABBYY FineReader will appear on your screen.
2. Place the document on the scanner.
3.
Find the FineReader main menu at the top of the FineReader Main window. Four toolbars are displayed on the main menu: Standard, Formatting, Image Tools, and WizardBar. You may display or hide any toolbar by clicking on the View menu and selecting the Toolbar. You can also right-click any toolbar to open the local menu and then click the name of the toolbar that you want to display or hide (currently selected toolbars are highlighted).
pointer in the Text window to the position you clicked on (if text has already been recognized on the page).
You can customize the arrangement of the on-screen windows.
To alter the arrangement of the on-screen windows:
● In the View menu, select one of the following items: Batch Window; Image and Text Windows; Zoom Window.
The WizardBar
The buttons on the WizardBar launch the main FineReader functions: Scanning, Reading, Checking and Saving recognition results. The numbers on the buttons indicate the order in which the document input actions should be performed. You may perform each action separately or combine them into a single action by clicking the Scan&Read Wizard button to perform the full document processing cycle automatically. Each button offers several function modes.
2 – Read
Read – reads the open batch page.
Read All – reads all unrecognized batch pages.
Options – opens the Recognition tab (Options dialog) to allow you to set document recognition options.
3 – Check Spelling
Check Spelling – searches the text for misspelled and uncertain words (i.e. words in which character recognition was uncertain).
Options – opens the Check Spelling tab (Options dialog) to allow you to set spelling checker options.
The Formatting toolbar
The Formatting toolbar features various text formatting tools. You can edit and format text in the Text window.
The Image toolbar
The Image toolbar features page layout analysis (e.g. block creation and editing) tools, as well as tools for scaling (increasing/decreasing the size) and editing (erasing portions of an image, for example) images.
Note: Block creation and editing buttons may be used both in the Zoom and in the Image windows.
Setting up the toolbar
Note: Low monitor resolution may limit the number of buttons displayed on ABBYY FineReader's toolbars. Although all of FineReader's functionality is available through the program menus, you must increase the monitor's resolution to display all available buttons.
FineReader allows you to customize the Standard, Image and Formatting toolbars by removing or adding application command buttons. Each menu item has its own icon.
Chapter 3 General Features of ABBYY FineReader
ABBYY FineReader is designed to help you easily convert documents into editable files. A single click of the Scan&Read button initiates the automated process so that you can start working without spending hours studying the User’s Guide. FineReader supports a wide range of formats and you can send recognized text to the application of your choice or save it into any supported format.
What is an OCR System?
Optical character recognition (OCR) is the translation of optically scanned bitmaps of printed text characters into character codes, such as ASCII. An OCR system is an efficient way to help you turn printed/scanned documents, image or PDF files into files that can be edited, searched and otherwise manipulated on a computer.
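To make the idea of translating bitmaps into character codes more concrete, the following Python sketch compares a tiny black-and-white bitmap with a few stored templates and returns the ASCII code of the closest match. It is a toy illustration only: the 3x5 templates, the TEMPLATES table and the recognize function are invented for this example and do not describe how ABBYY FineReader recognizes text.

```python
# Hypothetical 3x5 bitmaps: "1" is a black pixel, "0" is a white pixel.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def hamming(a, b):
    """Count the pixels that differ between two bitmaps of equal size."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def recognize(bitmap):
    """Return the best-matching character and its ASCII code."""
    best = min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], bitmap))
    return best, ord(best)


# A slightly damaged "T" (one extra black pixel in the fourth row).
print(recognize(("111", "010", "010", "110", "010")))
```

A real OCR system has to cope with many fonts, sizes and print defects, so simple template comparison like this is far too fragile for practical use; the sketch only shows the input (a bitmap) and the output (a character code).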
text that you see in the FineReader Text window, a text you can edit and save in any convenient format.
New Features of ABBYY FineReader 7.0
Recognition Accuracy
● Recognition accuracy has been improved by up to 25% over the previous version.
New Saving Options
● A new saving format, Microsoft PowerPoint, supports the quick creation of new presentations or the editing of existing documents from PowerPoint slides or handouts.
● Files saved in Microsoft Word format are smaller than in previous versions.
● The program more accurately retains the formatting of documents with various separators.
In addition, new saving options for images have been added.
Network Capabilities of ABBYY FineReader Corporate Edition
● Network installation. FineReader Corporate Edition supports installation from servers to workstations using Active Directory, Microsoft Systems Management Server, and the command line.
● Support for multifunction devices (MFPs), including network MFPs. MFPs that combine the functionality of a scanner, printer, copier and fax are becoming increasingly popular.
Supported Image Formats
ABBYY FineReader opens image files in the following formats:
PDF: Files in PDF format (Version 1.
Chapter 4 Acquiring the Image
The quality of the source image greatly impacts recognition quality. In this chapter, you will learn how to scan documents for best results, how to open and read saved images (see the list of supported image formats in the “Supported Image Formats” section), and how to process images to improve recognition quality (by eliminating scanning “dust” etc.).
Scanning
ABBYY FineReader communicates with the scanner through a TWAIN interface. The TWAIN standard, which was adopted in 1992, is a universal standard that unifies the interaction between a computer image input device (such as a scanner) and an external application. ABBYY FineReader communicates with a scanner through a TWAIN driver in two ways:
● through the ABBYY FineReader interface.
To start scanning:
Click the 1 Scan button or select the Scan item in the File menu. The Image window containing a scanned image of the page will appear in ABBYY FineReader’s Main window.
To scan multiple pages simultaneously, click the arrow to the right of the 1 Scan button and select the Scan Multiple Images item.
If scanning does not begin immediately, one of two dialogs will open:
● The scanner’s TWAIN Source dialog.
● Scan mode – black and white. Black and white scanning maximizes scanning speed but may result in the loss of some character information. This may lower recognition quality in documents of medium and low print quality.
● Scan mode – color. Select this mode for documents that contain pictures, colored text or colored backgrounds, so that you can retain the original colors. In all other cases, gray scan mode is preferable.
Your image looks like this / Possible remedy:
Characters are “torn” or very light:
● Lower the brightness (to make the image darker).
● Scan in gray mode (to activate brightness autotuning).
Characters are distorted, glued, or filled:
● Increase the brightness (to make the image brighter).
● Scan in gray mode (to activate brightness autotuning).
Non–ADF Scanning
● If you are using the ABBYY FineReader interface, select Scan Multiple Images from the File menu.
If you are using a flatbed scanner without an ADF and the ABBYY FineReader interface, there are two ways to increase its efficiency:
● Set a pause value (i.e. the time that will elapse between the scanning of one page and the next).
Opening Images
You can recognize image files without using a scanner (see the list of supported image formats under “Supported Image Formats”).
To open an image:
● Click the downward-pointing arrow to the right of the 1 Scan button and select the Open Image item in the local menu. An Open caption will replace the Scan caption on the button.
● Select the Open Image item from the File menu.
Acquiring Images from the Hot Folder (Corporate Edition only)
Multifunction devices, which combine a scanner, printer and copier into a single device, may be used with ABBYY FineReader to automatically acquire images. The folder monitoring feature directs the program to monitor a specified folder on a local disk, in a network, or on an FTP server for new items.
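The general idea behind folder monitoring can be sketched in a few lines of Python. This is a minimal illustration of polling a local folder for new image files, not ABBYY FineReader's implementation: the function names, the polling interval and the list of file extensions are all assumptions made for the example, and network or FTP locations are not covered.

```python
import time
from pathlib import Path

# Assumed set of image file extensions, chosen only for this illustration.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg", ".bmp", ".png", ".pdf"}


def new_images(folder, seen):
    """Return image files in `folder` that are not yet in `seen`, and remember them."""
    found = []
    for path in sorted(Path(folder).iterdir()):
        is_image = path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        if is_image and path.name not in seen:
            seen.add(path.name)
            found.append(path)
    return found


def watch(folder, handle, interval_seconds=5.0, max_polls=None):
    """Poll `folder` and pass each newly arrived image to `handle`."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in new_images(folder, seen):
            handle(path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval_seconds)
```

For example, `watch("scans", print, interval_seconds=10)` would print the path of every image that a device saves into a hypothetical `scans` folder, checking every ten seconds.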
Adding Business Card Images to a Batch
The most efficient way of inputting business cards is to fit as many cards as possible onto the scanner plate. After input, though, each card should be recognized as a separate page (particularly if deskewing has been done). You may choose either automatic or manual splitting tools to separate the business card image into individual cards.
Note: This process requires that the cards be arranged in a specific order.
2. Specify a number for the first scanned page in the Page number dialog, then select Odd and even separately in the Page numbering field. Select an order for the pages, ascending or descending, to reflect the way in which the double-sided pages have been entered into the automatic document feeder (i.e. whether the last page or the first page has been placed on top).
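The arithmetic behind numbering odd and even pages separately can be sketched as follows. This Python example is an illustration under assumptions, not a description of the program's internals: the function name page_numbers and its parameters are hypothetical, and it assumes that each scanning pass numbers its pages in steps of two, counting up or down from the first page number.

```python
def page_numbers(first_page, sheet_count, ascending=True):
    """Number one scanning pass of a double-sided stack in steps of two."""
    step = 2 if ascending else -2
    return [first_page + i * step for i in range(sheet_count)]


# Three double-sided sheets, that is, pages 1 to 6.
front_sides = page_numbers(1, 3, ascending=True)    # fronts scanned top to bottom
back_sides = page_numbers(6, 3, ascending=False)    # flipped stack: last page on top

print(front_sides, back_sides, sorted(front_sides + back_sides))
```

Sorting the combined numbers restores the reading order of the document, which is why the direction chosen for the second pass has to match the way the stack was placed in the feeder.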
Note: If you scan or open inverted images, select the Invert image item in the Image Preprocessing group on the Scan/Open Image tab (Tools>Options menu) prior to adding these images to the batch.
3. Rotate or Flip Image
Recognition quality relies on the image having a standard orientation (the text should be read from top to bottom and all lines should be horizontal). ABBYY FineReader automatically detects page orientation during the recognition stage.
7. Print image
You can print the image in the Image window, the pages selected in the Batch window, or all batch page images. To do this:
● Select Print image from the File menu. The Print dialog will open. Set the desired printing parameters (the printer to be used, number of pages to be printed, the number of copies etc.).
8. Undo the previous action
● Click the Undo button on the Standard toolbar.
Chapter 5 Page Layout Analysis
Before starting the recognition process, ABBYY FineReader must know which image areas it needs to recognize. To achieve this, the page layout analysis process identifies text blocks, picture blocks, table blocks, and barcode blocks. In this chapter you will learn more about: when manual page analysis is necessary; what block types are available; how to edit blocks drawn using automatic layout analysis; and how to streamline the layout analysis with block templates.
General Information on Page Layout Analysis
Page layout analysis can be done either automatically or manually. In most cases, ABBYY FineReader manages the complex task of analyzing page layout by itself. Start automatic analysis by clicking on the 2 Read button. Recognition and layout analysis are performed simultaneously.
Note: Stand-alone page layout analysis is also available (Process>Analyze Layout menu).
Table – this is used for table image areas or for areas of text that are structured in a table. When the application reads this type of block, it draws vertical and horizontal separators inside the block to form a table. This block is represented as a table in the output text. You can draw and edit tables manually.
Picture – this is used for image areas that contain pictures.
Available document types:
Auto detect layout – (set by default) Text layout is determined automatically. Recognition of all text types, including texts containing multiple columns or tables and pictures, is performed automatically.
Single column – The text is formatted into one column. Use this option if automatic page layout analysis incorrectly determines the text type as multi-column.
2. Use the No merged cells in table option if your table has no merged cells in it. For example:

    Temperature
    Degrees centigrade    Degrees Kelvin
    0                     273
    100                   373

– the Temperature cell is a merged cell
Note: Do not select One line of text per cell and/or No merged cells in table if the text contains tables with differing structures. Selecting these options may result in errors during layout analysis and may adversely affect recognition quality.
To create a new block:
1. Select one of the following tools:
– to draw a recognition area;
– to draw a text block;
– to draw a picture block;
– to draw a table block.
2. Position the mouse where you want a corner of your block to be. Hold down the left mouse button and drag the mouse pointer to the point where you want the opposite block corner to be.
3. Release the mouse button. A frame will enclose the image area selected.
To cut a rectangular block part:
1. Select the tool.
2. Click on the portion of the block you wish to cut. Press and hold down the left mouse button, then drag the mouse pointer diagonally. Select the desired area and release the button. The selected rectangle will be cut from the block.
3. If necessary, move the block border.
Note:
1. You can alter block borders by adding new nodes (splitting points).
To renumber blocks:
1. Select the tool.
2. Click the blocks in the desired order. The contents of blocks will be displayed in the output text in the same order.
Note: If you delete blocks on an image that has already been recognized, the recognized text in the Text window will also be deleted.
To delete a block:
● Select the tool and click the block you wish to delete, or
● Select the blocks you wish to delete and press DEL on the keyboard.
If the table cell only contains a picture, select Treat cell as picture in the Block Properties dialog (View>Properties menu). If the table cell contains both text and pictures, draw a separate picture block (or blocks) inside the cell.
To merge table cells or rows:
7. Select the Merge Table Cells or Merge Table Rows item in the Edit menu.
Note: You can split previously merged cells using the Split Table Cells command (Edit menu).
To load a block template:
1. Click the Batch Window and select the pages you want to apply the block template to.
2. Select the Load Blocks item in the Image menu. The Open Blocks dialog will open.
3. Select the relevant block template file in the dialog.
4. Click the appropriate item in the Apply to group. The All pages item applies the block template to all batch pages; the Selected pages item applies the block template only to selected pages.
5.
Chapter 6 Recognition
The goal of OCR is to read a text from a source image and retain the source page layout. To succeed, however, the main recognition parameters (recognition language, font type of the source text, and document type) must be identified. This chapter deals with these parameters and other important recognition issues, including the use of different recognition settings.
General Information on Recognition
Note: Always confirm that recognition options (including recognition language, print type of the source text, and document type) have been set correctly prior to beginning recognition.
You may:
1. Recognize a block or several blocks drawn on an image.
2. Recognize an open page or all pages selected in the Batch window.
3. Recognize all unrecognized batch pages.
4. Recognize all pages in background mode.
Recognition Languages
ABBYY FineReader recognizes documents written in one or several languages. When recognizing documents in English or in German, you may also use the corresponding specialized medical or legal dictionaries in addition to the general purpose ABBYY FineReader dictionaries.
To set the text recognition language, select it in the drop–down list on the Standard toolbar.
To recognize a multi–lingual document:
1.
the language list on the Standard toolbar. The Recognition Language dialog will open. Select the desired language.
3. The language was disabled during custom installation.
Note: Always use the folder that contains ABBYY FineReader.
Other Recognition Options
Show image during recognition
When processing large numbers of pages, recognition is invariably faster if the processed image is not displayed on–screen.
To run recognition without displaying the image:
● Clear the Show image during recognition item on the General tab (Tools>Options menu).
To stop background recognition:
● Select the Stop Background Recognition item in the Process menu.
Note: The background recognition mode uses the recognition options that were active when the process started.
Recognition with Training
As noted, ABBYY FineReader can read texts set in practically any font regardless of print quality. Consequently, no training is required prior to recognition.
To train a user pattern:
1. Start Train user pattern mode by clicking the Train user pattern radio button in the Training group on the Recognition tab (Tools>Options menu). The default pattern name (“Default”) will be displayed in the status line.
2. Click the 2–Read button.
3. Train your pattern by recognizing one or more pages in Train user pattern mode. Trained characters are saved in the default pattern.
How to Train a User Pattern
1. Make sure the Train user pattern radio button is enabled on the Recognition tab (Tools>Options menu) in the Training group.
2. Click 2 Read. ABBYY FineReader will start recognition. When the program encounters an unknown character, the Pattern Training dialog will open, and the character image will be displayed.
2. The system can be trained to retain character formatting. Select the corresponding Italic or Bold item in the Pattern Training dialog and then click the Train button.
3. Training is case sensitive. During training, make sure that you use upper or lower case characters as appropriate.
Correct mistakes made during training by clicking the Back button to return the frame to its previous position.
To edit a user pattern:
1. Select Pattern Editor in the Tools menu. The Pattern Editor dialog will open.
2. Select the desired pattern and click Edit in the dialog. The User Pattern dialog will open.
3. Select a character and click Properties to edit the character caption and set the correct typeface: italic, bold, subscript or superscript. You may also click the Delete button to remove incorrectly trained characters from the batch.
How to Create a New Language
To create a new recognition language:
1. Select the Language Editor item in the Tools menu.
2. Click New and select the Create a Copy of the Language radio button. Then select the appropriate source language.
3. The Simple Language Properties dialog will open. Set the following language parameters for the new language in the Simple Language Properties dialog:
    1. The new language name.
    2.
Note: The spelling checker will consider capitalization of words in the user dictionary to be correct if they are found in the text with any of the following capitalizations: dictionary set capitalization; lowercase only; uppercase only; sentence case capitalization (first letter capitalized, remaining letters lowercase).
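The capitalization rule in the note above can be expressed as a short check. The Python sketch below is an illustration that follows the four listed variants literally; the function name capitalization_accepted is hypothetical, and the spelling checker itself may apply further rules that this note does not mention.

```python
def capitalization_accepted(dictionary_word, text_word):
    """Check a word from the text against the four accepted capitalizations."""
    accepted = {
        dictionary_word,                                            # as set in the dictionary
        dictionary_word.lower(),                                    # lowercase only
        dictionary_word.upper(),                                    # uppercase only
        dictionary_word[:1].upper() + dictionary_word[1:].lower(),  # sentence case
    }
    return text_word in accepted


for candidate in ("FineReader", "finereader", "FINEREADER", "Finereader", "fineReader"):
    print(candidate, capitalization_accepted("FineReader", candidate))
```

With the dictionary entry “FineReader”, the first four candidates are accepted, while “fineReader” is not, because it matches none of the four variants.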
To create a recognition language group:
1. Select Language Editor in the Tools menu and click the New button. A dialog will open. Select the Create a new group of languages item in the dialog.
2. The Language Group Properties dialog will open. Set the following new language group parameters (all parameters are set in the Language Group Properties dialog):
    1. Group name.
    2. Languages contained in the group.
Note:
1.
Chapter 7 Checking and Editing Text
After recognition, the Text window will display the recognized text. The Text window is ABBYY FineReader’s built-in editor, and should be used to check recognition results and edit recognized text. The FineReader text editor has two distinctive features:
1. A built-in spell check system (see the list of languages with spell check support under “Supported Languages” in ABBYY FineReader Help).
2.
Checking Text in ABBYY FineReader
ABBYY FineReader highlights uncertainly recognized characters and words not found in the dictionary in different colors. Usually, light blue is used for uncertain characters and pink for words not found in the dictionary.
Note: You can enlarge the Check Spelling dialog to make it easier to check and edit text. Simply click the dialog border; the mouse pointer will become a double-headed arrow. Drag the border to make the dialog larger or smaller.
4. If words have been misspelled, you may:
● Click the Ignore button to leave the word unchanged.
● Click the Ignore All button to leave all misspelled words in the text unchanged.
Check and Edit Text Options
These options are set on the Check Spelling tab (Tools>Options menu).
● Error display level
Note: This option must be set before you start recognition.
Correct spaces before and after punctuation marks
This option will automatically correct spacing before or after punctuation marks without stopping the spell check.
Adding and Deleting Words to/from the User Dictionary
Adding words to the user dictionary
Enlarging the dictionary increases recognition quality. During recognition, ABBYY FineReader checks every word it encounters for possible dictionary entries.
3. The following languages support paradigms: Armenian (Eastern, Western, Grabar), English, Italian, French, German (Old and New spelling), Russian, Spanish and Ukrainian.
The program will notify you if you try to add a word that already exists in the dictionary. You may view its paradigm and construct a new one if you think the existing paradigm is incorrect (as with homonyms, for example). Click the Add button in the Add Word dialog.
Tip:
1.
After recognition, the page text is displayed in the Text window. When you send your text to an external application, the layout retention options determine how the text layout is retained. Set these options on the Formatting tab (Tools>Options menu) and in the format dialogs. The program automatically highlights uncertainly recognized characters.
The FineReader built–in editor is supplied with the following text editing features:
● Copy, cut, paste
● Search and replace
● Font effects
● Text alignment
● Undo and redo
Copy, cut, paste
1. Before you use the copy, cut, and paste commands, highlight the relevant text.
2.
To search and replace a word or phrase in the text you are editing:
1. Do one of the following:
● Select Replace in the Edit menu, or
● Press CTRL+H
2. The Replace dialog will open. Type the word or phrase you want to find in the Find what line of the dialog, type the word or phrase that is to replace it in the Replace with line, and set the search parameters.
Font effects
1.
Chapter 8 Saving into External Applications and Formats
You can choose to save recognition results to a file, export them to an external application without saving them, copy them to the Clipboard or e-mail them in any of the supported saving formats. You can save specific pages or all of the pages in the document. FineReader can export recognition results to the following applications: Microsoft Word 6.0, 7.0, 97 (8.0), 2000 (9.0), 2002 (10.0) and 2003 (11.0); Microsoft Excel 6.0, 7.0, 97 (8.0), 2000 (9.
General Information on Saving Recognized Text
You may:
● save recognized text using the Save Wizard,
● save opened or selected pages to a file or send them to an external application,
● save all the batch pages to file or export them into an external application,
● save page image.
Send recognition results to a certain application or save them to a file by clicking on 4 Save. The icon appearance changes to reflect the current saving mode.
Formatting and text layout retention modes (saving in RTF/DOC/Word XML, PPT and HTML formats)
● Retain full page layout – Document layout is fully retained, including paragraph arrangement, font type and font size, columns, text direction, text color, and table structure.
● Retain font and font size – Table structure, paragraph arrangement, font type, and font size are all retained.
The JPEG format compresses the image with a lossy ("quality loss") algorithm: roughly speaking, the compression technology averages groups of pixels and stores each region as a single number rather than storing a number for every pixel. The quality of the image is determined by the value specified in the JPEG quality field (Tools>Formats Settings; PDF, RTF/DOC/Word XML, PPT and HTML tabs).
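The simplified description above (averaging groups of pixels and storing one number per region) can be illustrated with a short Python sketch. It is only an illustration of that idea: the real JPEG algorithm uses a discrete cosine transform with quantization, and nothing here reflects how ABBYY FineReader is implemented. The function name `average_blocks` and the `block` parameter are hypothetical.

```python
def average_blocks(pixels, block):
    """Replace each block x block region of a grayscale image with its mean.

    pixels: list of equal-length rows of integer gray values (0-255).
    block:  region size (hypothetical parameter); larger blocks lose more detail.
    Returns (means, restored): one number per region, and the image rebuilt
    from those numbers.
    """
    height, width = len(pixels), len(pixels[0])
    means = []
    for top in range(0, height, block):
        row_means = []
        for left in range(0, width, block):
            # Collect every pixel that falls inside the current region.
            region = [pixels[y][x]
                      for y in range(top, min(top + block, height))
                      for x in range(left, min(left + block, width))]
            row_means.append(round(sum(region) / len(region)))
        means.append(row_means)
    # Rebuild a full-size image in which every pixel takes its region's mean.
    restored = [[means[y // block][x // block] for x in range(width)]
                for y in range(height)]
    return means, restored


image = [
    [10, 12, 200, 202],
    [14, 16, 204, 206],
    [90, 90, 30, 34],
    [90, 90, 38, 42],
]
means, restored = average_blocks(image, 2)
print(means)     # four numbers stored instead of sixteen
print(restored)  # rebuilt image: fine detail inside each region is lost
```

Storing four region values instead of sixteen pixel values is where the size reduction comes from, and the lost detail inside each region is the "quality loss".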
● Create a new file at each blank page – This option treats the entire batch as a set of page groups, each of which ends with a blank page. Pages from different groups are saved into different files, with file names consisting of the user-specified name and an index number: 1, 2, 3, etc.
● Create a single file for all pages – All (or all selected) batch pages are saved as a single file.
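The Create a new file at each blank page option can be pictured with the following minimal Python sketch. It is not FineReader's implementation. The guide states only that each group ends with a blank page and that file names combine the user-specified name with an index (1, 2, 3, etc.); the `is_blank` flag, the function name `group_pages_by_blank`, the exact file-name pattern, and the decision to leave the blank separator pages out of the output are assumptions made for illustration.

```python
def group_pages_by_blank(pages, base_name, extension="rtf"):
    """Split a batch into output files, closing a group at every blank page.

    pages: list of (page_number, is_blank) tuples in batch order.
    Returns a dict mapping file name -> list of page numbers in that file.
    """
    files = {}
    current = []
    index = 1
    for number, is_blank in pages:
        if is_blank:
            # A blank page ends the current group (assumed not saved itself).
            if current:
                files[f"{base_name}{index}.{extension}"] = current
                index += 1
                current = []
        else:
            current.append(number)
    if current:
        # Trailing pages without a closing blank page still form a group.
        files[f"{base_name}{index}.{extension}"] = current
    return files


batch = [(1, False), (2, False), (3, True), (4, False), (5, True), (6, False)]
result = group_pages_by_blank(batch, "report")
for name, page_numbers in result.items():
    print(name, page_numbers)
```

With the sample batch, pages 3 and 5 act as separators, so the remaining pages fall into three groups.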
3. Text over the page image – This option saves the entire image as a picture and saves text areas over the picture.
4. Text under the page image – This option saves the entire image as a picture and puts recognized text under it. This option is useful if you export your text to document archives: the full page layout is retained and full-text search is available in this mode.

To set these options:
1. Select Formats Settings in the Tools menu.
To retain pictures in an HTML file:
● Set the Keep pictures option on the Formatting tab in the Options dialog (Tools>Options menu).

Note: Each picture is saved into a separate *.jpg file. Set the resolution and quality of the images on the HTML tab of the Formats dialog (Tools>Formats).

HTML formats available:
1. Full (uses CSS and requires Internet Explorer 4.
Saving the Recognized Text in PPT Format

Set layout retention modes on the Formatting tab in the Options dialog (Tools>Options menu).

Note: When you save text in the HTML format, ABBYY FineReader uses either the fonts specified on the Formatting tab in the Options dialog (Tools>Options menu) or those set during text editing in the Text window.

Important! When saving results in the .
Chapter 9
Working with Batches

Batches are the main data repository in ABBYY FineReader. A batch contains scanned images, recognized text and other data. Most ABBYY FineReader settings are batch settings: scanning, recognition, saving options, etc. User patterns, user languages and user language groups also belong to a batch. New batches can be assigned the default batch settings, the settings of the current batch, or settings saved in an *.fbt file.
General Information on Working with Batches

ABBYY FineReader automatically creates a new batch upon starting. A batch may contain up to 9,999 pages.

Tip: Keeping pages of a similar type (e.g. pages from the same book, pages written in the same language, or pages with a similar layout) in the same batch is often useful, since it streamlines the work process.

The Batch window displays a list of the pages contained in the open batch.
You may select several individual pages, a range of consecutive pages, or all of the batch pages:
● To select a number of consecutive pages, hold down the SHIFT key and click the first and then the last page of the group you want to select.
● To select several pages, hold down the CTRL key and click the desired pages.
● To select all batch pages, activate the Batch window and choose the Select All item in the Edit menu or press CTRL+A.
Adding Images to a Batch
● Select the Open Image item in the File menu or press CTRL+O.
● Select the desired image(s) in the Open Image dialog. ABBYY FineReader will add the image to the open batch and copy the image to the batch folder.

Note: You can also add images directly from Windows Explorer:
1. Select an image file or group of files in Windows Explorer.
2. Right-click the selection and select Open with FineReader from the local menu.
Note:
1. To renumber all batch pages, select the All Pages item in the Renumber Pages dialog.
2. To renumber only part of a batch:
● Select the pages you wish to renumber in the Batch window.
● Select the Selected pages item in the Renumber Pages dialog.
3. To renumber selected pages continuously, select the Continuous page numbering option.
Deleting a Batch

Note: When a batch is deleted, all of its contents (including image and text pages, related files, user patterns, user languages, etc.) will be deleted, leaving the folder empty.

To delete a batch:
● Select Delete Batch in the Batch menu.

To delete a batch page:
1. Select the page(s) you wish to delete in the Batch window.
2. Select Delete Page in the Batch menu or just press DEL.
Full-Text Search in Recognized Batch Pages

Important! You need Internet Explorer 4.0 or later to use this option.

ABBYY FineReader allows you to search all recognized pages for words in every possible grammatical form. The search pattern may consist of one or several words.
Chapter 10
Network Document Processing

ABBYY FineReader Corporate Edition is designed specifically for network document processing. You must install a separate copy of ABBYY FineReader on each computer involved in network processing.
3. Group work with customized dictionaries for languages with dictionary support

ABBYY FineReader provides built-in dictionaries for languages with dictionary support. These dictionaries contain the most commonly encountered words, but may not include proper names, specialized technical terms, acronyms, etc.
Working with the Same Batch over a Network
(Available only in ABBYY FineReader 7.0 Corporate Edition)

1. Create or open a batch and set up the necessary scanning and recognition options. Run ABBYY FineReader and open the batch to be processed on all of the involved computers.
2. Run Background recognition (Process>Start background recognition) on all computers involved in recognizing the batch.
3.
Note: The gain in recognition speed from this approach is most noticeable in batches that contain many pages.

Group Work with the Same User Languages and Dictionaries
(Available only in FineReader 7.0 Corporate Edition)

Create a batch and set up the necessary scanning and recognition options. All of the user languages and dictionaries will be stored in one folder. The default is the batch folder.
Note:
1. If you use a folder in which the dictionaries of multiple users are stored, you must give read/write access to all users who access a dictionary.
2. User languages that are shared between multiple users are available as "read only" files (the parameters of a user language that has already been created cannot be changed). Entries may be added to or removed from the user dictionary.
Appendix
Hot Keys and Glossary

Chapter Contents:
● Hot Keys
● Glossary
Menu      Command                                         Shortcut Key
Process   Scan and read an image                          Ctrl+D
          Open and read an image                          Ctrl+Shift+D
          Start Scan&Read Wizard                          Ctrl+W
          Analyze layout                                  Ctrl+E
          Analyze layout on all batch pages               Ctrl+Shift+E
          Read active or selected pages                   Ctrl+R
          Read all batch pages                            Ctrl+Shift+R
          Read active or selected blocks                  Ctrl+Shift+B
Tools     Spell the recognized text
          Move to the next error or uncertain word.
          Move to the previous error or uncertain word.
Glossary

A

Abbreviation
A shortened form of a word or phrase used to represent the whole. For example, MS-DOS (for Microsoft Disk Operating System), UN (for United Nations), etc.

B

Background recognition
A special recognition mode that allows the user to edit and save already recognized pages while ABBYY FineReader recognizes other pages at the same time.
C

Code page
A table that sets the interrelation between the character codes and the characters themselves. Users can select the characters they need from the set found in the code page.

I

Ignored characters
Any non-letter characters found in words (e.g. syllable characters or stress marks). These characters are ignored during the spell check.
O

Omnifont system
A recognition system that recognizes characters set in any font and font size without prior training.

Open&Read
A command that processes an image file: it opens the file, analyzes the page layout, and recognizes it.

Optional hyphen
A hyphen that indicates exactly where a word or word combination should be split if it occurs at the end of a line (e.g. "autoformat" should be split as "auto-format").
Scan&Read Wizard
Runs a special Scan&Read mode. ABBYY FineReader guides you through document processing and provides advice on getting the best results.

Source Text Print Type
A parameter reflecting how the source text was printed (on a laser printer or equivalent, on a matrix printer in draft mode, or on a typewriter).