Readiris™ Corporate 12 User Guide
Table of Contents
Copyrights
Chapter 1 Introducing Readiris
    Save time, avoid retyping
    The Readiris series
Chapter 2 Installing Readiris
    System requirements

    Opening image files
    Scanning paper documents
Chapter 6 Adjusting scanned documents
Chapter 7 Saving documents as image files
Chapter 8 Windowing documents
    Windowing documents automatically
    Windowing documents manually

    Selecting the PDF options
    iHQC compressing PDF documents
    Password protecting PDF documents
    Digitally signing PDF documents
    Repurposing PDF documents
    Creating XPS documents
    Selecting the XPS options

Chapter 17 Recognizing barcodes
Chapter 18 Recognizing business cards
Index
Copyrights
ReadirisCorporate12-dgi-110209-04
Copyrights © 1987–2009 I.R.I.S. All Rights Reserved.
I.R.I.S. owns the copyrights to the Readiris software, to the online help system and to this publication.
The information contained in this document is the property of I.R.I.S. Its content is subject to change without notice and does not represent a commitment on the part of I.R.I.S.
CHAPTER 1 INTRODUCING READIRIS

SAVE TIME, AVOID RETYPING
Congratulations on acquiring Readiris. This software package will undoubtedly be of great help in recapturing your texts, tables, graphics, barcodes and handprinted text.
As efficient as computers are, you have to key in your information first. If you have ever retyped a 15-page report or a large table of figures, you know how tedious and time-consuming it can be.
spreadsheet, archive them as PDF or XPS files, etc. To recognize faxes and convert PDF documents, drag their image files from Windows Explorer to the Readiris application window. Or send an image promptly to Readiris via the context menu. Readiris recognizes tabular data and recreates them as worksheets in your spreadsheet software or as table objects inside your word processor; your numeric data are immediately ready for further processing.
solutions you confirm are memorized, increasing the system speed and confidence and rendering the system more intelligent as you go along. This powerful learning tool also allows you to train Readiris on special characters such as mathematical symbols and dingbats and to handle distorted fonts. To increase your productivity further, Readiris not only recognizes your texts, but can format them for you as well. Various levels of formatting are available.
states, telephone and fax numbers, etc. The resulting data can be sent directly to your contact management software such as Microsoft Outlook (Express) or any vCard-compliant application. Readiris is TWAIN-compliant and supports a wide range of flatbed and sheetfed scanners, “all-in-one” devices or “MFPs” (“multifunctional peripherals”) and digital cameras.
THE READIRIS SERIES
The table below gives an overview of the available versions:

Readiris Home 12
    Limited features
    25 recognition languages
    Supports PDF, DCX, DJV, DJVU, JPG, JPEG, J2C, J2K, JP2, PNG, TIF, TIFF, BMP, PCX images
    Generates PDF Image-Text, DOCX, ODT, WordML, SpreadsheetML, RTF, HTM, XML, TXT, TIFF, etc.
Readiris Pro 12 Asian
    Basic features
    128 recognition languages
    Supports PDF, DCX, DJV, DJVU, JPG, JPEG, J2C, J2K, JP2, PNG, TIF, TIFF, BMP, PCX.

Readiris Corporate 12 Asian
    Basic features
    128 recognition languages
    Supports PDF, DCX, DJV, DJVU, JPG, JPEG, J2C, J2K, JP2, PNG, TIF, TIFF, BMP, PCX.
(Table continued; both columns end the list of supported formats with BMP, PCX.)
    Left column: Generates four types of PDF files, PDF-iHQC (level I), four types of XPS, XPS-iHQC (level I), DOCX, ODT, XLS, WordML, SpreadsheetML, RTF, HTM, XML, TXT, TIFF, etc.
    Right column: Generates four types of PDF files, PDF-iHQC (level I-III), PDF/A, four types of XPS, XPS-iHQC (level I), DOCX, ODT, XLS, WordML, SpreadsheetML, RTF, HTM, XML, TXT, TIFF, etc.
CHAPTER 2 INSTALLING READIRIS

SYSTEM REQUIREMENTS
This is the minimal system configuration required to use Readiris:
A 486-based Intel PC or compatible. A Pentium-based PC is recommended.
256 MB RAM.
120 MB free disk space. (105 MB of disk space suffices when you do not install the sample files.)
The Windows Vista, Windows XP or Windows 2000 operating system.
Note: Readiris Corporate is optimised to use a screen resolution of at least 1,024 x 768 pixels.
SOFTWARE INSTALLATION
How to install Readiris:
Log on to Windows as administrator or make sure you have the necessary administration rights.
Connect your scanner to your PC and install the corresponding software.
Test your scanner. If you experience any problem, contact your scanner manufacturer.
Insert the Readiris CD-ROM in the CD-ROM drive and follow the on-screen instructions to install the software.
Repeat the installation process to install any additional software from the CD-ROM.

UNINSTALLING THE SOFTWARE
There is only one correct way to uninstall Readiris: by using the Windows (un)install wizard. We strongly recommend that you do not uninstall Readiris or any of its software modules by manually erasing the program files.
To uninstall Readiris:
Close the application.
On the Start menu, click Control Panel.
be entitled to product support;
be entitled to special offers on I.R.I.S. products.
To register:
Use the Registration wizard on the Register menu.
Follow the on-screen instructions.

PRODUCT SUPPORT
Once you have registered your product, you are entitled to product support from I.R.I.S. on all basic software functionalities.
Contact I.R.I.S. at:
Europe: support.pro@irislink.com Tel.: +32 10 45 13 64
USA: support.pro@irisusa.com Tel.: +1 800 447 4744
Asia-Pacific: support.
CHAPTER 3 GETTING STARTED

RUNNING READIRIS
To run Readiris:
Start Readiris from the Windows Start menu or double-click the shortcut on your desktop.
If you acquired Readiris Corporate, you will be prompted to register.
Click anywhere in the startup screen to launch Readiris. The OCR Wizard opens automatically.

USING THE OCR WIZARD
The OCR Wizard allows you to quickly define all the settings needed to operate Readiris.
Step 1
Select the type of document you want to recognize. Readiris recognizes text pages, business cards and multiple business cards in a single scan. For more information, see the section Selecting the document type.
Click Next to go to the next step.
Step 2
Select the image source. You can capture images using your scanner or open image files.
Select the rotation and deskewing options you want to use. For more information, see the section Selecting the options.
For more information, see the section Selecting the document language.
Click OK to save the settings.
Click Next to go to the next step.
Step 5
Click the Change button to change the output format or target application. The default target application is Microsoft Word.
Select the required output format or application in the Send to or External file list.
Click the various tabs and select the options of your choice.
The Readiris interface is composed of:
the SmartTasks (in the middle)
The SmartTasks are predefined commands that allow you to use the most frequent Readiris functions at the touch of a button. Use the SmartTasks to scan, recognize and send your documents to the target application or output format of your choice. The SmartTasks apply default settings, but you can right-click a SmartTask to configure it for more specific needs.
Use the image toolbar buttons to edit documents in the Readiris interface. Point to the different buttons to display their tooltips.
When a document has been opened or scanned in Readiris, three main zones are added to the interface:
the page toolbar (right of the main toolbar)
The page toolbar displays the page thumbnails, which show settings information when you point to them.
CHANGING THE USER INTERFACE LANGUAGE
The user interface of Readiris is available in a wide range of languages.
To change the user interface language:
On the Settings menu, click User Interface Language.
In the Language list, select the required language, then click OK to confirm.
Note: If you selected an incorrect language, press Ctrl+U. The Language dialog box will open and you will be able to select another language in the list.
CHAPTER 4 THE READIRIS SMARTTASKS
When starting Readiris, click anywhere in the Readiris startup screen and click Cancel when the OCR Wizard launches. The Readiris SmartTasks will be displayed.
The SmartTasks are predefined commands that allow you to use the most frequent Readiris functions at the touch of a button.
1. Scan and recognize documents and send them directly to Word for text processing; Microsoft Word is the default target application. See the section Formatting text documents to learn more about the other available applications.
2. Scan and recognize documents and send them directly to OpenOffice for text processing; OpenOffice.org Writer is the default target application. See the section Formatting text documents to learn more about the other available applications.
7. Scan and recognize business cards. The documents will be sent in the vCard format by default. See the section Recognizing business cards to learn more about the other available formats.
8. Scan and recognize document batches and apply document separation and indexing options. TIFF is the default output format. See the sections Separating document batches and Indexing document batches for more information.
Click Configure if applicable to select the Twain source. Then click OK to save the settings. For more information on the scanner settings and on scanning paper documents, see the section Scanning paper documents.
o When you select Image files and click the SmartTask, Readiris opens the Input dialog box in which you can select the image files you want to process. For more information on opening image files, see the section Opening image files.
For more information on the above-mentioned settings, see the sections Selecting the options, Scanning paper documents and Selecting the document language.
Finally, click the SmartTask to use it. Readiris will go through the entire recognition process automatically.
CHAPTER 5 SCANNING DOCUMENTS

SELECTING THE DOCUMENT TYPE
Before scanning documents or opening image files in Readiris Corporate, you must select the document type. Readiris can process Text Pages, Business Cards and Multiple Business Cards in a Single Scan.
Operation
Click the Document Type button on the main toolbar and select the document type.
SELECTING THE OPTIONS
Before scanning paper documents or opening image files, you can select several image enhancement options. When enabled, these options will be applied during the opening and scanning of documents.
Operation
Click the Options button on the main toolbar to select several image enhancement options.
o Click Page Deskewing to straighten pages scanned at an angle.
You can also use the windowing tools on the image toolbar to modify the page analysis results or to window documents manually. For more information, see the chapter Windowing documents.
When you are done defining all the settings (Scanner settings, Document type, Options), click the Scan or Open button to scan documents or open image files.
Note that the above-mentioned options are also available on the Settings menu.
Tip: you can also drag image files to the Readiris image window to open them.
Tip: right-click any image file you want to open, point to Open With and click IOCR application. The Readiris software will open and display the image.
Tip: when loading multipage image files (TIFF images and DCX faxes) and PDF documents, you can define the page range (in case you only need a certain chapter of a document, for instance). To do so, click Open on the main toolbar.
Avoid selecting this option when opening very low-quality images, however.
Readiris supports the following graphic formats:
Select the image file of your choice and click Open.
Note: the options of the Input dialog box also apply to document scanning and are discussed in the Scanning paper documents section.
When you process paper documents, Readiris will start your scanner as soon as you click the Scan button and display the scanned document in the interface.
To scan documents:
Click the Scanner button to set the scanner settings. Note that several of the options in the Scanner dialog box are also available in the Open dialog box.
Select the correct scanner model. If your scanner is not in the list, select Twain other models and click OK.
Format and Resolution
Readiris supports a wide range of paper formats and resolutions. A scan resolution of 300 dpi is recommended. Use a resolution of 400 dpi when recognizing business cards, Asian text or very small print.
Color mode
Readiris can scan documents and open image files in color, black-and-white and grayscale.
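The resolution guidance under Format and Resolution can be summarized as a small lookup. The following Python sketch is illustrative only: the function name and the document labels are hypothetical and are not part of Readiris.

```python
def recommended_dpi(document_kind: str) -> int:
    """Return the scan resolution suggested in this guide for a kind of document.

    The labels below are hypothetical; the guide names the cases
    (business cards, Asian text, very small print) but not these identifiers.
    """
    needs_fine_detail = {"business_card", "asian_text", "small_print"}
    # 400 dpi for fine detail, 300 dpi for everything else.
    return 400 if document_kind in needs_fine_detail else 300


if __name__ == "__main__":
    for kind in ("text_page", "business_card", "small_print"):
        print(kind, recommended_dpi(kind))
```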
Note that this option never increases the resolution of images scanned with too little detail.
Scanning multipage documents
When scanning multipage documents and using a scanner equipped with a document feeder, select the ADF (automatic document feeder) option. Place the pages you want to scan in the feeder and start scanning.
Using a digital camera
Select Digital camera when you are using a camera as scan source. Readiris uses special recognition routines to process digital camera images.
Tips for using a digital camera as scan source:
Calibrate the camera by photographing a white document.
Always select the highest image resolution.
Enable the macro mode of the camera to take close-ups.
Only use optical zoom, not digital zoom.
Hold the camera directly above the document.
High-speed and duplex scanning
When using a duplex scanner, the Duplex scanning option will be available. Select it to recognize front and back pages.
Fast Binarization
Make sure to select Fast Binarization when you are using a high-speed scanner. This option increases the processing speed considerably. Avoid selecting this option when scanning very low-quality documents, however.
CHAPTER 6 ADJUSTING SCANNED DOCUMENTS
When opening or scanning extremely light or extremely dark grayscale and color images, it may be necessary to adjust those images before executing the recognition, in order to obtain satisfactory OCR results.
To adjust images:
Open or scan a color or grayscale document. Make sure that the scanner settings are correct.
o Select Smoothen color image to even out the image. This option renders grayscale and color images more homogeneous by smoothening out differences in intensity. As a result, a stronger contrast is created between the foreground (text) and background (artwork).
Note: sometimes smoothening is the only way to separate text from a colored background.
o Use the slider to increase or decrease the Brightness. The Brightness settings determine the overall brightness of the image. Use these settings to darken or lighten the image when the text is illegible.
Example 1: lighten a dark image to eliminate the page background.
(Color image)
(Binarized image.
(Binarized image. The default brightness settings yield fragmented characters)
(The darkened image yields satisfactory recognition results)
o Use the slider to increase or decrease the Contrast. The Contrast settings determine the contrast between darker and lighter zones of an image. Use these settings to make character shapes stand out against a colored background.
Despeckling removes small spots from black-and-white images.
Click Apply to preview the results. If the results are satisfactory, click OK. If not, change the settings again.
Click Recognize + Save to recognize the document.
CHAPTER 7 SAVING DOCUMENTS AS IMAGE FILES
Paper documents you scan do not need to be OCRed right away. They can be saved as image files.
To do so:
Scan the document.
On the File menu, click Save Full Page as Image or Save All Pages as Image.
Afterwards, open the saved image file and perform the recognition.
Saving graphics only
You can also choose to save the graphics windows without the text of the document.
To do so:
Scan or open the document.
CHAPTER 8 WINDOWING DOCUMENTS

WINDOWING DOCUMENTS AUTOMATICALLY
When scanning or opening documents, Readiris automatically applies Page Analysis to split the documents into different windows.
The Page Analysis option is selected by default. To avoid automatic page analysis, click the Options button and disable Page Analysis.
The page analysis results can be modified manually after automatic page analysis.
Page analysis detects text, graphic and table zones automatically. Barcode zones and handprinted zones need to be drawn manually. For more information, see the section Windowing documents manually.
Each window type has its own color code: text windows are orange, graphics are purple and table windows pink. Barcode zones are green and handprinted zones blue.
The windows are sorted top-down, left to right. Numbers indicate the sort order of the windows.
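The color codes and the top-down, left-to-right numbering can be illustrated with a short Python sketch. It is a simplified model, not Readiris code: the `Window` type and function name are hypothetical, and sorting strictly by the top edge and then the left edge is an assumption, since the guide does not describe how windows on nearly the same row are ordered.

```python
from typing import NamedTuple

# Color codes as listed in the text.
WINDOW_COLORS = {
    "text": "orange",
    "graphic": "purple",
    "table": "pink",
    "barcode": "green",
    "handprinted": "blue",
}


class Window(NamedTuple):
    kind: str   # one of the keys of WINDOW_COLORS
    left: int   # x coordinate of the left edge, in pixels
    top: int    # y coordinate of the top edge, in pixels


def number_windows(windows):
    """Return (number, window) pairs, numbered top-down, then left to right."""
    # Assumption: a plain sort on (top, left) stands in for the real ordering.
    ordered = sorted(windows, key=lambda w: (w.top, w.left))
    return list(enumerate(ordered, start=1))


if __name__ == "__main__":
    page = [
        Window("table", 50, 400),
        Window("text", 300, 100),
        Window("text", 50, 100),
        Window("graphic", 50, 250),
    ]
    for number, window in number_windows(page):
        print(number, window.kind, WINDOW_COLORS[window.kind])
```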
The part of the page you select will be analyzed automatically.
You will be asked whether you want to exclude the same outer zone from page analysis on every page of the document.

WINDOWING DOCUMENTS MANUALLY
Besides windowing documents automatically by means of Page Analysis, Readiris allows you to window documents manually. Manual windowing comes in handy when you need to modify the automatic page analysis results.
Draw a frame around the text blocks, graphics, tables, barcodes and handprinting zones you want to window. For more information on recognizing barcodes and handprinting, see the sections Recognizing barcodes and Recognizing handprinted text, respectively.
When you are done windowing the document, click the Recognize + Save button to execute the OCR.
ones. Whenever two windows of the same type intersect, they become a polygon automatically.
Automatic page analysis
Should the current page be too complex to window manually, click the Analyze page button on the image toolbar to window the page automatically. Note that barcode zones and handprinted zones always need to be drawn manually.
Right-click any of the selected windows, point to Window, then to Type and then click the required window type.
Modifying the window size
Click the window you want to modify.
Place the mouse pointer over a marker (on the sides and in the corners of the window).
Click the marker and drag the mouse to modify the window size.
Moving windows
Select the window you want to move.
Click inside the window and drag the mouse to modify the position of the window.
or
Right-click the selected windows, point to Window, then click Delete.
Deleting small windows
Some documents, faxes for instance, often have "stray" dots on pages, causing Readiris to create superfluous windows that do not contain text.
To erase all small windows, click Delete Small Windows on the Edit menu. This option erases all windows smaller than 0.5" and re-sorts the remaining zones.
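As a rough illustration of what Delete Small Windows does, the Python sketch below filters out small windows and re-sorts the rest. Several details are assumptions: the guide does not say whether "smaller than 0.5 inch" applies to the width, the height or both (the sketch treats a window as small only when both are below the threshold), the re-sort uses the top-down, left-to-right order described earlier, and the `Window` type and function name are hypothetical.

```python
from typing import NamedTuple


class Window(NamedTuple):
    left: float    # inches from the left page edge
    top: float     # inches from the top page edge
    width: float   # inches
    height: float  # inches


def delete_small_windows(windows, min_size=0.5):
    """Drop windows below min_size inches and re-sort the remaining ones."""
    # Assumption: a window counts as small when BOTH sides are below min_size.
    kept = [w for w in windows if w.width >= min_size or w.height >= min_size]
    # Re-sort top-down, then left to right.
    return sorted(kept, key=lambda w: (w.top, w.left))


if __name__ == "__main__":
    page = [
        Window(1.0, 3.0, 4.0, 1.0),   # paragraph
        Window(0.2, 0.2, 0.1, 0.1),   # stray dot from a fax
        Window(1.0, 1.0, 4.0, 0.3),   # single line of text
    ]
    print(delete_small_windows(page))
```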
On the File menu, click the command Load Layout.
Select the layout file you saved.
To apply the layout to all opened or scanned pages, select Apply Layout to All Pages in the Layout file dialog box.
Click Open to load the layout file.
Note that when you add a document to Readiris, the layout file must be loaded again as page analysis is enabled by default.
Ignore exterior zone
As an alternative to windowing templates, you can use the option Ignore exterior zone.
Click Recognize + Save to execute the OCR.
CHAPTER 9 USER INDEXING
Before you recognize and save documents with Readiris, you can create a user index for each document. Readiris user indexes allow you to sort output files efficiently by subfolder, file name, subject and keywords.
To create a user index:
Scan the documents or open the image files you want to OCR with Readiris.
Click the User index button on the main toolbar. The user indexing options are displayed.
Click Browse to select the required output folder.
Click in the index field you want to use (subfolder, file name, subject or keywords).
Then draw a frame around the text you want to use as index item. The text will be OCRed on the fly and inserted in the index field.
Click OK to exit the user index settings and click Recognize + Save to recognize your documents. The documents will be saved in the (sub)folder and under the file name you specified.
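How the output folder, subfolder and file name index fields might combine into an output location can be sketched in Python as follows. This is only an illustration of the idea: the function name, the `extension` parameter and its default value are hypothetical, and the guide does not document how Readiris builds the path internally.

```python
from pathlib import Path


def indexed_output_path(output_folder, subfolder, file_name, extension=".pdf"):
    """Combine user index fields into an output file path.

    `extension` is a hypothetical parameter; in practice the extension
    depends on the output format selected for the document.
    """
    return Path(output_folder) / subfolder / f"{file_name}{extension}"


if __name__ == "__main__":
    # Index items as they might be OCRed on the fly from the page.
    print(indexed_output_path("output", "invoices", "inv-001"))
```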
CHAPTER 10 RECOGNIZING DOCUMENTS

INTRODUCTION
To recognize documents, Readiris applies linguistics during the recognition phase. As a result, Readiris recognizes text, tables, graphics, barcodes and handprinted text in all kinds of documents. Readiris even copes with complex columnized documents, low-quality documents, faxes, dot matrix printouts, badly scanned and copied documents containing too light or dark font shapes, etc.
allows you to increase the system's accuracy. All solutions you confirm are memorized temporarily during recognition, increasing the system speed and confidence and rendering the system more intelligent as you go along. This powerful learning tool also allows you to train Readiris on special characters such as mathematical symbols and dingbats and to handle distorted fonts. The interactive learning results can also be stored permanently in font dictionaries for future use.
The 5 most recently selected languages are moved to the top of the language list.
Important: select the document language before executing page analysis when you are dealing with Asian, Hebrew and Arabic documents. Specific page analysis routines are used for these documents.
The recognition can also be limited to a numeric character set to optimally recognize tables and figures.
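The behavior of the language list, where the 5 most recently selected languages move to the top, resembles a most-recently-used list. The Python sketch below shows one way to model it. The function and parameter names are hypothetical, and the order of the recent languages among themselves (most recent first) is an assumption the guide does not confirm.

```python
def order_language_list(all_languages, recent_selections, max_recent=5):
    """Place the most recently selected languages at the top of the list.

    recent_selections lists earlier choices, most recent first
    (an assumed convention for this sketch).
    """
    top = []
    for language in recent_selections:
        if language in all_languages and language not in top:
            top.append(language)
        if len(top) == max_recent:
            break
    # The remaining languages keep their original order.
    rest = [language for language in all_languages if language not in top]
    return top + rest


if __name__ == "__main__":
    languages = ["Dutch", "English", "French", "German", "Italian", "Spanish", "Swedish"]
    print(order_language_list(languages, ["German", "English", "German", "Swedish"]))
```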
Recognizing documents with mixed languages
Readiris also allows you to enable mixed character sets. That way Readiris switches languages in the middle of a sentence automatically and recognizes English words (proper names etc.) that occur in "exotic" languages.
Click the globe button on the main toolbar and select the required language combination in the language drop-down list.
Note: when processing Asian or Hebrew documents, mixed character sets are used automatically.
Tip: favor accuracy over speed when the image quality is rather poor.

USING USER LEXICONS
During recognition, Readiris is assisted by linguistic databases to recognize text correctly. These linguistic databases are standard lexicons and are available for every supported language.
As powerful as these standard lexicons may be, the recognition accuracy can still be boosted by using customized user lexicons.
Insert the words you want Readiris to recognize and click the Add button. You can also copy-paste text segments from other files and import and edit existing text files.
Tip: importing company documents or word lists may be the fastest way to create a user lexicon containing company-specific terminology.
The terms you enter are sorted alphabetically. Duplicate words are rejected automatically.
Click Save to save the .txt file in the folder of your choice.
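The two lexicon rules stated above, alphabetical sorting and rejection of duplicates, can be sketched in a few lines of Python. This is an illustration rather than the Readiris implementation: the function name is hypothetical, and both the case-insensitive sort order and the exact-match duplicate test are assumptions.

```python
def build_user_lexicon(words):
    """Return lexicon entries sorted alphabetically with duplicates removed."""
    # Strip surrounding whitespace and ignore empty entries.
    unique = {word.strip() for word in words if word.strip()}
    # Assumption: sort case-insensitively; the second key keeps the order stable.
    return sorted(unique, key=lambda word: (word.casefold(), word))


if __name__ == "__main__":
    entries = build_user_lexicon(["Readiris", "iHQC", "Readiris", " vCard "])
    # One entry per line, as it might be written to a .txt lexicon file.
    print("\n".join(entries))
```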
All punctuation symbols and special characters at the beginning and end of words are filtered automatically.
Hyphens inside words are maintained.
E.g. Notre-Dame-de-Paris stays Notre-Dame-de-Paris
Tip: watch out for hyphenation at the end of a line when you import text files or copy-paste words that cover two lines.
Numbers are rejected. Digits, however, can occur inside product names and are included.
E.g.
To select the font type:
On the Settings menu, point to Font type.
The font type is set to Automatic by default. That way, Readiris recognizes "25 pin" or "NLQ" (Near Letter Quality) dot matrix, or other "normal" printing.
To recognize only dot matrix printed documents, click Dot matrix. Readiris will recognize so-called "draft" or "9 pin" dot matrix printed documents.
Character pitch
The character pitch is the number of characters per inch in a typeface.
Important: these document characteristics do not apply to Asian, Hebrew or Arabic documents.

USING INTERACTIVE LEARNING
Readiris offers an interactive learning function. By means of Interactive learning you can train the recognition system on fonts and character shapes, and correct the OCR results if necessary.
If the results are correct:
o Click the Learn button to save the result as sure.
The learning results are temporarily stored in the computer memory, for the duration of the recognition. Readiris will no longer display the learned characters when OCRing the rest of the document.
When a new document is OCRed, the learning results are erased. To save learning results permanently, use a font dictionary. For more information, see the section Using font dictionaries.
Use this button to prevent document noise from appearing in the output file.
o Click Undo to correct mistakes. Readiris keeps track of the last 32 operations.
o Click Abort to abort interactive learning. All learning results will be deleted. Next time you click Recognize + Save, interactive learning will start again.
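An undo history limited to the last 32 operations, as mentioned for the Undo button, is commonly modeled as a bounded stack. The Python sketch below shows that idea with `collections.deque`; the class and method names are hypothetical and say nothing about how Readiris stores its history.

```python
from collections import deque


class UndoHistory:
    """Keep only the most recent operations, discarding the oldest first."""

    def __init__(self, limit=32):
        # A deque with maxlen silently drops the oldest entry when full.
        self._operations = deque(maxlen=limit)

    def record(self, operation):
        self._operations.append(operation)

    def undo(self):
        """Return the most recent operation, or None if nothing is left."""
        return self._operations.pop() if self._operations else None

    def __len__(self):
        return len(self._operations)


if __name__ == "__main__":
    history = UndoHistory()
    for step in range(40):
        history.record(f"learn character #{step}")
    print(len(history), history.undo())
```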
To use an existing font dictionary:
On the Learn menu, click Font Dictionary.
Select the dictionary you want to use and click Open.
On the Learn menu, click either Append Font Dictionary or Read Font Dictionary.
When selecting Append Font Dictionary, make sure to enable Interactive Learning. Readiris will recognize the character shapes stored in the dictionary and use interactive learning, allowing you to store new information in the dictionary.
CHAPTER 11 FORMATTING AND SAVING DOCUMENTS

FORMATTING DOCUMENTS
The documents you OCR in Readiris can be saved in various output formats. Readiris saves OCR results as Adobe Acrobat PDF files, Microsoft XPS files, Word, WordML, RTF and OpenDocument text files, HTML and XML files, SpreadsheetML worksheets, and ANSI and Unicode text files.
Readiris either:
o sends documents to an application, which will open automatically, or;
o saves documents as an external file.
The option Send by e-mail creates a new e-mail message and inserts the recognized document as e-mail attachment.
Click the different tabs to select the settings you want to apply. Settings that are unavailable for the selected output format appear dimmed.
of a document, click Document Properties on the File menu. Note that the document properties options are also accessible in the Output File dialog box, which opens when you click Recognize + Save.
Note that when saving a multipage document as an external file, you can create a separate output file for each page in Readiris or save all pages that belong to the same document to a single output file.
Layout options
The option Create body text avoids text formatting by Readiris. Readiris generates a continuous, running text.
The option Retain word and paragraph formatting takes an intermediate position between body text and autoformatting. The font type, size and type style are maintained across the recognition. The tabs and the alignment of each block are recreated. The text blocks and columns aren't recreated; the paragraphs just follow each other.
o The option Use columns instead of frames creates columnized documents. Columnized texts are easier to edit than documents containing multiple frames: the text flows naturally from one column to the next.
Note: when the system is unable to detect columns in the source document, this formatting mode uses frames as a fallback.
o The option Insert column breaks inserts a hard column break at the end of each column.
The option Merge lines into paragraphs enables automatic paragraph detection. Readiris wordwraps the recognized text until a new paragraph starts, and "reglues" hyphenated words at the end of a line.
The option Include graphics includes the graphics in autoformatted files. This is essential to create a true copy of a document.
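The idea behind Merge lines into paragraphs, joining recognized lines into running text and regluing words hyphenated at a line end, can be sketched in Python as shown below. This is a simplified model under stated assumptions: the function name is hypothetical, a blank line is taken as the paragraph boundary because the guide does not explain how Readiris detects a new paragraph, and every trailing hyphen is treated as a line-break hyphen.

```python
def merge_lines_into_paragraphs(lines):
    """Join recognized lines into paragraphs and reglue hyphenated words."""
    paragraphs = []
    current = ""
    for line in lines:
        text = line.strip()
        if not text:
            # Assumption: a blank line marks the start of a new paragraph.
            if current:
                paragraphs.append(current)
            current = ""
        elif current.endswith("-"):
            # Reglue a word that was split at the end of the previous line.
            current = current[:-1] + text
        elif current:
            current += " " + text
        else:
            current = text
    if current:
        paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    recognized = [
        "Readiris word-",
        "wraps the text",
        "until a new paragraph.",
        "",
        "Next one.",
    ]
    print(merge_lines_into_paragraphs(recognized))
```

Note that this naive rule would also remove a genuine hyphen that happens to fall at a line end (as in Notre-Dame); a real implementation would need a lexicon check to tell the two cases apart.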
Click the Paper size tab and use the arrow buttons to apply and exclude paper sizes.
Readiris will go through the active paper sizes in the indicated order and will use the first paper size that is sufficiently large to hold the scanned document.
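The paper size selection described above is a first-fit search over an ordered list. The Python sketch below illustrates it; the function name and data layout are hypothetical, the dimensions are the standard A5, A4 and A3 sizes in millimeters, and returning `None` when nothing fits is an assumption, since the guide does not say what happens in that case.

```python
def pick_paper_size(doc_width_mm, doc_height_mm, active_sizes):
    """Return the name of the first active paper size that can hold the document.

    active_sizes is an ordered list of (name, (width_mm, height_mm)) pairs.
    """
    for name, (width_mm, height_mm) in active_sizes:
        if doc_width_mm <= width_mm and doc_height_mm <= height_mm:
            return name
    # Assumption: no fallback is documented, so report that nothing fits.
    return None


if __name__ == "__main__":
    active = [("A5", (148, 210)), ("A4", (210, 297)), ("A3", (297, 420))]
    print(pick_paper_size(200, 280, active))
```

Because the search stops at the first match, the order of the active sizes matters: listing A3 before A4 would place an A4-sized scan on an A3 page.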
For more information on formatting options, see the section Formatting text documents.
SpreadsheetML options
When selecting Microsoft Excel 2002/2003 as target application, specific SpreadsheetML options are available. Click the tab SpreadsheetML options to display them:
Note that the layout option Recreate source document becomes unavailable when this format is selected.
The option Convert figures into numbers encodes recognized figures as numbers. As a result, you can execute arithmetical operations on those cells. The text cells (in any table) remain text.
Note that only figures inside tables are encoded as numbers. Excel exclusively executes mathematical operations on data that is encoded as numbers.
The option Create one worksheet per page creates one worksheet per scanned page.
The option Merge lines into paragraphs enables automatic paragraph detection. Readiris word-wraps the recognized text until a new paragraph starts, and "reglues" hyphenated words at the end of a line.
The option Retain colors of background recreates the background color of each cell.

Paper sizes
Depending on the format you selected, you can indicate preferred paper sizes:
Click the Paper size tab and use the arrow buttons to apply and exclude paper sizes.
CREATING PDF DOCUMENTS
Readiris generates four types of PDF output: Text, Text-Image, Image-Text and Image.
To generate PDF output:
Click the Format button on the main toolbar and select the PDF type of your choice in the Send to or External file drop-down list.

PDF Image
When you select PDF Image, Readiris generates image-only PDF documents; it does not execute OCR.
PDF Text-Image
When you select PDF Text-Image, Readiris recognizes text and creates searchable PDF documents that contain the page image and the recognized text. The page image is contained beneath the text.

SELECTING THE PDF OPTIONS
To select the PDF options:
Click the Format button on the main toolbar and select the PDF type of your choice in the Send to or External file drop-down list.
Depending on the PDF type you select, several options are available.
Create bookmarks
The option Create bookmarks creates bookmarks for each text block, graphic and table in Adobe Acrobat PDF files.

Embed fonts
Select the option Embed fonts to embed fonts in Adobe Acrobat PDF files. Embedding fonts prevents font substitution and ensures that readers, regardless of their computer configuration, see the text in its original fonts. Embedding fonts increases the file size of recognized documents somewhat.
To generate iHQC compressed PDF output:
Click the Format button on the main toolbar and choose between the two output modes.
In the Send to or External file list, select the PDF type of your choice: PDF Image-Text or PDF Image.
On the PDF Options tab, select the required compression level. Readiris Pro supports Level I - Good size and Level I - Good quality compression.
PASSWORD PROTECTING PDF DOCUMENTS
Readiris allows you to limit access to PDF output by setting passwords. You can set an open document password, which is required to open the document, and a permissions password, which restricts printing and editing of the document.
Warning: recovering forgotten or lost passwords takes password recovery software.
want to change these settings, you must enter the permissions password. The Readiris security settings are similar to the standard protection features offered by Adobe Acrobat. Note, however, that in Readiris the open document password and permissions password must be different. If a PDF document is protected with both types of passwords, either password can be used to open the document.
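The rule that the two passwords must differ can be expressed as a simple check, for instance in a script that prepares settings before a conversion run. This is only a sketch; the function name is hypothetical and it says nothing about how Readiris validates its own dialog box.

```python
def check_pdf_passwords(open_password, permissions_password):
    """Return a list of problems; an empty list means the pair is acceptable.

    Either password may be empty (not set). When both are set,
    they must not be identical.
    """
    problems = []
    if open_password and permissions_password and open_password == permissions_password:
        problems.append(
            "open document password and permissions password must be different"
        )
    return problems
```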
ID in Adobe Acrobat. See the Acrobat documentation for more information. The author signature appears in the Signatures tab of Adobe Acrobat and Adobe Reader.

REPURPOSING PDF DOCUMENTS
Besides generating PDF documents, Readiris can also repurpose PDF files: Readiris converts image PDFs into text PDFs or any other supported text format and unlocks read-only PDF content.
Warning: Readiris does not open user password-protected PDF documents.
CREATING XPS DOCUMENTS
Readiris generates four types of XPS files: Text, Text-Image, Image-Text and Image. XPS stands for XML Paper Specification and is a fixed-layout format developed by Microsoft.
To generate XPS output:
Click the Format button on the main toolbar and select the XPS type of your choice in the Send to or External file drop-down list.

XPS Image
When you select XPS Image, Readiris generates image-only XPS documents; it does not execute OCR.
The page image is not contained in these single-layered XPS files.

XPS Text-Image
When you select XPS Text-Image, Readiris recognizes text and creates searchable XPS documents that contain the page image and the recognized text. The page image is contained beneath the text.

SELECTING THE XPS OPTIONS
To select the XPS options:
Click the Format button on the main toolbar and select the XPS type of your choice in the Send to or External file drop-down list.
Create bookmarks
The option Create bookmarks creates bookmarks for each text block, graphic and table in Microsoft XPS files.

IHQC COMPRESSING XPS DOCUMENTS
Besides four types of "regular" XPS output, Readiris offers iHQC compressed XPS output. XPS documents of the types Image-Text and Image can be hyper-compressed by means of iHQC. iHQC stands for intelligent High-Quality Compression, I.R.I.S.' proprietary, efficient compression technology.
SELECTING THE GRAPHICS OPTIONS
Depending on the output format and target application you select, advanced graphics options may be available. The graphics options can be used to alter the image quality and resolution.
To access the graphics options:
Click the Format button on the main toolbar and select the output format of your choice in the Send to or External file drop-down list.
Click the Graphics tab to display the options.
Tip: When saving documents as HTML files to post on a web site, reduce the resolution to 70 dpi (screen resolution).

JPEG quality
Graphics stored inside PDF, XPS, Word and RTF documents are saved in the JPEG format. Use the slider to adjust the JPEG quality.

JPEG 2000 compression
When saving files in the PDF or XPS format, Readiris can apply JPEG 2000 compression to the color and grayscale images stored inside those files.
CHAPTER 12 SAVING AND LOADING SETTINGS
Any settings you specify in Readiris are saved automatically for future use when you close the application. To restore the factory settings, click the command Restore Factory Settings on the File menu.
When scanning various groups of documents which all require different settings, it is useful to save a separate settings file for each group.

Operation
Select the settings you want to use for a certain document group.
CHAPTER 13 RECOGNIZING MULTIPAGE DOCUMENTS

OPENING AND RECOGNIZING MULTIPLE IMAGE FILES
Readiris is designed to process multiple image files at a time.
To open multiple image files:
Click Open on the main toolbar.
options. Note that Readiris processes documents alphabetically, so the empty file must immediately follow the last file of the document. For more information, see the section Separating document batches.
Click the Open button to open the image(s). Note that you can also drag and drop image files from Windows Explorer to the Readiris image window to open them.
The page toolbar will display the opened image files.
Note: when you are processing large volumes of scanned documents, use the functions Batch OCR or Watched Folder.

SCANNING AND RECOGNIZING MULTIPAGE DOCUMENTS
Readiris is designed to process documents consisting of multiple pages. Readiris Pro processes documents of up to 50 pages. Readiris Corporate processes documents with an unlimited number of pages.
Scanning multipage documents with interval scanning (flatbed scanner)
Click the Scanner button on the main toolbar.
Select Scan another page after and indicate the time interval using the arrow buttons. The scanner will automatically scan another page after the indicated number of seconds, without you having to click the Scan button every time.
To end the automatic scanning, click Abort in the interval scanning dialog box or press ESC on the keyboard.
Moving a page inside a document:
Right-click the page you want to move and click Select Page. Drag the page to the correct position.
Or right-click a page and click Move Page Up or Down.

Deleting a page:
Right-click the page you want to delete and click Delete page.
Or select the page and press the Delete key on your keyboard.

Excluding a page from recognition:
Right-click the page you want to exclude and click Exclude page.
Or clear its page number box in the document panel.
Excluded pages are struck out in the page toolbar. Excluded pages are ignored when you print the scanned images and when you save the scans to multipage image files.
Tip: the commands Include All Pages and Exclude All Pages on the Edit menu apply to all pages simultaneously.

Using a page as cover page:
Right-click the page you want to use as cover page and click Cover page.
CHAPTER 14 RECOGNIZING LARGE VOLUMES OF SCANNED IMAGES

EXECUTING BATCH OCR
Readiris offers a powerful function for recognizing batches of scanned images: Batch OCR. Batch OCR executes the recognition on all scanned images in a specific folder. Indicate to Readiris in which folder your documents are located, start the OCR process and all your documents will be converted to the required output format.
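The bookkeeping behind a batch run can be sketched outside Readiris as follows. The sketch only plans the work; it performs no OCR. It lists the image files in an image folder, optionally including subfolders, and gives each output file the same base name as its image, as this chapter describes. The function name, the list of image extensions and the .rtf output extension are assumptions made for the example, and placing all output files directly in the text folder is a simplification.

```python
from pathlib import Path

# Assumed set of image extensions, for illustration only.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg", ".png", ".bmp"}


def plan_batch(image_folder, text_folder, output_suffix=".rtf", recurse=False):
    """Return (image_path, output_path) pairs for a batch run.

    Each output file keeps the base name of its image file.
    With recurse=True, subfolders of the image folder are included.
    """
    image_folder = Path(image_folder)
    text_folder = Path(text_folder)
    pattern = "**/*" if recurse else "*"
    jobs = []
    for path in sorted(image_folder.glob(pattern)):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            jobs.append((path, text_folder / (path.stem + output_suffix)))
    return jobs
```

Note that two images with the same base name in different subfolders would map to the same output path in this sketch; a real workflow would have to mirror the subfolder structure or rename the files.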
These folders may be different but do not need to be.
Click the Text Format button to select the required external file format and its options. For more information on the formatting options, see the chapter Formatting and saving documents.
Select the processing options:
o Select Process subfolders to process all subfolders of the image folder.
The recognized documents get the same file name as the original image files.

SETTING UP A WATCHED FOLDER
Besides executing Batch OCR, Readiris can monitor a Watched Folder. Any image files you place or change inside the watched folder will be processed by Readiris. You can leave the OCR software running day after day.

Operation
Before setting up a Watched Folder, first specify the OCR options.
The text folder must be different from the image folder, and neither folder may be a subfolder of the other.
Click the Text Format button to select the required external file format and its options. For more information on the formatting options, see the chapter Formatting and saving documents.
Click OK to monitor the Watched Folder. Readiris processes the images of all supported file formats.
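The folder rule for a Watched Folder (different folders, neither one inside the other) can be checked before the folders are entered in Readiris. The sketch below is an illustration with a hypothetical function name; it compares resolved paths and does not require the folders to exist.

```python
from pathlib import Path


def folders_are_valid(image_folder, text_folder):
    """Return True when the two folders differ and neither contains the other."""
    image = Path(image_folder).resolve()
    text = Path(text_folder).resolve()
    if image == text:
        return False
    # Path.parents lists every ancestor folder of a path.
    return image not in text.parents and text not in image.parents
```

One plausible reason for such a rule is that output written into the watched folder could itself be picked up for processing; the guide does not state the reason, so this is an assumption.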
CHAPTER 15 SEPARATING AND INDEXING DOCUMENT BATCHES

SEPARATING DOCUMENT BATCHES
When scanning or opening multiple documents it is essential to indicate to Readiris where one document ends and the other begins. You can do this by means of blank pages or barcode pages.

Separating scanned documents
Insert a blank page or barcode page between the different documents in your scanner's document feeder.
Select Detect blank pages or Detect cover pages with a barcode, depending on the type of separator page you are using. Readiris will detect blank pages or barcode pages and mark them as cover pages. A page is blank when it only contains noise.
When using barcodes as separators, you can indicate the barcode read zone (the position of the barcode on the page) and have Readiris search for specific content the barcodes should contain.
For more information on barcodes, see the section Recognizing barcodes.
When using a duplex scanner, select Duplex scanning. Readiris will disregard the rear sides when searching for blank pages and barcode pages.
Click the Scan button to scan the documents. The scanned images will be displayed in Readiris and the blank pages will be marked as cover pages.
Click the Recognize + Save button to process the documents.
Click the Recognize + Save button to process the documents. The option Create one file per document in the Output file dialog box will be selected by default. That way Readiris will create a new output file each time it encounters a blank page.
By default, Readiris analyzes cover pages and includes them in the output file.
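The effect of Create one file per document can be illustrated with a short sketch: every cover page (blank page or barcode page) starts a new group of pages, and each group would become one output file. The function name and the page representation are hypothetical, and the switch for leaving cover pages out is included for illustration only.

```python
def split_documents(pages, include_cover_pages=True):
    """Group pages into documents; each cover page starts a new document.

    pages is a list of (page_name, is_cover_page) tuples in scan order.
    Returns a list of documents, each a list of page names.
    """
    documents = []
    current = []
    for name, is_cover in pages:
        if is_cover:
            # A cover page closes the document that was being built.
            if current:
                documents.append(current)
            current = [name] if include_cover_pages else []
        else:
            current.append(name)
    if current:
        documents.append(current)
    return documents


scan = [("p1", False), ("p2", False), ("blank", True), ("p3", False)]
print(split_documents(scan))  # [['p1', 'p2'], ['blank', 'p3']]
```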
Select Detect blank pages or Detect cover pages with a barcode. A page is blank when it only contains noise.
When using barcodes, indicate the barcode read zone, if necessary, and/or indicate specific content Readiris should look for.
On the Settings menu, click Barcodes and select which barcodes you want Readiris to recognize. For more information on barcodes, see the section Recognizing barcodes.
Select Generate an XML index. The text of the cover pages can be included in the XML index by selecting the corresponding option. Note that these reading results are not included in the output document.
Click OK to save the document processing settings.
Click the Recognize + Save button to process the documents. The XML index will be located in the same folder as the output document.
CHAPTER 16 RECOGNIZING HANDPRINTED TEXT
Besides typed text, tables, graphics and barcodes, Readiris recognizes handprinted text. Handprinting consists of separated block letters. Recognizing handprinted characters takes highly specialized ICR (intelligent character recognition) software.
To recognize handprinting:
Click the handprinting button on the image toolbar.
Draw a frame around the handprinted text.
Click Recognize + Save on the main toolbar.
Recognized symbols
Handprinting recognition is limited to the Latin alphabet and supports numerals (0-9), uppercase letters (A-Z) and the punctuation symbols comma, period, plus sign and hyphen. Accents, umlauts and other special characters are not supported.

Notes
Readiris supports handprinting, not handwriting. For more information, see the section Handprinting rules.
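When designing a form that will be filled out by hand, it can help to check in advance whether the expected values stay within the supported symbols. The following sketch does that for a sample string. The function name is hypothetical, and treating spaces as acceptable is an assumption, since the list of symbols above does not mention them.

```python
import string

# Symbols named in this section: 0-9, A-Z, comma, period, plus sign, hyphen.
SUPPORTED = set(string.digits + string.ascii_uppercase + ",.+-")


def unsupported_characters(text):
    """Return the sorted, distinct characters that fall outside the supported set.

    Whitespace is ignored (an assumption made for this sketch).
    """
    return sorted({ch for ch in text if ch not in SUPPORTED and not ch.isspace()})


print(unsupported_characters("INVOICE 2009-04, +32.5"))  # []
print(unsupported_characters("Müller & Co"))  # ['&', 'e', 'l', 'o', 'r', 'ü']
```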
Use a sufficiently thick ballpoint. Black pens yield better results than blue pens. Do not use pencils.
Don't stylize too much. Excessively stylized characters increase the risk of OCR errors.
Don't open loops which should be closed, don't close loops which should be open.
Avoid broken characters.
Avoid retracing. Retracing reduces the image quality and clarity of handprinted symbols.
Characters that are entirely stricken out will not be recognized.
The horizontal underlining bar does not have to touch the rest of the font form.
Tip: when less than optimal results are obtained, use the I.R.I.S. writing form and adapt your writing style. The blank I.R.I.S. writing form serves as a full-page template on which block letters can be filled out correctly and in the right size. The form can be found on the Readiris CD-ROM and in the Readiris installation folder.
CHAPTER 17 RECOGNIZING BARCODES

INTRODUCING BARCODE READING
Besides optical character recognition of 128 languages, Readiris also offers barcode reading. Barcodes can be recognized manually, or automatically when they are used for indexing purposes.
On the Settings menu, click Barcodes.
Select the symbologies you want Readiris to recognize.
Determine whether you want Readiris to verify or remove the check digits.
Click the barcode button on the image toolbar and draw a frame around the barcode zones in the document.
Click Recognize + Save on the main toolbar. The entire document including the barcode content will be recognized.
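What verifying or removing a check digit means can be shown with EAN-13, taken here purely as an example of a symbology with a check digit; the sketch makes no claim about which symbologies Readiris supports or how it handles them. In EAN-13 the first twelve digits are weighted alternately 1 and 3, and the thirteenth digit brings the weighted sum up to a multiple of 10. The function names are hypothetical.

```python
def ean13_check_digit(first12):
    """Compute the EAN-13 check digit for a string of 12 digits."""
    # Weights alternate 1, 3, 1, 3, ... starting at the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def verify_and_strip(code):
    """Verify the check digit of a 13-digit code and return the code without it."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected 13 digits")
    if ean13_check_digit(code[:12]) != int(code[12]):
        raise ValueError("check digit mismatch")
    return code[:12]


print(verify_and_strip("4006381333931"))  # 400638133393
```

A misread digit usually changes the weighted sum, so verification catches many recognition errors before the barcode content is used.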
Select Detect cover pages with a barcode.
Indicate the barcode read zone (the position of the barcode on the page) if necessary, and/or indicate specific content Readiris should look for.
Note: the barcode reading results can also be included in an XML index. To do so, select the corresponding box. For more information on indexing, see the section Indexing document batches.
CHAPTER 18 RECOGNIZING BUSINESS CARDS

INTRODUCING BUSINESS CARD READING
Besides recognition of "regular" documents, Readiris Corporate also offers business card recognition. Readiris allows you to scan business cards, recognize them and convert them into an address database.
Select the latter option when using a flatbed scanner. Note that the background must be black in order for Readiris to extract the various business cards. To create a black background, scan cards with the scanner lid open.
If you forgot to select Multiple Business Cards in a Single Scan as document type, click the command Extract Business Cards on the Process menu. The various business cards will be extracted from the scanned image.
Click Calibrate when using the scanner for the first time and insert the calibration sheet.
Select the correct paper size, resolution and image type.
Click the Scan button in the dialog box to scan the business card. Readiris will display the analyzed business card.
Change the window types, if necessary: right-click the window you want to change, point to Window, then to Type. Then click the correct window type.
It is recommended to sort your business cards by country, as you can only activate one card style at a time.
Click the Format button to select the output format. Business cards can be saved in the vCard, HTML and comma-delimited text formats or be sent to Microsoft Outlook, Microsoft Outlook Express, Lotus Notes and Palm Desktop.
Click Recognize + Save to recognize the business card(s) and export them.
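To give an idea of what a vCard file contains, the sketch below writes one contact in the vCard 3.0 text layout. It is not the exact output of Readiris: the vCard version, the selection of fields and the function names are assumptions made for the example.

```python
def _escape(value):
    """Escape backslash, semicolon, comma and newline as vCard 3.0 text requires."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def to_vcard(first, last, org="", phone="", email=""):
    """Return one contact as vCard 3.0 text with CRLF line endings."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        f"N:{_escape(last)};{_escape(first)};;;",  # family name; given name
        f"FN:{_escape(first + ' ' + last)}",       # formatted display name
    ]
    if org:
        lines.append(f"ORG:{_escape(org)}")
    if phone:
        lines.append(f"TEL;TYPE=WORK:{phone}")
    if email:
        lines.append(f"EMAIL:{email}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"
```

Address book applications that import vCard files read these fields back into their own contact records, which is what makes the format suitable for exchanging recognized business cards.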