AUTOMATED FORMS PROCESSING
Table of Contents

Introduction
Form Types
What is a form?
Form structure
Introduction

In the course of our lives we fill in hundreds of forms: application forms, questionnaires, insurance claims, etc. At the same time, computers have become indispensable for collecting and managing information, making the task of extracting data from printed documents even more pressing.
When completing a form, one has to enter information into blank spaces or specially designed fields that make up the structure of the form. This information must then be extracted and processed. Forms from which data can be extracted, or "captured", automatically by computer are called machine readable. Almost any form can be structured in such a way as to become machine readable.
Form types and design elements

Forms can be divided into two major classes: structured forms, on which the locations and sizes of all fields are exactly the same for all forms in a batch, and flexible forms, on which the sizes and locations of fields may vary from form to form. In order to capture data from a structured form, a program has to know where to look for data.
What is form processing?

Forms processing is a process whereby information entered into data fields is converted into electronic form: the entered data are "captured" from their respective fields, and the forms themselves are digitised and saved as images. In most cases forms processing is considered complete when the data from all the forms have been captured, verified and saved into a database. It is also essential that the integrity of the captured data be preserved.
The cost of manual processing

In the previous section you saw that the one-off and running costs of manual forms processing add up to a considerable amount. This gives us our first conclusion: manual processing is expensive. But money is not the only problem associated with manual forms processing. You will need additional staff and another tier of management. Setting up a team of 8 to 10 employees and buying the necessary equipment also takes time.
Automated forms processing

An alternative is a data capture solution such as ABBYY FormReader. This is how FormReader works:

1. A batch of completed forms is scanned using a high-speed scanner (usually one that scans at least 10 pages per minute).
2. Most of the data are recognized automatically.
3. The few characters about which the program is uncertain are passed on to a human operator.
4. Verified data are saved into a database.
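The hand-off between automatic recognition and the human operator can be illustrated with a short Python sketch. It is not FormReader code: the RecognizedChar type, the split_for_verification function and the 0.90 threshold are assumptions made for illustration, and a real product is likely to use richer confidence information.

```python
from dataclasses import dataclass

# Hypothetical threshold: characters recognized with lower confidence
# are routed to a human operator. The value is illustrative only.
CONFIDENCE_THRESHOLD = 0.90


@dataclass
class RecognizedChar:
    field: str         # name of the form field the character belongs to
    value: str         # character proposed by the recognizer
    confidence: float  # recognizer confidence, from 0.0 to 1.0


def split_for_verification(chars, threshold=CONFIDENCE_THRESHOLD):
    """Separate confidently recognized characters from uncertain ones."""
    accepted = [c for c in chars if c.confidence >= threshold]
    uncertain = [c for c in chars if c.confidence < threshold]
    return accepted, uncertain


recognized = [
    RecognizedChar("zip", "1", 0.99),
    RecognizedChar("zip", "7", 0.62),  # ambiguous: could be a 1 or a 7
    RecognizedChar("zip", "0", 0.97),
]
accepted, uncertain = split_for_verification(recognized)
print(len(accepted), "accepted automatically;", len(uncertain), "sent to the operator")
```

Only the uncertain characters reach the operator, which is why the operator's workload stays small compared with manual keying.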
OCR/ICR basics

There are two major types of character recognition: Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR). OCR programs recognize characters printed using a printer, a plotter or a typewriter. ICR programs read documents filled in by hand in block letters (so-called hand-print recognition). Let us consider the main differences between OCR programs and ICR programs.
Automated forms processing: step by step

Where should data capture be used?

There are numerous situations in which automated forms processing is the only right solution. Here are some possible scenarios.

Forms processing is not the main speciality of a company. Manufacturing or trading companies in most cases don't even have a department responsible for forms processing. Forms, such as order bills, are usually processed manually by secretaries or office assistants.
Designing a form

First of all, you have to design a form. You need a form that is easy both to fill in and to process. The design is crucial because any mistakes made at this stage may drastically reduce the speed of processing. Be sure to follow the recommendations of the supplier of your data capture application. To create a form, you first need to think out its logical structure, then design it, and, finally, draw your form.
FormReader also includes a very handy form-drawing tool. FormDesigner is a form creation application provided with each copy of FormReader. It is a simple and efficient form drawer that will help you draw even the most sophisticated forms. All forms include certain typical design elements: titles, black squares, text fields, check boxes, etc.
Setting up FormReader

When you set up FormReader to capture data from a particular kind of form, you are "telling" the program where to look for data fields and what "hints" are available on printed forms. Setting up the program correctly is just as important as designing the form.

Creating a form template. Below is a brief description of the steps you need to perform in order to create a form template.

1.
Specifying verification options.

Selecting a scanner

Choosing the right scanner is important because scanners have a direct impact on the speed and quality of processing. Note that if you need to process more than 100 forms per day, common flatbed scanners will not do.
Personnel training

Working with ABBYY FormReader requires minimal special knowledge and training. The data capture system is usually run by operators responsible for entering data from forms and an administrator who sets up and monitors the system.
Below you can see a chart showing how forms are processed in ABBYY FormReader Enterprise Edition. There are two streams of data: an input stream and an output stream. Each operator is responsible for only one processing stage, e.g. scanning and registering images in the system. The operators handle data as if they were working on an assembly line.
Ensuring the quality of data

Defining data quality

In the previous sections we have often used the phrase "quality of data". By the quality of data we mean the completeness and accuracy of captured information. The higher the correspondence between the data exported into the database and the data entered into the fields of the paper forms, the higher the quality of data.
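One simple way to put a number on this correspondence is to count the fields whose exported value matches what was written on the paper form. The Python sketch below is an illustrative metric only, not one defined by FormReader; the field_accuracy function and the sample field names are assumptions. A field that is missing from the export counts as a mismatch, so the figure reflects completeness as well as accuracy.

```python
def field_accuracy(entered, exported):
    """Return the fraction of fields whose exported value matches the paper form.

    `entered` maps field names to the values written on the paper form;
    `exported` maps field names to the values saved into the database.
    """
    if not entered:
        raise ValueError("no fields to compare")
    matches = sum(
        1 for name, value in entered.items() if exported.get(name) == value
    )
    return matches / len(entered)


# Illustrative data: one of the three fields was captured incorrectly.
entered = {"surname": "IVANOV", "year": "1975", "zip": "129301"}
exported = {"surname": "IVANOV", "year": "1976", "zip": "129301"}
print(f"{field_accuracy(entered, exported):.2f}")
```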
Data type checks

Even before submitting data for verification, ABBYY FormReader 6.0 will check the recognized data against dictionaries and user databases. Suppose your questionnaire has a field captioned "Your favourite brand of cheese". You can create a dictionary of cheese brands and use it to facilitate recognition. Dictionaries can be created for any data types to help the program more readily recognize the information entered into the fields. ABBYY FormReader 6.
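The idea of checking a recognized value against a dictionary can be sketched in Python with the standard difflib module. This is a minimal illustration, not FormReader's algorithm: the check_against_dictionary function, the 0.7 similarity cutoff and the dictionary entries are all assumptions.

```python
import difflib

# Illustrative dictionary entries for the "favourite cheese" field.
CHEESE_DICTIONARY = ["Cheddar", "Gouda", "Emmental", "Roquefort"]


def check_against_dictionary(recognized, dictionary, cutoff=0.7):
    """Return (value, status), where status is 'exact', 'corrected' or 'unknown'."""
    lookup = {entry.lower(): entry for entry in dictionary}
    key = recognized.strip().lower()
    if key in lookup:
        return lookup[key], "exact"
    # Fall back to the most similar dictionary entry, if any is close enough.
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    if close:
        return lookup[close[0]], "corrected"
    return recognized, "unknown"


for text in ["Gouda", "G0uda", "Xyz"]:
    print(text, "->", check_against_dictionary(text, CHEESE_DICTIONARY))
```

A value marked "corrected" or "unknown" is a natural candidate for operator verification rather than silent acceptance.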
Verification

To improve recognition accuracy, ABBYY FormReader 6.0 may submit data for manual verification by the operator. FormReader offers three verification methods:

1. Group verification. This is the ideal method for checking data belonging to a particular limited set, e.g. digits. Group verification groups together uncertainly recognized characters of the same kind (e.g. all 3's) and displays them to the operator.
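Group verification as described above amounts to collecting the uncertain characters and bucketing them by the value the recognizer proposed. The Python sketch below shows that grouping step only; the group_uncertain function, the (value, confidence) input pairs and the 0.9 threshold are assumptions made for illustration.

```python
from collections import defaultdict


def group_uncertain(chars, threshold=0.9):
    """Group uncertain characters by the value the recognizer proposed.

    `chars` is a sequence of (value, confidence) pairs. The result maps each
    proposed value to the positions of the uncertain characters read as it.
    """
    groups = defaultdict(list)
    for position, (value, confidence) in enumerate(chars):
        if confidence < threshold:
            groups[value].append(position)
    return dict(groups)


chars = [("3", 0.55), ("8", 0.99), ("3", 0.70), ("5", 0.60), ("3", 0.95)]
groups = group_uncertain(chars)
for value, positions in groups.items():
    print(f"check {len(positions)} character(s) read as {value!r}: positions {positions}")
```

Showing all doubtful 3's side by side lets the operator spot the odd one out at a glance instead of re-reading each field.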
Data format checks

Once FormReader finishes recognizing the data, it checks whether the results conform to the format specified in the template. Let us take a closer look at this type of check using a Serial Number field as an example.
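A format check of this kind can be sketched with a regular expression. The serial number format used here (two capital letters followed by six digits) is purely hypothetical, since the actual format is defined in the form template; the conforms_to_format function is likewise an assumption.

```python
import re

# Hypothetical format: two capital letters followed by six digits.
SERIAL_NUMBER_FORMAT = re.compile(r"[A-Z]{2}\d{6}")


def conforms_to_format(value, pattern=SERIAL_NUMBER_FORMAT):
    """Return True if the whole recognized value matches the expected format."""
    return pattern.fullmatch(value) is not None


for candidate in ["AB123456", "A8123456", "AB12345"]:
    status = "ok" if conforms_to_format(candidate) else "format error"
    print(candidate, "->", status)
```

The second candidate shows a typical recognition slip, a letter B read as the digit 8, which the format check catches before the value reaches the database.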
Controlling logic

Very often certain restrictions apply to the data to be entered into the fields. For example, numbers may have to fall within a certain interval. This can also be checked by validation rules: if the recognized data do not meet the imposed requirements, the rule will report an error. Here are some examples of such validation rules:

Normalize and check dates.
Normalize prices. The price function automatically converts the price into the required format, e.g. 12.90 or 12,90 (Russian style). If the recognized data cannot be converted into the required format, the program will report an error.

Conditional checks. The user can use a special language, very similar to a programming language, to specify certain conditions and the actions to be performed by the program depending on whether these conditions have been met.
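The interval, date and price rules above can be sketched in Python as follows. This is not FormReader's rule language: the function names, the ValidationError class, the accepted date formats and the ISO output format are all assumptions chosen for illustration.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


class ValidationError(ValueError):
    """Raised when recognized data cannot satisfy a validation rule."""


def check_interval(text, low, high):
    """Check that the recognized text is an integer within [low, high]."""
    try:
        number = int(text)
    except ValueError:
        raise ValidationError(f"not a number: {text!r}") from None
    if not low <= number <= high:
        raise ValidationError(f"{number} is outside {low}..{high}")
    return number


# Assumed input formats; a real template would define its own.
DATE_FORMATS = ("%d.%m.%Y", "%d/%m/%Y", "%Y-%m-%d")


def normalize_date(text):
    """Convert a recognized date into ISO format (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValidationError(f"unrecognized date: {text!r}")


def normalize_price(text, decimal_separator="."):
    """Convert a recognized price to two decimals with the given separator."""
    try:
        amount = Decimal(text.strip().replace(",", "."))
    except InvalidOperation:
        raise ValidationError(f"unrecognized price: {text!r}") from None
    if not amount.is_finite():
        raise ValidationError(f"unrecognized price: {text!r}")
    return f"{amount:.2f}".replace(".", decimal_separator)


print(normalize_price("12,9"))        # point as the decimal separator
print(normalize_price("12.90", ","))  # comma as the decimal separator
print(normalize_date("31.12.2003"))
```

Each rule either returns a normalized value or raises ValidationError, which mirrors the behaviour described above: data that cannot be brought into the required form is reported as an error rather than exported.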
Organizing automated forms processing

If you take into account such factors as the quality of entered data, the speed of processing and the working conditions of the operators, data capture applications are hard to beat. Automated data capture becomes economically viable whenever you need to process 100 forms per day or more. But even relatively small processing volumes will require certain changes in how the working process is organized.
Back-office data capture

A good example of the second approach is the processing of tax returns. The Russian Ministry of Taxes has adopted the following system of processing tax returns: for a period of several months tax returns are gathered from citizens by local tax officials, after which the collected documents are shipped to a central site for processing. A very powerful data capture solution is required to process huge numbers of tax returns.
Data capture basics

Batch processing

Processing queues

Batches are collections of forms. Each batch has a unique identifier. An important advantage of this approach is that it structures information streams and facilitates administration, routing and storage of data. Batch routing is an important concept in data capture. Batch movement cannot be arbitrary but should be optimised to reflect the logic of forms processing.
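Batches with unique identifiers that move through processing queues in a fixed order can be modelled in a few lines of Python. The sketch below is an illustration under stated assumptions, not a description of how FormReader implements routing: the Batch and Router classes and the four stage names are hypothetical.

```python
import uuid
from collections import deque
from dataclasses import dataclass, field

# Assumed processing stages, in the order a batch must pass through them.
STAGES = ["scanning", "recognition", "verification", "export"]


@dataclass
class Batch:
    forms: list
    # Every batch receives its own unique identifier on creation.
    batch_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    stage_index: int = 0

    @property
    def stage(self):
        return STAGES[self.stage_index]


class Router:
    """Keeps one FIFO queue per stage and moves batches strictly forward."""

    def __init__(self):
        self.queues = {stage: deque() for stage in STAGES}

    def submit(self, batch):
        self.queues[batch.stage].append(batch)

    def advance(self, stage):
        """Move the oldest batch in `stage` to the next stage and return it."""
        batch = self.queues[stage].popleft()
        if batch.stage_index + 1 < len(STAGES):
            batch.stage_index += 1
            self.queues[batch.stage].append(batch)
        return batch


router = Router()
first = Batch(forms=["form-001.tif", "form-002.tif"])
second = Batch(forms=["form-003.tif"])
router.submit(first)
router.submit(second)
moved = router.advance("scanning")
print(moved.batch_id, "is now in stage", moved.stage)
```

Because a batch can only move to the next stage in the list, its movement is never arbitrary, and the per-stage queues make it easy to see where work is piling up.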
Production capture

Production capture has its own specifics. Large-scale projects require dedicated software and hardware, special training for the personnel and careful organization.

Forms processing software

Experience shows that processing more than 3,000 forms per day by more than three operators requires a distributed software solution. Each operator will be able to concentrate on their specific task and deliver the best quality.
Using ABBYY Technologies to Solve Untypical Tasks

Sometimes ABBYY products may seem unlikely solutions for less typical tasks. Indeed, why select FormReader for processing forms in, say, Portuguese if FormReader does not support this language? However, FormReader can be used effectively even in cases like this one.
Remote scanning and processing faxed forms

Images to be processed may be received from sources other than the scanner. If, for some reason, you cannot deploy FormReader directly at the location where the forms are completed and gathered, you can use remote scanning or gather forms by fax. Imagine a situation where a survey is being conducted in several cities.
Processing "flexible" forms

Previously we divided all forms into two large classes: those with rigidly structured fields and those with "flexible" or "floating" fields. Structured forms are best processed using form templates, whereas unstructured forms require a different approach. ABBYY has developed a method for capturing data from unstructured forms that delivers the same level of quality as obtained with structured forms.
Capturing data from forms that are not machine readable

Sometimes it is not necessary to capture all the data available on a document. This is particularly true when digitising archives. In this case only certain fields are selected for recognition. The program creates a unique index on the basis of these fields and converts the document images into a suitable storage format.
Conclusion

We have considered the main features of automated data capture. We started by introducing the terms and concepts commonly used by data capture professionals, then took a closer look at forms processing proper, concentrating on the most important aspects. We showed that automated forms processing has many advantages over manual typing and how the quality of capture can be improved.
Contacts

ABBYY Software House (Moscow)
P.O. Box#54 Moscow, Russia, 129301
Tel.: +7 095 783 4700
Fax: +7 095 783 2663
formreader@abbyy.ru

ABBYY Europe GmbH
Anglerstrasse 6, Munich, Germany, 80339
Tel.: +49 89 511159 0
Fax: +49 89 511159 59
sales@abbyyeu.com

ABBYY Ukraine
P.O. Box#23 Kyiv, Ukraine 02002
Tel.: +380 44 490 9999
Fax: +380 44 495 2080
sales@abbyy.ua

ABBYY USA
3823 Spinnaker Court, Fremont, CA 94538
Tel.: +1 510 226 6717
Fax: +1 510 226 6069
sales@abbyyusa.