ABBYY Recognition Server 3.5 System Administrator’s Guide
Import
At this stage, images are placed in the Input folder of the workflow. There are several ways to submit document images
for processing: they can be placed in the Input folder manually, passed automatically from the Scanning Station, or
sent by e-mail.
When images arrive in the Input folder or mailbox, the Server Manager imports them and transfers them to
the Images subfolder of the ABBYY Recognition Server 3.5 temporary folder. The path to the Server Manager temporary
folder can be viewed and changed in the Recognition Server Properties dialog box of the Remote Administration
Console.
The image files are kept in the Images subfolder of the Server Manager temporary folder throughout the entire
conversion process. The Processing Stations, Verification Stations, and Indexing Stations receive copies of those images
for processing. This ensures that no files are lost if an error occurs during recognition, verification, or indexing.
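The copy-based handoff described above can be sketched as follows. This is only an illustration of the idea, not the product's actual mechanism: the function name `hand_out_copy` and the directory arguments are hypothetical, and real stations receive their copies from the Server Manager rather than through a shared file system call.

```python
import shutil
from pathlib import Path


def hand_out_copy(image_name, images_dir, station_dir):
    """Copy an image to a station's working folder (hypothetical helper).

    The original stays in images_dir, so a failure on the station
    cannot lose the source image.
    """
    source = Path(images_dir) / image_name
    target_dir = Path(station_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / image_name
    shutil.copy2(source, target)  # copy, never move
    return target
```

Because the station only ever works on the copy, deleting or corrupting that copy leaves the image in the Images subfolder untouched.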
When image files are submitted to ABBYY Recognition Server 3.5, the Server Manager creates jobs for them and queues
them for processing. If several workflows are set up, ABBYY Recognition Server will process jobs from all the workflows
simultaneously, within a single queue. The jobs are arranged in the queue according to their creation time and
priorities.
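A single queue shared by all workflows can be sketched as below. This is a minimal model under stated assumptions, not the Server Manager's implementation: the `JobQueue` class is hypothetical, and the exact ordering rule (a higher priority value is served first, with the earlier creation time breaking ties) is an assumption, since the text only says that both creation time and priority are taken into account.

```python
import heapq
import itertools


class JobQueue:
    """One queue for jobs from every workflow (hypothetical model)."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()  # keeps ordering stable for equal keys

    def submit(self, name, priority, created):
        # Assumed policy: higher priority first, then older jobs first.
        heapq.heappush(self._heap, (-priority, created, next(self._tie), name))

    def next_job(self):
        """Return the name of the next job, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[-1]
```

For example, a high-priority job submitted later from one workflow would be taken before older normal-priority jobs from another workflow.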
Processing
The first job in the queue is sent to the first available Processing Station for recognition. If there are several Processing
Stations in the system, the Server Manager evenly distributes the jobs from the queue among these Processing Stations.
See Registering a New Processing Station.
A Processing Station can run several OCR processes (their number can be adjusted in the Remote Administration
Console). For optimal performance, the recommended number of processes for a station is N+1, where N is the number
of CPU cores on the station. Usually each OCR process gets one file at a time. For example, if a Processing Station runs
two OCR processes, it will recognize two files in parallel (they can belong to the same job or to different jobs). However,
if the file has many pages (e.g. several dozen) and there are no more than 5 jobs waiting in the queue, the big file will be
split into several chunks, and the chunks will be sent to different OCR processes, in order to get the work done faster.
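The N+1 recommendation and the splitting rule can be expressed as a short sketch. The function names are hypothetical, and the page threshold of 24 is an assumed stand-in for "several dozen"; only the limit of 5 waiting jobs comes from the text. How the product actually sizes its chunks is not documented here, so the near-equal split is also an assumption.

```python
import os

SPLIT_MIN_PAGES = 24       # assumed value for "several dozen" pages
SPLIT_MAX_QUEUED_JOBS = 5  # "no more than 5 jobs waiting", per the text


def recommended_process_count(cpu_cores=None):
    """Return N + 1, where N is the number of CPU cores."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return cores + 1


def should_split(page_count, queued_jobs):
    """Split only large files, and only when the queue is nearly empty."""
    return page_count >= SPLIT_MIN_PAGES and queued_jobs <= SPLIT_MAX_QUEUED_JOBS


def split_pages(page_count, process_count):
    """Divide pages 0..page_count-1 into contiguous, near-equal ranges."""
    if page_count <= 0 or process_count <= 0:
        return []
    chunks = min(process_count, page_count)
    base, extra = divmod(page_count, chunks)
    ranges, start = [], 0
    for i in range(chunks):
        size = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append(range(start, start + size))
        start += size
    return ranges
```

On a station with 4 cores this gives 5 OCR processes, and a 60-page file arriving while 3 jobs are waiting would be split, whereas the same file arriving behind 6 waiting jobs would be processed as a whole.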
When the Processing Station has finished processing the file, it returns the recognized file to the Server Manager and is
assigned the next job from the queue.
Document separation
After recognition, the pages of a job are regrouped into documents according to the separation rule.
Document separation is performed within a task. Depending on the source specified in the Import stage, different
document separation methods are available. In addition to the built-in document separation methods (by barcodes, blank
pages, etc.), separation can be performed using a script. See Configuring Document Separation.
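As an illustration of one built-in method, separation by blank pages can be sketched as follows. This is not the product's algorithm: the function name, the `is_blank` callback, and the choice to discard the blank separator pages are all assumptions made for the example.

```python
def separate_by_blank_pages(pages, is_blank):
    """Group recognized pages into documents, using blank pages as separators.

    pages    -- sequence of page objects in scan order
    is_blank -- callable returning True for a separator page (hypothetical)
    """
    documents, current = [], []
    for page in pages:
        if is_blank(page):
            # A blank page closes the current document and is dropped.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents
```

Consecutive blank pages do not produce empty documents in this sketch; a separation script could apply a different rule, such as starting a new document on a barcode page and keeping that page.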
Verification
If verification is turned on in the workflow settings, documents that require verification will be queued for verification
after recognition. If there are Verification Stations connected, the Server Manager will route the queued documents to
those stations. If no Verification Stations are currently connected, or the users logged on to the stations are not permitted
to verify documents from this workflow, the documents will wait in the queue in the "Queued for verification" state.
They will not be passed for further processing until they are verified. See Configuring Verification.
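The routing decision described above can be sketched as a small function. Only the "Queued for verification" state name comes from the text; the other state names, the `VerificationStation` class, and the rule of picking the first permitted station are hypothetical.

```python
from dataclasses import dataclass, field

QUEUED = "Queued for verification"          # state name taken from the text
SENT = "Sent to Verification Station"       # hypothetical state name
NOT_REQUIRED = "Verification not required"  # hypothetical state name


@dataclass
class VerificationStation:
    user: str
    allowed_workflows: set = field(default_factory=set)


def route_for_verification(workflow, needs_verification, connected_stations):
    """Return (state, station) for one recognized document."""
    if not needs_verification:
        return NOT_REQUIRED, None
    for station in connected_stations:
        # The logged-on user must be permitted to verify this workflow.
        if workflow in station.allowed_workflows:
            return SENT, station
    # No station connected, or none permitted: the document keeps waiting.
    return QUEUED, None
```

A document that ends up in the queued state stays there until a permitted station connects, which matches the rule that unverified documents are not passed on.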
Indexing
If any document types are specified for indexing in the workflow settings, documents from this workflow will be
indexed before export. Indexing can be performed automatically by a script, manually on an Indexing Station, or
both. If a script is specified, script-based indexing is performed first; documents that still require manual
indexing are then queued for indexing. If there are Indexing Stations connected, the Server Manager will route the queued
documents to those stations. If no Indexing Stations are currently connected, or the users logged on to the stations are not
permitted to index documents from this workflow, the documents will wait in the queue in the "Queued for
indexing" state. A document will not be exported until it is indexed. See Configuring Document Indexing.
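The order of the two indexing steps can be sketched as follows. This is a simplified model, not the product's scripting interface: the dictionary layout of `document`, the `required_fields` key, the "Indexed" state name, and the assumption that a document needs manual indexing when a required field is still empty are all hypothetical. Only the "Queued for indexing" state name comes from the text.

```python
def index_document(document, indexing_script=None):
    """Run script-based indexing first, then decide on manual indexing.

    document        -- dict with a "required_fields" list (hypothetical layout)
    indexing_script -- optional callable returning a dict of field values
    """
    fields = {}
    if indexing_script is not None:
        # Step 1: automatic indexing by script, if one is specified.
        fields.update(indexing_script(document))
    # Step 2: anything the script did not fill is left for an operator.
    missing = [name for name in document["required_fields"] if not fields.get(name)]
    state = "Queued for indexing" if missing else "Indexed"
    return fields, missing, state
```

In this model, a script that fills every required field lets the document proceed without visiting an Indexing Station, while a partial result sends it to the manual queue.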
Export
When the recognition, verification, and indexing are completed, the output files are handed back to the Server Manager
and queued for publishing. The Server Manager delivers the output document to the destination specified in the job
settings. After the output file is published to the Output folder, the image copy is removed from the Server Manager
temporary folder. The published files can then be sent to the appropriate destination depending on input and output files