User manual

languages. The last category means Japanese, Chinese or Korean characters were not
detected. Verification takes place during image pre-processing, so the required recognition
language must be set before image loading.
Auto-layout and auto-zoning are recommended for Asian pages. This places all detected text
into text zones. Choosing an Asian recognition language sets Asian OCR to run in these
zones; it can automatically detect and transmit the text direction, coping with mixed
areas of horizontal and vertical text on a page.
However, the zoning tool
lets you force vertical Asian recognition by manual zoning.
Draw rectangular zones with this tool. To manually zone horizontal Asian text, use the usual
text zone type. Do not use the two other vertical-text tools on Asian texts. Drawing a vertical
Asian zone does not automatically enable an Asian language, nor influence the language
auto-detection.
Digital camera images are accepted for Asian languages. However, the automatic 3D deskew
algorithm is unlikely to be useful - certainly not for vertical texts. Preferably use the standard
image loading command and perform manual 3D deskewing with the relevant SET tool if
required. In general, SET tools can be used on Asian images.
Recognized Asian pages appear in the Text Editor, provided your system has support for East
Asian languages - always with horizontal text direction. There is no need to specify Asian
fonts under Options/OCR; a default font is automatically applied - typically Arial Unicode
MS. Other Asian-capable fonts on your system can be chosen in the Text Editor. Editor
support allows text viewing and verifying - Formatted Text is recommended as the formatting
level. Large-scale editing and spell-checking are better done in the target application.
Proofing, training and dictionary support are not available for Asian texts. Therefore, prior to
performing Asian OCR, go to the Proofing panel under Options and disable dictionary word
marking, automatic proofreading and IntelliTrain, and ensure that no training file is loaded.
Redaction can be applied to Asian texts, either by selection or searching. The workflow step
Form Data Extraction should not be applied to Asian pages.
Typical output converters for Asian texts are RTF, Microsoft Word, Searchable PDF or XPS.
The text direction will be as detected during pre-processing. Changes made in the Text Editor
- where text is horizontal - will be exported, including into vertical text. Plain Text converters are
available (Unicode TXT, Notepad) but here text direction is always horizontal.
Training
Training is the process of changing the OCR solutions assigned to character shapes in the
image. It is useful for uniformly degraded documents or when an unusual typeface is used