2022.2

Task properties
General Tab
• EMF to XY group: Select this option if the file received by this task is a Windows print file. This
prompts the task to perform the first phase of the process and convert the file to an XML
file. If this option is not selected, the input file is not converted to an XML file (note that the
task will fail if the file it receives is not an XML file). The settings included in this group fine-tune
the process. They let you control precisely which text blocks are recognized as belonging
together in one line. They have a particular effect when dealing with font size differences between
consecutive passages of text, the distance from one text passage to another (word distance), and
the baseline offset (vertical distance). To determine whether a text passage belongs to the one
found before it, the vertical distance is checked first, then the horizontal distance, and finally the
font size difference. Only if all three values lie within the tolerance are the two blocks
recognized as belonging together. Additionally, you can have text passages whose horizontal
distance is out of tolerance, but whose font size difference and vertical distance lie within the
tolerance, output in one line. In the output, these text passages are separated by a tab character
(ASCII code 9).
• Font size difference: Indicates the smallest acceptable ratio of minimum to maximum
font size within one line. A value of 0.60 means that if the ratio of minimum to maximum
font size (in points) is less than 0.60, two text passages are not recognized as
belonging together. For example, suppose two text passages are formatted with different
font sizes: passage 1 with 10 points and passage 2 with 18 points. The ratio, 10/18 ≈ 0.56,
is smaller than the configured value of 0.60. Therefore the two text passages are
recognized as not belonging together.
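The font size check above can be sketched in a few lines of Python. This only illustrates the arithmetic and is not the task's actual implementation; the function name `fonts_belong_together` and its parameters are hypothetical.

```python
def fonts_belong_together(size_a: float, size_b: float,
                          min_ratio: float = 0.60) -> bool:
    """Return True if two font sizes (in points) pass the font size check.

    Hypothetical helper: min_ratio stands in for the "Font size difference"
    setting. The ratio is always smaller size divided by larger size.
    """
    smaller, larger = sorted((size_a, size_b))
    if smaller <= 0:
        raise ValueError("font sizes must be positive")
    # A ratio below the configured value means the passages do not belong together.
    return smaller / larger >= min_ratio


# The example from the text: 10 pt and 18 pt give a ratio of about 0.56.
print(round(10 / 18, 2))              # 0.56
print(fonts_belong_together(10, 18))  # False
print(fonts_belong_together(12, 18))  # True
```

With the default of 0.60, the 10 and 18 point passages fall below the threshold, while 12 and 18 points (ratio about 0.67) would still be treated as one line, provided the other two checks also pass.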
• Word distance: Indicates the largest acceptable distance between two text passages at
which they are still recognized as belonging together. This is the factor by which the font's
mean character width is multiplied. The value for the mean character width is taken from the
corresponding font's attributes. For justified text, raising this value to about 2 is
suggested. For example, suppose the mean character width of the example font shown
here corresponds to the width of the blank character (for other fonts it may be another
character). Another text passage is found whose horizontal distance is even bigger than