7.6

Text formatting features such as kerning, bold, and exponents may cause these fragments to be treated as separate
even if, to the naked eye, they obviously belong together.
The PDFText Extraction Tolerance Factors is used to modify the behavior of data selections made from PDFdata files from
within PlanetPress Workflow. Each factor available in this window will determine if two fragments of text in the PDFshould be
part of the same data selection or not.
The default values are correct for the vast majority of PDF data files. Only change these values if you
understand what they are for.
Delta Width
Defines the tolerance for the distance between two text fragments, either positive (space between fragments) or negative
(kerned text where letters overlap). When this value is 0, the two fragments must be exactly side by side,
with no space or overlap between them.
When this value is 1, a very large space or overlap is accepted. This may cause "false positives": if the value is too
high, separate words and text blocks may be treated as a single word.
Accepted values range from 0 to 1. The default value is 0.3; recommended values are between 0.05 and 0.30.
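The effect of Delta Width can be illustrated with a minimal Python sketch. This is not how PlanetPress Workflow is implemented: the Fragment class, the within_delta_width function, and the choice to scale the tolerance by the first fragment's height are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical text fragment: left edge, width and height, in points."""
    x: float
    width: float
    height: float


def within_delta_width(first: Fragment, second: Fragment, delta_width: float) -> bool:
    """Return True if the horizontal space or overlap is within tolerance.

    Assumption: the allowed distance is delta_width times the height of the
    first fragment. The real scaling used by the software is not documented here.
    """
    if not 0.0 <= delta_width <= 1.0:
        raise ValueError("delta_width must be between 0 and 1")
    # Positive gap: space between fragments. Negative gap: overlap (kerning).
    gap = second.x - (first.x + first.width)
    return abs(gap) <= delta_width * first.height


first = Fragment(x=10.0, width=20.0, height=10.0)
second = Fragment(x=32.0, width=15.0, height=10.0)  # starts 2 points after first ends

print(within_delta_width(first, second, 0.3))   # 2 points fits within 3 points
print(within_delta_width(first, second, 0.05))  # 2 points exceeds 0.5 points
```

With a value of 0 in this sketch, only fragments that touch exactly are accepted, which mirrors the behavior described above.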
Delta Height
Defines the tolerance for the height and position difference between two target fragments. The higher the number, the
greater the accepted difference between the fragments' heights (the height of the tallest font character) and the greater
the accepted vertical distance between fragments. Exponents, for example, are positioned higher or lower than the surrounding text.
When this value is 0, no vertical shift is accepted between two fragments. When the value is 1, the second text fragment can
be shifted by as much as the height of the first fragment.
Accepted values range from 0 to 1. The default value is 0.15; recommended values are between 0.00 and 0.50.
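The vertical rule can be sketched as follows. The statement that a value of 1 allows a shift of up to the first fragment's height comes from the description above. Applying the same allowance to the difference in fragment heights, and the function and parameter names, are assumptions.

```python
def within_delta_height(
    first_baseline: float,
    first_height: float,
    second_baseline: float,
    second_height: float,
    delta_height: float,
) -> bool:
    """Return True if the second fragment is vertically close enough to the first.

    The allowed shift is delta_height times the first fragment's height.
    Assumption: the same allowance is applied to the height difference.
    """
    if not 0.0 <= delta_height <= 1.0:
        raise ValueError("delta_height must be between 0 and 1")
    allowed = delta_height * first_height
    shift = abs(second_baseline - first_baseline)
    height_difference = abs(second_height - first_height)
    return shift <= allowed and height_difference <= allowed


# Body text 10 points tall, followed by an exponent 6 points tall raised by 4 points.
print(within_delta_height(100.0, 10.0, 104.0, 6.0, 0.15))  # allowance is 1.5 points
print(within_delta_height(100.0, 10.0, 104.0, 6.0, 0.5))   # allowance is 5 points
```

In this sketch the exponent is rejected at the default of 0.15 and accepted at 0.5, which is the kind of case where raising the value may help.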
Font Delta Height
Defines the tolerance for the difference in average height of fonts in the two target fragments. The higher the number, the
greater the accepted difference in average font heights. The average font height is bigger in text written in uppercase than
in text written in lowercase.
At 0, the font size must be exactly the same between two fragments. At 1, a greater variance in font size is accepted.
Accepted values range from 0 to 1. The default value is 0.65; recommended values are between 0.60 and 1.00.
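One way to picture Font Delta Height is as a limit on the relative difference between the average font heights of two fragments. The sketch below assumes that interpretation. The function name and the relative-difference formula are hypothetical and are not taken from the software.

```python
def within_font_delta_height(
    first_avg_height: float,
    second_avg_height: float,
    font_delta_height: float,
) -> bool:
    """Return True if the average font heights are close enough.

    Assumption: the difference is measured relative to the larger of the
    two average heights, so the result always lies between 0 and 1.
    """
    if not 0.0 <= font_delta_height <= 1.0:
        raise ValueError("font_delta_height must be between 0 and 1")
    if first_avg_height <= 0 or second_avg_height <= 0:
        raise ValueError("average heights must be positive")
    larger = max(first_avg_height, second_avg_height)
    relative_difference = abs(first_avg_height - second_avg_height) / larger
    return relative_difference <= font_delta_height


# Uppercase text averaging 7 points next to lowercase text averaging 5 points.
print(within_font_delta_height(7.0, 5.0, 0.65))  # relative difference is about 0.29
print(within_font_delta_height(7.0, 5.0, 0.1))
```

Under this assumption, a value of 0 accepts only identical average heights, matching the description above.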
Gap
Defines how spaces between two fragments are processed. If the space between two fragments is too small, the text
extraction will sometimes eliminate that space and count the two fragments as a single word. To resolve this, change the
Gap setting. The lower this value, the higher the chance of a space being added between two characters. A value that is
too low may add spaces where they do not belong.
Accepted values range from 0 to 0.5. The default value is 0.3; recommended values are between 0.25 and 0.40.
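The accepted and recommended ranges of the four factors can be collected in one table for quick checking before values are changed. The Python sketch below uses only the numbers listed above. The FACTORS table and the check_factor function are hypothetical helpers, not part of PlanetPress Workflow.

```python
# (minimum, maximum, default, recommended low, recommended high) for each factor
FACTORS = {
    "Delta Width": (0.0, 1.0, 0.3, 0.05, 0.30),
    "Delta Height": (0.0, 1.0, 0.15, 0.00, 0.50),
    "Font Delta Height": (0.0, 1.0, 0.65, 0.60, 1.00),
    "Gap": (0.0, 0.5, 0.3, 0.25, 0.40),
}


def check_factor(name: str, value: float) -> str:
    """Classify a value as 'invalid', 'accepted' or 'recommended'."""
    low, high, _default, rec_low, rec_high = FACTORS[name]
    if not low <= value <= high:
        return "invalid"
    if rec_low <= value <= rec_high:
        return "recommended"
    return "accepted"


# Report how each default value is classified.
for name, (_, _, default, _, _) in FACTORS.items():
    print(name, default, check_factor(name, default))

print(check_factor("Gap", 0.6))          # outside the accepted range of 0 to 0.5
print(check_factor("Delta Width", 0.9))  # accepted, but above the recommended range
```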
Logging User Options
Logging user options control the level of detail added to the PlanetPress Suite Workflow Tools log file. Since log files cover 24
hours of operation, choosing to log every task performed by PlanetPress Suite Workflow Tools may result in the creation of
The PlanetPress Suite Workflow Tools Configuration Program