8.8
Table Of Contents
- Table of Contents
- Welcome to PReS Workflow 8.8
- System Requirements
- Basics
- Features
- The Nature of PReS Workflow
- About Branches and Conditions
- Configuration Components
- Connect Resources
- About Data
- Data Repository
- About Documents
- Debugging and Error Handling
- The Plug-in Bar
- About Printing
- About Processes and Subprocesses
- Using Scripts
- Special Workflow Types
- About Tasks
- Working With Variables
- About Configurations
- About related programs and services
- The Interface
- Customizing the Workspace
- PReS Workflow Button
- The Configuration Components Pane
- Components Area Sections
- Processes and Subprocesses
- Manipulate Global Variables
- Connect Resources
- PPS/PSM Documents
- Associate Documents and PReS Printer Queues
- Using the Clipboard and Drag & Drop
- Rename Objects in the Configuration Components Pane
- Reorder Objects in the Configuration Components Pane
- Grouping Configuration Components
- Expand and Collapse Categories and Groups in the Configuration Components Pane
- Delete Objects and Groups from the Configuration Components Pane
- Other Dialogs
- The Debug Information Pane
- The Message Area Pane
- The Object Inspector Pane
- The Plug-in Bar
- Preferences
- Other Preferences and Settings
- General appearance preferences
- Object Inspector appearance preferences
- Configuration Components Pane appearance preferences
- Default Configuration behavior preferences
- Notification Messages behavior preferences
- Sample Data behavior preferences
- Network behavior preferences
- PlanetPress Capture preferences
- OL Connect preferences
- PDF Text Extraction Tolerance Factors
- General and logging preferences
- Messenger plugin preferences
- HTTP Server Input 1 plugin preferences
- HTTP Server Input 2 plugin preferences
- LPD Input plugin preferences
- Serial Input plugin preferences
- Telnet Input plugin preferences
- PReS Fax plugin preferences
- FTP Output Service preferences
- PReS Image preferences
- LPR Output preferences
- PrintShop Web Connect Service preferences
- Editor Options
- The Process Area
- Zoom In or Out within Process Area
- Adding Tasks
- Adding Branches
- Edit a Task
- Replacing Tasks, Conditions or Branches
- Remove Tasks or Branches
- Task Properties Dialog
- Cutting, Copying and Pasting Tasks and Branches
- Moving a Task or Branch Using Drag-and-Drop
- Ignoring Tasks and Branches
- Resize Rows and Columns of the Process Area
- Selecting Documents in Tasks Links
- Highlight a Task or Branch
- Undo a Command
- Redo a Command
- The Quick Access Toolbar
- The PReS Workflow Ribbon
- The Task Comments Pane
- Additional Information
- Copyright Information
- Legal Notices and Acknowledgements
PDF Text Extraction Tolerance Factors
When extracting text from a PDF (for example, through a data selection), a lot more happens in
the background than what can be seen on the surface. Reading a PDF file for text will generally
return text fragments, separated by a certain amount of space. Sometimes the text will be
shifted up or down, spacing will be different, etc. In some cases, every letter is considered to be
a different fragment.
Text formatting features such as kerning, bold, exponents, etc., may cause these fragments to
be treated as separate even if, to the naked eye, they obviously belong together.
The PDF Text Extraction Tolerance Factors are used to modify the behavior of data selections
made from PDF data files from within PReS Workflow. Each factor available in this window
determines whether two fragments of text in the PDF should be part of the same data selection.
Warning
The default values are generally correct for the vast majority of PDF data files. Only
change these values if you understand what they are for.
Delta Width
Defines the tolerance for the distance between two text fragments, either positive (space
between fragments) or negative (kerning text where letters overlap). When this value is at 0, the
two fragments will need to be exactly one beside the other with no space or overlap between
them.
When this value is at 1, a very large space or overlap will be accepted. This may cause "false
positives": separate words and text blocks may be considered as a single word if the value
is too high.
Accepted values range from 0 to 1. The default value is 0.3; recommended values are between
0.05 and 0.30.
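The effect of Delta Width can be illustrated with a minimal Python sketch. It is not the algorithm PReS Workflow uses internally: the `Fragment` class, the `same_selection` and `merge_fragments` functions, and in particular the choice to scale the tolerance by the font size are all assumptions made for illustration. The sketch only shows how a tolerance of 0 requires fragments to touch exactly, while a tolerance of 1 can merge separate words.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical text fragment; coordinates are in points."""
    text: str
    x_start: float    # left edge of the fragment
    x_end: float      # right edge of the fragment
    font_size: float


def same_selection(left: Fragment, right: Fragment, delta_width: float) -> bool:
    """Return True if the gap or overlap between two fragments is tolerated."""
    # Positive gap: space between fragments. Negative gap: overlap (kerning).
    gap = right.x_start - left.x_end
    # Assumption: the 0..1 factor is interpreted relative to the font size.
    allowed = delta_width * left.font_size
    return abs(gap) <= allowed


def merge_fragments(fragments: list[Fragment], delta_width: float = 0.3) -> list[str]:
    """Join neighbouring fragments whose distance is within the tolerance."""
    merged: list[Fragment] = []
    for frag in sorted(fragments, key=lambda f: f.x_start):
        if merged and same_selection(merged[-1], frag, delta_width):
            prev = merged[-1]
            merged[-1] = Fragment(
                prev.text + frag.text,
                prev.x_start,
                max(prev.x_end, frag.x_end),
                prev.font_size,
            )
        else:
            merged.append(frag)
    return [f.text for f in merged]


# Illustrative data: "Inv" and "oice" are 0.5 pt apart, "Total" is 8 pt further.
sample = [
    Fragment("Inv", 0.0, 15.0, 10.0),
    Fragment("oice", 15.5, 35.0, 10.0),
    Fragment("Total", 43.0, 68.0, 10.0),
]

strict = merge_fragments(sample, delta_width=0.0)    # fragments must touch exactly
default = merge_fragments(sample, delta_width=0.3)   # small gaps are tolerated
loose = merge_fragments(sample, delta_width=1.0)     # separate words are merged
```

With this sample data, a value of 0 keeps all three fragments apart, 0.3 joins the two halves of "Invoice" while leaving "Total" separate, and 1 produces the false positive described above by merging both words into one.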
Delta Height
Defines the tolerance for the height and position difference between two target fragments. The
higher the number, the more difference between the fragment's height (the tallest font