2019.1

Settings for a PDF File
PDF files have a clear and unmovable delimiter: pages. So, the Input Data settings are not
used to set delimiters. Instead, these options determine how words, lines and paragraphs are
detected when you select content in the PDF to extract data from it.
For an explanation of all the options, see: "PDF file Input Data settings" on page 326.
Settings for a database
Databases all return the same type of information. Therefore the Input Data options for a
database refer to the tables inside the database. Clicking on any of the tables shows the first
line of the data in that table.
If the database supports stored procedures, including inner joins, grouping and sorting, you can
use custom SQL to make a selection from the database, using whatever language the
database supports. The query may contain variables and properties, so that the selection will
be dynamically adjusted each time the data mapping configuration is actually used in a
Workflow process; see "Using variables and properties in an SQL query" on page 337.
For an explanation of all the options, see: "Database Input Data settings" on page 327.
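As a rough illustration of what a custom selection with an inner join, grouping, sorting and one variable part can look like, here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names and the `select_for_region` helper are hypothetical, and the `?` placeholder is sqlite3's own parameter syntax, not the variable syntax of the data mapping configuration (for that, see "Using variables and properties in an SQL query").

```python
import sqlite3

# In-memory database with two hypothetical tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada', 'EU'), (2, 'Bob', 'EU'), (3, 'Cy', 'US');
INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5), (4, 3, 99.0);
""")

# Inner join + grouping + sorting; the region is supplied at run time.
QUERY = """
SELECT c.name, SUM(o.total) AS order_sum
FROM customers AS c
INNER JOIN orders AS o ON o.customer_id = c.id
WHERE c.region = ?
GROUP BY c.name
ORDER BY order_sum DESC
"""


def select_for_region(region):
    """Run the selection with the given region filled in."""
    return conn.execute(QUERY, (region,)).fetchall()


print(select_for_region("EU"))
```

Because the filter value is passed in each time the query runs, the same query text yields a different selection per run, which is the idea behind a dynamically adjusted selection.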
Settings for a text file
Because text files come in many different shapes and sizes, there are a lot of input data settings
for these files. For example, you can add or remove characters in lines if the file has a header
you want to get rid of or unwanted characters at the beginning, and you can set a line
width if you are still working with old line printer data.
It is important that pages be defined properly. This can be done either by using a set number of
lines or by using a string of text (for example, the character "P") to detect on the page. Be aware
that this is not a Boundary setting; it detects each new page, not each new record.
For an explanation of all the options, see: "Text file Input Data settings" on page 328.
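The two ways of defining pages in a text file can be sketched as follows. This is only an illustration of the idea, not the product's implementation; the function names and the sample marker text are assumptions made for the example.

```python
from typing import Iterable, List


def pages_by_line_count(lines: Iterable[str], lines_per_page: int) -> List[List[str]]:
    """Start a new page after every `lines_per_page` lines."""
    if lines_per_page <= 0:
        raise ValueError("lines_per_page must be positive")
    rows = list(lines)
    return [rows[i:i + lines_per_page] for i in range(0, len(rows), lines_per_page)]


def pages_by_marker(lines: Iterable[str], marker: str) -> List[List[str]]:
    """Start a new page whenever a line contains `marker`."""
    pages: List[List[str]] = []
    for line in lines:
        # The very first line always opens a page, even without the marker.
        if marker in line or not pages:
            pages.append([])
        pages[-1].append(line)
    return pages


sample = ["PAGE 1", "first line", "second line", "PAGE 2", "third line"]
print(pages_by_line_count(sample, 2))
print(pages_by_marker(sample, "PAGE"))
```

Both functions only split the data into pages. Grouping pages into records is a separate step, in line with the note that page detection is not a Boundary setting.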
Settings for an XML file
XML is a special case because XML files can have a theoretically unlimited number
of structure types. The input data settings have three options that determine at which node level
a new record is created. You can:
• Select an element type to create a new delimiter every time that element is encountered.
• Enter an XPath to create a delimiter based on the node name of elements.
• Use the root node. If there is only one top-level element, there will only be one record
  before the Boundaries are set.
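To make the difference between the three options concrete, the sketch below counts how many delimiters each approach would produce for a small, made-up XML document. It uses Python's standard xml.etree.ElementTree module, which supports only a limited XPath subset; the sample data and variable names are assumptions for the example, and the product's own XPath handling may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE = """<invoices>
  <invoice id="1"><item>A</item><item>B</item></invoice>
  <invoice id="2"><item>C</item></invoice>
</invoices>"""

root = ET.fromstring(SAMPLE)

# Option 1: an element type; every <item> element, at any depth, is a delimiter.
by_element = list(root.iter("item"))

# Option 2: an XPath; here, every <invoice> directly under the root.
by_xpath = root.findall("./invoice")

# Option 3: the root node; a single top-level element gives a single delimiter.
by_root = [root]

print(len(by_element), len(by_xpath), len(by_root))
```

The same file yields a different number of delimiters depending on the node level chosen, so the choice determines how fine-grained the data is before the Boundaries are set.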