HP Neoview Transporter User Guide
HP Part Number: 622791-002
Published: August 2010
Edition: Release 2.
© Copyright 2010 Hewlett-Packard Development Company, L.P.

Legal Notice
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor’s standard commercial license. The information contained herein is subject to change without notice.
Table of Contents

About This Document.........................................................................................................9
    Intended Audience.........................................................................................................9
    New and Changed Information in This Edition............................................................9
    Related Documentation.....................................................................
5 Control File Organization and Syntax.......................................................................27
    Creating a Control File.................................................................................................27
    Control File Limits........................................................................................................27
    Control File Organization and Syntax..................................................................
    truncate.........................................................................................................................65
    url..................................................................................................................................65
    user................................................................................................................................66
7 Load and Extract Operations ......
A Control File Examples..................................................................................................85
B Enhanced Initial Load Control File Examples..........................................................115
    Examples ....................................................................................................................116
C Control File Editor and Control File Generator......................................................119
    Control File Editor .
List of Tables

1-1 Transporter Client Software Components.....................................................................................16
3-1 Exception Tables: Columns Generated by Transporter.................................................................20
4-1 Encryption Files................................................................................

List of Examples

5-1 Supported Scalar Expressions ......................................................................................................44
9-1 Examples of Custom Log File Locations.......................................................................................80
9-2 Sample HP_TRANSPORTER_BASE_JOB_STATS File.........................................................
About This Document

This manual describes how to install and manage HP Neoview Transporter. This product provides processes and commands for loading data into your Neoview platform or extracting data from it. You can install and execute the Transporter client from a system running Microsoft® Windows, Red Hat Enterprise Linux 5, AIX® 5L version 5.3, SunOS 5.10 (Solaris 10), or HP-UX® 64 for the Itanium platform. Installation instructions are included in the README for the HP Neoview Transporter Java Client.
Neoview Query Guide
Information about reviewing query execution plans and investigating query performance of Neoview databases.

Neoview Transporter User Guide
Information about processes and commands for loading data into your Neoview platform or extracting data from it.
• Reference Mapping Tables for Neoview Character Sets
A hyperlinked collection of East Asian characters supported by Neoview character set functionality.

Neoview SQL Reference Manual
Reference information about the syntax of SQL statements, functions, and other SQL language elements supported by the Neoview database software.

Neoview Client and Platform Compatibility Reference
Information about compatibility of client products and the version of the Neoview release installed on the Neoview platform.
A group of items enclosed in brackets is a list from which you can choose one item or none. The items in the list can be arranged either vertically, with aligned brackets on each side of the list, or horizontally, enclosed in a pair of brackets and separated by vertical lines. For example:

DROP SCHEMA schema [CASCADE] [RESTRICT]
DROP SCHEMA schema [ CASCADE | RESTRICT ]

{ } Braces
Braces enclose required syntax items. For example:

FROM { grantee[, grantee]...
DAY (datetime-expression)
DAY(datetime-expression)

If there is no space between two items, spaces are not permitted. In this example, no spaces are permitted between the period and any other items:

myfile.sh

Line Spacing
If the syntax of a command is too long to fit on a single line, each continuation line is indented three spaces and is separated from the preceding line by a blank line. This spacing distinguishes items in a continuation line from items in a vertical list of selections.
1 Overview

The HP Neoview Transporter client provides a high-speed load and extract facility to move data to (load) and from (extract) the HP Neoview platform.
• “Control File Generator” (page 121)
• “Control File Editor” (page 119)

For detailed information see:
“Control File Organization and Syntax” (page 27)
“Control File Options” (page 53)

Client Components

Table 1-1 Transporter Client Software Components

Transporter Component Type    Definition
Transporter Java Client       A Java application that has been validated on these platforms: Linux, Windows, HP-UX for Itanium, Solaris 10, and AIX.
2 Transporter Configuration

This chapter addresses the following topics:
• “Client Directory Structure” (page 17)
• “The Transporter API” (page 17)

Client Directory Structure
Installation creates the environment variable NVTHOME for the default location of the Transporter base directory. The directory specification follows the conventions of your loader platform.
NOTE: The Transporter API is not supported on a Windows platform.
3 Functions and Features

In addition to standard load and extract, Transporter provides support for these additional features and functions:
• “Enhanced Initial Load (EIL)” (page 19)
• “Staging Tables” (page 20)
• “Parallel Streams” (page 21)
• “Parallel Query Extract” (page 21)
• “Prehashing” (page 21)
• “JMS Trickle Feed For Data Loads” (page 21)
• “Transporter Job Statistics on the Neoview Platform” (page 21)
• “Reflexive Update” (page 21)
• “Identity Columns” (page 22)
• “Data Compression” (page 22)
Table 3-1 Exception Tables: Columns Generated by Transporter

Column Name          Column Type
_excp_err_num        INTEGER DEFAULT NULL
_excp_object_name    VARCHAR(776) DEFAULT NULL
_excp_partn_name     CHAR(35) DEFAULT NULL
_excp_file_num       INTEGER DEFAULT NULL
_excp_rec_num        LARGEINT DEFAULT NULL
_excp_load_id        LARGEINT DEFAULT NULL

NOTE: Transporter creates exception tables automatically.
• You cannot perform recovery if a job fails during transfer from the staging table to the target table; you must restart the entire operation.
• If recovery is not performed, the staging table is not deleted. Check the daily log file in NVTHOME/log/java to find the filename and drop the leftover staging table. The staging table is deleted automatically if recovery is performed.

For information about recovery, see “Job Recovery” (page 80).
in the “Maps” (page 41) section of the control file to perform a reflexive update. See “Control File: Reflexive Update” (page 101).

Reflexive Update Considerations
• Reflexive updates apply only to columns with numerical values.
• The input value provided for a reflexive update must match the range of values that are legal for the target column.
• You cannot use reflexive update on primary key columns.
• The update map entry option defaults to true for all non-primary key columns.
Using this feature increases pipeline utilization by allowing Transporter to make multiple data connections for loading data to a set of partitions simultaneously. In addition, you might find performance improvement when executing initial loads on empty tables during nonaudited or stage table loads. Multi-session side-tree inserts are disabled by default. To globally enable multi-session side-tree inserts, set this property in the nvt.
4 Security

Because load and extract jobs connect to databases and access password-protected resources, Transporter must keep track of user IDs and passwords. This chapter addresses the following topics:
• “Password Encryption” (page 25)
• “Encryption Files” (page 25)
• “The Encryption Utility — nvtencrsrv” (page 26)

Password Encryption
Because Transporter accesses other systems, it transmits user IDs and passwords. Transporter uses PKCS #5V1.
%NVTHOME%\utils\nvtencsrv -o install

The Encryption Utility — nvtencrsrv
nvtencrsrv is a utility that installs the security directory and related files, adds encrypted words to the password encryption file, and deletes encrypted words when they are no longer needed. Only the administrator of the client platform can use this utility. Invoke nvtencrsrv with the suffix corresponding to the platform you are using. For example, if you used the jtrans.
5 Control File Organization and Syntax The control file is a text file that tells Transporter how you want your data moved from source to target for either loading or extracting purposes.
Table 5-1 Control File Organization

Control File Section     Options
“Version” (page 29)      N.A.
“Options” (page 30)      Global options for all Transporter jobs controlled by this file.
Table 5-1 Control File Organization (continued)

Control File Section     Options
“Maps” (page 41)         Maps data to another format when moved from source to target. Unique options for this section: condition, update
“Sources” (page 45)      Location of sources and targets for data. These global options can be overridden in this section of the control file.
A control file always begins with the version attribute. No other sections can precede the version attribute. If a control file begins with the %include statement, the first included file that does not begin with another %include statement must begin with the version attribute. In other words, the first non-%include statement that Transporter sees must be the version attribute. versionnumber is the major version, followed by optional minor and patch versions.
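Putting the ordering rule together, the overall shape of a control file looks like this sketch. Only the placement of the version attribute is a hard requirement stated here; the section bodies are elided and the comments are illustrative:

```
version 1;

options {
    # global options for all jobs in this file
}
typeformats {
    # named date/time/interval format strings
}
dataformats {
    # source and target record layouts
}
maps {
    # source-to-target field mappings
}
sources {
    # files, pipes, JDBC tables, JMS queues
}
jobs {
    # the load and extract jobs themselves
}
```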
• “encoding” (page 55)
• “endpart” (page 57)
• “endseq” (page 57)
• “errors” (page 58)
• “exceptiontable” (page 58)
• “faileddatafile” (page 58)
• “forcestaging” (page 59)
• “inlinesort” (page 59)
• “loadid” (page 59)
• “multipart” (page 60)
• “noaudit” (page 61)
• “nullstring” (page 61)
• “operation” (page 62)
• “parallel” (page 62)
• “parallelstreams” (page 62)
• “password” (page 63)
• “retries” (page 63)
• “rowsetsize” (page 63)
• “sorted” (page 63)
• “startpart” (page 64)
• “startseq” (
formatstring
Defines the pattern for this field. For a load operation, formatstring tells Transporter how to interpret source file values. For an extract operation, formatstring tells Transporter how to write target file values.
• Character data types
  Specify “default”. It is the only format supported.
• Integral data types
  — decimal — values are interpreted as decimal values with an optional leading sign character
  — octal — values are interpreted as octal values with an optional leading sign character.
• Date/Time data types
Use a combination of the following patterns to specify a date or time format:
— %b The abbreviation for the month name in the current locale (for example, “Feb”, in the POSIX locale).
— %C The first two digits of the year (19 or 20).
— %d The two-digit day of the month as a number (01 to 31).
— %e The day of the month in a two-digit, right-justified, blank-filled field (1 to 31).
— %H The hour in the 24-hour clock representation (00 to 23).
— backslash (\\)

Period (.) is the only supported separator between integral and fractional seconds.
The following are defaults and examples:
— Date values
  ◦ Default: %C%y-%m-%d
  ◦ Example: The input characters “2007–10–05” can be processed without a format definition and is interpreted as October 5, 2007.
— Time values
  ◦ Default: %H:%M:%s
  ◦ Example: The input characters “13:32:00” can be processed without a format definition and is interpreted as 1:32 p.m.
— Timestamp values
  ◦ Default: %C%y-%m-%d %H:%M:%S
  ◦ Example: The input characters “2007–10–05 13:32:00” can be processed without a format definition and is interpreted as 1:32 p.m. on October 5, 2007.
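As a quick sanity check of the default date pattern, the %C%y pair (century plus two-digit year) is what C-style strptime spells as a single %Y. A short Python sketch, offered as an analogy rather than the Transporter parser itself:

```python
from datetime import datetime

# Transporter's default date pattern is %C%y-%m-%d; Python's strptime
# expresses the century + two-digit-year pair (%C%y) as a single %Y.
parsed = datetime.strptime("2007-10-05", "%Y-%m-%d")
print(parsed.year, parsed.month, parsed.day)  # 2007 10 5
```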
• Interval data types
Interval data types require a formatstring. There is no default for interval fields. These field specifiers are allowed:
— For year-month ranges:
  ◦ %Y — year
  ◦ %M — month
— For day-time ranges:
  ◦ %D — day
  ◦ %H — hour
  ◦ %M — minutes
  ◦ %S — seconds (can include fractional seconds)
The values for specifiers must be positive integers. Specifiers must appear in order from day to seconds. For example, %M (minutes) cannot precede %H (hours).
Table 5-2 Typeformats: format_definition Date/Time Format String Examples (continued)

Description: Specify a formatstring for a timestamp field with mixed delimiters, in the format of "10/1/2007 12:45"
Example: "%m/%d/%y %H:%M"

Description: Two typeformats of datatype date, with formats of "20070131" and "2007-01-31", respectively.
Example:
typeformats {
    formatname date "%C%y%m%d",
    formatname date1 "%C%y-%m-%d"
}

Dataformats
The dataformats section describes the data format of sources and targets.
optionally qualified by {' | "}
Specifies a quoting character for each data field. For information, see “Quote Processing For Load Operations” (page 77)

field-definitions
See Dataformats: Field Definitions for Source and Target (page 39).

Dataformats: Fixed Length Format
In a fixed-length record file, the width of each field in the record is consistent across all records. Records are separated by one or more system EOL characters.
Dataformats: Neoview SQL Table Format

formatname sql fields fields-definition

formatname
The name you create for this dataformat. The name must be unique among existing dataformats.

Dataformats: Field Definitions for Source and Target

field-definitions: ( field-definition [, field-definition ]... )
field-definition: fieldname datatype [format formatname] [options field-options]
field-options: ( option_name = option_value [, option_name = option_value ]...
a matching name is found. Subsequent columns with the matching name are not accessible within this dataformat. There are two workarounds for this problem: — Derived columns of the query, for example, the results of Neoview SQL expressions and aggregate functions, by default are identified by the name “(EXPR)”. If multiple derived columns are produced by a single query, use an AS clause in the query text to associate a name with each derived column.
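Following the grammar above, a hypothetical SQL-table dataformat might look like this. The field and format names are invented for illustration; date_frmt would have to be defined in the typeformats section:

```
dataformats {
    # hypothetical layout matching a three-column target table
    emp_fmt sql fields (
        emp_id integer,
        emp_name varchar,
        hire_date date format date_frmt
    )
}
```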
Maps
The maps section describes how one data format is mapped to another when data is moved from source to target.

maps { map-definition [, map-definition ]... }

map-definition: mapname from sourceformat to targetformat

field-mappings: ( field-mapping-definition [, field-mapping-definition ]... )
fieldname_target, fieldname_source
fieldname_target and fieldname_source must refer to names defined in the source dataformat and target dataformat, respectively. For more information about dataformats, refer to “Dataformats” (page 37). For a simple mapping, where the source field and target field are the same and there are no specific map functions required (for example, expressions and identity columns), field-source-ref can be omitted.
Table 5-3 Datatype Mappings (continued)

Source Field of Type...    Can be Mapped to a Target Field of Type...
timestamp                  any character type, date, time, timestamp
interval                   interval

For interval mappings, both fieldname_source and fieldname_target must have an associated typeformat. Both typeformats must either be year-month, or day-time ranges.

[add | subtract] are used for reflexive update and apply only to update and upsert operations.
Example:

EMP_ID = identity loader

Asks the loader to generate a key for target field EMP_ID.

scalar-expression
A simple expression composed of a fieldname_source, constant, and SQL functions connected by simple arithmetic operators. This expression must be syntactically valid for fitting into a corresponding SQL insert, update, or upsert statement. The maximum length of a scalar-expression is 256 characters. This option is not supported for extract operations and is not allowed on primary key columns.
update = “true” | “false”
This option is only applicable for load operations where the operation type is update or upsert. If update is true, this field becomes part of the UPDATE clause of the generated SQL statement. This option cannot be set to “true” for primary key columns if the operation is upsert; Transporter returns an error. The default is “false” for key columns and “true” for non-key columns. The condition and update options are ignored for insert operations.
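A hypothetical map tying the pieces above together. The source and target format names are invented; the identity-loader and simple-mapping forms follow the examples in this section:

```
maps {
    map1 from emp_src_fmt to emp_tgt_fmt
    (
        # loader-generated key, as in the EMP_ID example above
        EMP_ID = identity loader,
        # simple mappings: source and target field names match,
        # so field-source-ref is omitted
        EMP_NAME,
        HIRE_DATE
    )
}
```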
sourcename file filename [options source-options]
Specifies a data file as the data source/target. filename specifies the location of the file, including both the path and the file name. Transporter assumes the data file is uncompressed.

Example: Specify a plain data file, datafile_1.dat, in the current directory with source name data1:

data1 file "./datafile_1.dat"

sourcename pipe pipename [options source-options]
Specifies a named pipe as the data source/target. pipename is the name of the pipe.
— [FIRST/LAST N] syntax
— SQL table-valued functions
— stream access to tables

Examples
• Specify a source named “table1” to access SQL table sch.dept:
table1 jdbc table sch.dept
• Specify a source named “table2” to access an SQL table with a delimited schema and table name:
table2 jdbc table "SQL"."MyTable"
• Specify a source named “query1” to access an SQL join query:
query1 jdbc sql "select * from sch.
• “url” (page 65)
• “user” (page 66)

Options for jms sources:
• “password” (page 63)
• “retries” (page 63)
• “tenacity” (page 65)
• “timeout” (page 65)
• “url” (page 65)
• “user” (page 66)

Jobs
The jobs section specifies a list of Transporter jobs. A job is a collection of job entries. Jobs are processed serially, one at a time. This section starts with the keyword jobs followed by a list of job definitions surrounded by curly braces. The Java client accepts control files containing multiple jobs.
options
one or more occurrences of job-options, separated by a comma (,). The last job-options should not use the comma separator. The options you define at the job level override global options defined in the options section.
options job-entry-options
one or more occurrences of job-entry-options, separated by a comma (,). The last entry-option should not use the comma separator. Options defined at the job entry level override global options defined in the options section and override options defined at the job level. The following job entry options are supported:
• “baddatafile” (page 53)
• “faileddatafile” (page 58)

Comments
You can embed single-line and multiple-line comments anywhere in a control file.
Supported Data Types

Transporter supports these data types:
• Character types
  — char[acter]
  — varchar
• Integral types
  — byte
  — int[eger]
  — byteint
  — smallint
  — largeint
  — longint
  — numeric
  — decimal
• Floating point types
  — real
  — float
  — double
• Date/time types
  — date
  — time
  — timestamp
  — datetime
• Interval types
  — interval
6 Control File Options

autotran
autotran = "true" | "false"

The default is “false”, except for pipe sources, in which case the default is “true”. For a load operation: when this option is set to “false”, Transporter executes each statement in a global transaction. For a load operation: when this option is set to “true”, Transporter uses an independent transaction for each insert.

NOTE: If a job fails with autotran set to true, it cannot be recovered.
This option is used for a load operation and uses a run-length encoding (RLE) algorithm to compress data during staging table and nonaudited loads. One of these options must be set to true for the load job:
• “forcestaging” (page 59)
• “noaudit” (page 61)
• “inlinesort” (page 59)
This option is ignored for extract operations.

crossoveryear
crossoveryear = number

There is no default value. If crossoveryear is not specified, this option is ignored.
The default value is 1000. The minimum value is 1. The maximum value is 2147483647. NOTE: These values are advisory only. While this number can be considered a trigger for shutting down a job, Transporter continues to process data already in progress to completion. For this reason, there may be additional discard records beyond the number specified.
TIP: The Transporter NVTHOME/utils directory contains these programs that you can compile and execute to display character set encodings: • CharsetsSupported.java, which will display all the character set encodings supported by your Java installation. • DefaultEncoding.java, which will display the default encoding used by Java on your client system.
◦ ' (apostrophe)
◦ ( (open parenthesis)
◦ ) (close parenthesis)
◦ * (asterisk)
◦ + (plus sign)
◦ , (comma)
◦ - (hyphen)
◦ . (period)
◦ / (slash mark)
◦ The numbers 0 – 9
◦ : (colon)
◦ ; (semi-colon)
◦ < (left angle-bracket)
◦ > (right angle-bracket)
◦ = (equal sign)
◦ ? (question mark)

This property does not apply when loading to or extracting from UCS2 columns.
For a load operation when a pipe is read, Transporter discards any records read until the startseq record is seen. Transporter stops reading records once the endseq record has been seen. For an extract operation, Transporter writes the startseq record into the pipe before the first data record. After all records have been written, the endseq record is written. The endseq record is written even if errors are encountered.

errors
errors = number

The default value is 1000. The minimum value is 1.
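The startseq/endseq filtering described above for pipe loads can be sketched as follows. This is a plain-Python analogy of the documented behavior, not Transporter code:

```python
def filter_pipe_records(records, startseq, endseq):
    """Discard records until startseq is seen; stop once endseq is seen.

    Mirrors the documented load-side behavior for pipe sources.
    """
    out, started = [], False
    for rec in records:
        if not started:
            # everything up to and including the startseq record is discarded
            started = (rec == startseq)
            continue
        if rec == endseq:
            break
        out.append(rec)
    return out

print(filter_pipe_records(["junk", "BEGIN", "r1", "r2", "END", "r3"],
                          "BEGIN", "END"))  # ['r1', 'r2']
```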
The default value is log/java/jobname_failedrecord.dat. For a load operation, this option specifies where to put source records that have a valid format but could not be written to the Neoview database. For example, a record that fails a data conversion step or violates a uniqueness constraint. Load records written to faileddatafile appear in the same format as the source file. This option is ignored for extract operations.

forcestaging
forcestaging = "true" | "false"

The default is “false”.
where number is a non-negative integer of type LARGEINT. The default is 0 (zero). In this case, Transporter automatically generates a load ID at the start of the load operation. The loadid provides a way to uniquely identify exception records for a particular load operation when errors occur during the load.

multipart
multipart = "true" | "false"

The default is “false”.
noaudit
noaudit = "true" | "false"

The default is false. For a load operation, this option specifies whether Transporter should use non-transactional database operations for improved performance. If the “operation” (page 62) option is not “insert”, the noaudit setting is ignored and noaudit is implicitly set to false. If parallel = true for a job, all targets in that job must have the same setting for noaudit. This option is ignored for extract operations.
For an extract operation:
• This option specifies the character string that should be written to targets to represent NULL values.
• For delimited data formats, if nullstring is not specified, Transporter does not write any data characters to the target field to represent a NULL value.
• For fixed-width data formats, if nullstring is not specified, Transporter rejects any retrieved record containing NULL values and writes the record to the baddatafile file.
For a Neoview SQL extract operation:
• Transporter internally limits the maximum value of parallelstreams to the number of CPUs on the Neoview system.
• If this option is not specified, Transporter sets parallelstreams to the number of CPUs on the Neoview system divided by four.
For a load operation where the “operation” (page 62) is “insert”, if the “noaudit” (page 61) option is true, the sorted option must also be true; otherwise, an error is returned. This option is ignored for extract operations.
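Combining the constraints above, an options block for a nonaudited insert load might look like this sketch. The values are illustrative; the key point is that sorted must be "true" whenever noaudit is "true" for inserts:

```
options {
    operation = "insert",
    noaudit = "true",
    # sorted is required when noaudit = "true" for insert loads
    sorted = "true",
    errors = 1000,
    discards = 500
}
```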
If this option is not specified, Transporter determines an optimized value based on heap size (memory allocated), table being loaded, number of columns, types of columns, and the server memory utilization of the client.

teamsize
teamsize = 1 | 2 | 4

The default is 1. This option specifies the number of Neoview server processes, or connections, that you want to have insert data on a single parallel stream operating on a set of partitions.
• JDBC connections to the Neoview system
• JMS sources

There is no default value. The Transporter Java client uses this URL for its internal database connections. A JDBC connection URL must be of the form jdbc:hpt4jdbc://hostname:port/. See the Neoview JDBC Type 4 Driver Programmer’s Reference for complete details on JDBC connection URLs.

Example: a JDBC connection URL for a Neoview system named neo0101.domain.com:

jdbc:hpt4jdbc://neo0101.domain.
7 Load and Extract Operations

The Transporter Client performs load and extract operations based on the content of a control file.
resubmitting it, and you want Transporter to perform recovery on the job. For more information about -recover, refer to “The -recover Command Line Option” (page 80).

Starting the Client on UNIX-type Systems
Use this shell command to start the Transporter Client. You specify the control file depending on where the file resides. If the control file is not found, Transporter exits with an error.
• The control file is in NVTHOME/conf:

jtrans.sh {-run | -recover} MyControlFile.
The Transporter Job The units that make up a control file are called jobs. A job is a load or extract operation consisting of several job entries.
• Set “parallelstreams” (page 62) = n
• Set “multipart” (page 60) = true

CAUTION: If multipart = false, and parallelstreams is enabled, Transporter still moves data off the Neoview platform in multiple streams. However, all the streams merge into one file or named pipe. Ordering is not maintained if the files are merged for an ORDER BY query. See “multipart” (page 60) for details.
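The two settings above, expressed as a control-file options fragment (the stream count is illustrative):

```
options {
    # one extract stream per output part; ordering is preserved
    # within each part only
    parallelstreams = 4,
    multipart = "true"
}
```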
Requirements
These are the requirements to use the Trickle Feed feature:
• ActiveMQ (JMS provider), which is included with the Java client package
• A separate (and dedicated) instance of the Transporter Java client

A JMS trickle feed load is similar to loads from flat files or named pipes, except for the following differences:
• Workflow
When the Java client parses the control file and determines that the source is JMS, a trickle feed process known as a JMS Consumer is launched to consume data from the
• The set of targets (that you have set in the “Sources” (page 45) section) must be the same for all sources in the job.
• If multiple job entries specify the same target, either all mapnames specified by the job entries must be identical, or they must use the same targetformat, the same fieldmappings, and a sourceformat with identical field names and field order. See “Maps” (page 41) for details about mappings.
— All CHAR columns must be of compatible types, that is, either UCS2 or ISO88591, but not both in the same table.
— There are these differences in the display of data:
  ◦ If the data in the table is 0.00, it appears in the extract file as .00, the format used by SQL on the Neoview server.
8 Data Processing

This chapter addresses these topics:
• “Data For Load Operations” (page 75)
• “Processing Data for Load Operations” (page 75)
• “Delimited Data” (page 76)
• “Considerations for Character Data” (page 78)

Data For Load Operations
This section addresses how Transporter processes data for a load operation, and how it treats delimited and fixed-width data.
Fixed-Width Data for Load Operations
Transporter addresses and processes a fixed-width input field as follows, in this order:
1. If “nullstring” (page 61) is specified, determine whether the input field matches the nullstring value, ignoring leading and trailing white space. If it matches, the input is considered a NULL value.
2. If the target data type is not a character type, trim leading and trailing white space characters.
3.
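The first two steps above can be sketched in Python. This is an illustration of the documented order, not Transporter's implementation; char_target stands in for "the target data type is a character type":

```python
def interpret_fixed_width(field, nullstring=None, char_target=False):
    # Step 1: a field matching nullstring (ignoring surrounding
    # white space) is treated as NULL.
    if nullstring is not None and field.strip() == nullstring:
        return None
    # Step 2: non-character targets have white space trimmed.
    if not char_target:
        field = field.strip()
    return field

print(interpret_fixed_width("  NULL ", nullstring="NULL"))  # None
print(interpret_fixed_width("  42  "))                      # 42
print(interpret_fixed_width("  ab  ", char_target=True))    # field returned unchanged
```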
— For Windows platforms: CR/LF (carriage return/line-feed)
— For Unix-like systems: LF (line feed)
— For the Macintosh family: CR (carriage return)

If multiple record separators are specified, each one matches exactly one newline or carriage return (not a system end-of-line, EOL). Only if the record separator is a single system EOL will Transporter match a newline (Unix style), a carriage return/newline (DOS style), or a carriage return (Macintosh style).
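The single-system-EOL matching rule can be sketched as follows, using a regular expression as an analogy rather than Transporter's actual parser:

```python
import re

def split_records(data, separator="\n"):
    # A single system EOL separator matches any platform convention:
    # LF (Unix), CR/LF (DOS), or CR (Macintosh).
    if separator in ("\n", "\r\n", "\r"):
        return re.split(r"\r\n|\r|\n", data)
    # Any other separator must match exactly.
    return data.split(separator)

print(split_records("a\r\nb\rc\nd"))  # ['a', 'b', 'c', 'd']
```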
— To qualify by double-quotes ("):
optionally qualified by "
— To qualify by single-quotes ('):
optionally qualified by '

Considerations for Character Data
• For Load Operations:
— Character fields can include any characters valid for the character set of the target data type.
— If an input field has fewer characters than the target column for target data type CHAR, values written to the target table are padded with trailing spaces. This does not apply to VARCHAR.
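The CHAR padding behavior can be sketched as below. This is illustrative; the width would come from the target column definition:

```python
def to_char_column(value, width):
    # CHAR targets shorter than the column width are padded with
    # trailing spaces; VARCHAR values are stored as-is.
    return value.ljust(width)

print(repr(to_char_column("ab", 5)))  # 'ab   '
```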
9 Troubleshooting

This section covers job recovery and solutions to problems you might encounter.
• “Logs” (page 79)
• “Job Recovery” (page 80)
• “Job Statistics on the Neoview Platform” (page 81)

Logs
If errors occur during a data movement operation, Transporter logs appropriate messages to log files on the client, describing the errors encountered. Important messages are also logged to the console and EMS log on the Neoview platform.
Example 9-1 Examples of Custom Log File Locations

Use this command...                  ...to send log output here
-DNVT.instance-log-dir=/test         /test/log/java/DBTransporter.log
-DNVT.instance-log-dir=/home/test    /home/test/log/java/DBTransporter.log

Note that the Unix®-type platform shell translates log/java/DBTransporter.log to ./log/java/DBTransporter.log.

Logging to the Console
INFO messages are sent to standard out. WARN, ERROR and FATAL messages are sent to standard error.
CAUTION:
• You must use the original control file to recover load jobs. If the control file has changed, Transporter reports an error and exits.
• The original and unmodified data files must be available to Transporter during recovery.
• Do not attempt to recover a job that is still running.
• Do not attempt to recover a job that is already in the recovery process.

Job Statistics on the Neoview Platform
Job statistics on the Neoview platform are used by Transporter to facilitate job recovery.
— Time for next retry operation, if applicable
— End time of the job entry

Control File Metadata Tables
You can retrieve control file related data from these metadata tables on the Neoview platform:
• HP_TRANSPORTER.CONTROL_FILES — list of control files including the host name, absolute path, and modification time. Use this list to locate a particular control file.
Example 9-5 Identify All Job Entries that Successfully Completed Within the Last Seven Days

SELECT B.FILE_HOST, B.FILE_ABS_PATH, B.FILE_MODTIME, A.JOB_NAME,
       A.SOURCE_NAME, A.TARGET_NAME, A.START_TIME, A.END_TIME
FROM HP_TRANSPORTER.BASE_JOB_STATS A, HP_TRANSPORTER.CONTROL_FILES B
WHERE A.FILE_ID = B.FILE_ID
  AND A.END_TIME IS NOT NULL
  AND DATEDIFF(DAY, A.END_TIME, CURRENT_TIMESTAMP) < 7
ORDER BY B.FILE_HOST, B.FILE_ABS_PATH, B.FILE_MODTIME, A.JOB_NAME,
         A.SOURCE_NAME, A.TARGET_NAME, A.
Example 9-7 Identify All Jobs Executed From Host abc Using the Latest Version of Control File /home/control_files/ControlFile.txt

SELECT DISTINCT A.JOB_NAME
FROM HP_TRANSPORTER.BASE_JOB_STATS A, HP_TRANSPORTER.CONTROL_FILES B
WHERE A.FILE_ID = B.FILE_ID
  AND B.FILE_HOST = 'abc'
  AND B.FILE_ABS_PATH = '/home/control_files/ControlFile.txt'
  AND B.FILE_VERSION = (SELECT MAX(FILE_VERSION)
                        FROM HP_TRANSPORTER.CONTROL_FILES
                        WHERE FILE_HOST = B.FILE_HOST
                          AND FILE_ABS_PATH = B.
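Because Example 9-5 treats a non-NULL END_TIME as a completed entry, the inverse condition can identify job entries that started but never recorded an end time, which is useful when deciding whether recovery applies. This is a hedged variant built from the same columns, not a query from the guide:

```sql
-- Job entries with no recorded end time (started but not completed)
SELECT A.JOB_NAME, A.SOURCE_NAME, A.TARGET_NAME, A.START_TIME,
       B.FILE_HOST, B.FILE_ABS_PATH
FROM HP_TRANSPORTER.BASE_JOB_STATS A, HP_TRANSPORTER.CONTROL_FILES B
WHERE A.FILE_ID = B.FILE_ID
  AND A.END_TIME IS NULL
ORDER BY A.START_TIME;
```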
A Control File Examples

Following are control file examples for:
• Data types, Example A-1 (page 86)
• Extract from a Neoview SQL source, Example A-2 (page 89)
• Extract from a table source to a named pipe, Example A-3 (page 91)
• Load fixed-width data, Example A-4 (page 93)
• Include file for defining data formats, Example A-5 (page 95)
• Load from a file with an Include file, Example A-6 (page 96)
• Load and extract, Example A-7 (page 97)
• Multiple data formats and maps, Example A-8 (page 9
Example A-1 Control File: Data Types

/* All Data Types */
version 1;
options
{
  errors = 1000,
  discards = 500,
  rowsetsize = 1000
}
typeformats
{
  # comment: show examples of multiple date, time and interval formats
  date_frmt       date '%C%y-%m-%d'
  time_frmt       time '%H:%M:%s'
  timestamp_frmt  datetime '%C%y-%m-%d %H:%M:%s'
  intv_frmt_00    interval '%Y'
  intv_frmt_01    interval '%Y-%M
  intv_frmt_02  intv_frmt_03  intv_frmt_04  intv_frmt_05  intv_frmt_06  intv_frmt_07
  intv_frmt_08  intv_frmt_09  intv_frmt_10  intv_frmt_11  intv_frmt_12
  intv10 interval format intv_frmt_10,
  intv11 interval format intv_frmt_11,
  intv12 interval format intv_frmt_12,
  flot1 float,
  flot2 float
)

# comment: data format for the target SQL table on the Neoview platform
dataformat2 sql fields
(
  smin1 smallint,
  smin2 smallint,
  inte1 integer,
  inte2 integer,
  lint1 largeint,
  nume1 numeric,
  nume2 numeric,
  deci1 decimal,
  char1 char,
  vchr1 varchar,
  date1 date,
  time1 time,
  time2 time,
  tims1 timestamp,
  tims2 timestamp,
  intv00 interval,
  intv02 interval,
  intv03 interval,
  tims1 = tims1,
  tims2 = tims2,
  intv00 = intv00,
  intv02 = intv02,
  intv03 = intv03,
  intv05 = intv05,
  intv07 = intv07,
  intv08 = intv08,
  intv09 = intv09,
  intv10 = intv10,
  intv12 = intv12,
  intv13 = intv13,
  intv14 = intv14,
  intv16 = intv16,
  intv17 = intv17,
  flot1 = flot1,
  flot2 = flot2
)
}
jobs
{
  load01 load options ( parallel = "false" )
  ( source src1 target tgt01 map map01 )
}
Example A-2 Control File: Extract From Neoview SQL Source

/*----------------------------------------*/
/* Simple Extract From Neoview SQL Source To File */
/*----------------------------------------*/
version 1.
  LAST_NAME = LAST_NAME,
  ADDRESS = ADDRESS,
  ZIP = ZIP,
  PHONE = PHONE,
  INFO1 = INFO1,
  INFO2 = INFO2
)
}
jobs
{
  extract1 extract options ( parallel = "true" )
  ( source src1 target tgt1 map map1 )
}
Example A-3 Control File: Extract From Table Source to Named Pipe

/*------------------------------------------------*/
/* Simple Extract From Table Source To Named Pipe */
/*------------------------------------------------*/
version 1;
options
{
  errors = 1000,
  rowsetsize = 5000,
  discards = 100
}
dataformats
{
  dataformat1 delimited records separated by
  fields delimited by <,> optionally qualified by <">
  (
    FIRST_NAME char,
    LAST_NAME char,
    ADDRESS char,
    ZIP int,
    PHONE char,
    SSN largeint,
    INFO1 char,
    INFO2
{
  extract1 extract options ( parallel = "false" )
  ( source src1 target tgt1 map map1 )
}
Example A-4 Control File: Load Fixed-Width Data

/* Load Fixed Width Data */
version 1;
options
{
  errors = 1000,
  discards = 100,
  rowsetsize = 5000
}
dataformats
{
  # fixed width data source file
  dataformat1 fixed record length 443
  fields positions ( 1, 13, 37, 165, 169, 179, 187, 315 )
  (
    FIRST_NAME char,
    LAST_NAME char,
    ADDRESS char,
    ZIP int,
    PHONE char,
    SSN largeint,
    INFO1 char,
    INFO2 char
  )
  dataformat2 sql fields
  (
    FIRST_NAME char,
    LAST_NAME char,
    ADDRESS char,
    ZIP int,
    PHONE char,
    SSN large
  LAST_NAME = LAST_NAME,
  ADDRESS = ADDRESS,
  ZIP = ZIP,
  PHONE = PHONE,
  SSN = SSN,
  INFO1 = INFO1,
  INFO2 = INFO2
)
}
jobs
{
  load1 load options ( parallel = "false" )
  ( source src1 target tgt1 map map1 )
}
Example A-5 Control File: Include Defining Dataformats

You can use an Include file such as this one to define data formats for many control files. See Example A-6, in which this file, ./CF/include_part_2.cf, is included to define the data formats.
Example A-6 Control File: Include — Load From File to Table

/* Simple Load From File To Table Using INCLUDE To Include Control File Fragment (File 1/2) */
version 1.0;
# comment: control file fragment include_part_2.cf defines the data formats
%include "./CF/include_part_2.cf"
options
{
  errors = 1000,
  discards = 100,
  rowsetsize = 5000
}
sources
{
  src1 file "./data_files/sn_nvt.dat"
  tgt1 jdbc table NEO.nvt.
    options ( system  user  password  datasource  url )
Example A-7 Control File: Load and Extract

/* Simple Load From File To Table & Extract From Neoview SQL Source To File */
version 1.
  PHONE = PHONE,
  SSN = SSN,
  INFO1 = INFO1,
  INFO2 = INFO2
)
map2 from dataformat2 to dataformat3
(
  FIRST_NAME = FIRST_NAME,
  LAST_NAME = LAST_NAME,
  ADDRESS = ADDRESS,
  ZIP = ZIP,
  PHONE = PHONE,
  SSN = SSN,
  INFO1 = INFO1,
  INFO2 = INFO2
)
}
# comment: define 2 separate jobs - one for load, one for extract
# comment: note that load or extract is a job attribute, and cannot be combined into a single job
jobs
{
  load1 load options ( parallel = "false" )
  ( source src1 target tgt1 map map1 )
  extract1 extract options (
Example A-8 Control File: Multiple Dataformats and Maps

/*-----------------------------*/
/* Multiple Dataformats & Maps */
/*-----------------------------*/
version 1.
(
  system = "asl0101",
  user = "user",
  password = "transport",
  url = "jdbc:hpt4jdbc://txnaslan01.txn.cpqcorp.net:18650/",
  datasource = "Admin_Load_DataSource"
)
src2 file "./data_files/sn_nvt_sml.dat"
tgt2 jdbc table NEO.nvt.sn_nvt_sml
options
(
  system = "asl0101",
  user = "user",
  password = "transport",
  url = "jdbc:hpt4jdbc://txnaslan01.txn.cpqcorp.
Example A-9 Control File: Reflexive Update

/*-------------------------*/
/* Simple Reflexive Update */
/*-------------------------*/
version 1.0;
options
{
  errors = 1000,
  discards = 100,
  rowsetsize = 100
}
dataformats
{
  dataformat1 delimited records separated by
  fields delimited by <,> optionally qualified by <">
  (
    C1 int,
    C2 int,
    C3 char
  )
  dataformat2 sql fields
  (
    C1 int,
    C2 int,
    C3 char
  )
}
sources
{
  src1 file "./data_files/sn_nvt_refu.dat"
  tgt1 jdbc table NEO.nvt.
Example A-10 Control File: Update with Constant, NULL, or Expression

/*--------------------------------*/
/* Constants, NULLs & Expressions */
/*--------------------------------*/
version 1;
options
{
  errors = 1000,
  discards = 100,
  rowsetsize = 5000
}
dataformats
{
  dataformat1 delimited records separated by
  fields delimited by <,> optionally qualified by <">
  (
    FIRST_NAME char,
    LAST_NAME char,
    ADDRESS char,
    ZIP int,
    PHONE char,
    SSN largeint,
    INFO1 char,
    INFO2 char
  )
  dataformat2 sql ( FIRST_NAME LAST_NAM
  load1 load options ( parallel = "false" )
  ( source src1 target tgt1 map map1 )
}
Example A-11 Control File: Identity Column

This example defines an IDENTITY column. CUSTOMER_NUMBER can be a column of any numeric data type (SMALLINT, INTEGER, LARGEINT).
Example A-12 Control File: Usage of NULLSTRING

/*--------------------------------*/
/* Usage of NULLSTRING */
/*--------------------------------*/
version 1.
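The example above is truncated. As a hedged sketch, a nullstring setting might appear in the options section like this. The option name comes from the reserved-keyword list in Appendix E; its placement and the "NULL" marker value are assumptions, not confirmed syntax:

```
# Hypothetical fragment: treat the literal string NULL in the data file as a SQL NULL
options
{
  errors     = 1000,
  discards   = 100,
  nullstring = "NULL"
}
```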
Example A-13 Control File: NOAUDIT Mode

/*---------------------------------------------------------------------------------------*/
/* A Single Load Job Using NOAUDIT Mode, A Single Source File, And A Single Target Table */
/*---------------------------------------------------------------------------------------*/
version 1;
options
{
  errors = 1000,
  discards = 100,
  rowsetsize = 5000,
  sorted = "true",
  noaudit = "true"
}
dataformats
{
  dataformat1 delimited records separated by
  fields delimited by <,>
Example A-14 Control File: Forcestaging Load with Multiple Job Entries in a Single Job

/*-------------------------------------------------------------------------------------*/
/* A Single Load Job Using Staging Tables And Multiple Job Entries Running In Parallel */
/*-------------------------------------------------------------------------------------*/
version 1;
options
{
  errors = 1000,
  discards = 100,
  rowsetsize = 5000,
  forcestaging = "true",
  sorted = "true"
}
dataformats
{
  dataformat1 delimited reco
  source src2 target tgt1 map map1
  source src3 target tgt1 map map1
)
}
Example A-15 Control File: Load From File to Table Using Subset of Source Columns

/*------------------------------------------------------------------------*/
/* Simple Load From File To Table With Only Subset Of Source Columns Used */
/*------------------------------------------------------------------------*/
/* DDL to be used to create target table
CREATE TABLE NVT.
  ADR_tgt = ADDRESS_src,
  ZP_tgt = ZIP_src,
  PH_tgt = PHONE_src,
  SOCIAL_tgt = SSN_src,
  IN1_tgt = INFO1_src
)
}
jobs
{
  load1 load options ( parallel = "false" )
  ( source src1 target tgt1 map map1 )
}
Example A-16 Control File: Load File to Table Without 1-to-1 Mapping

/*-----------------------------------------------------*/
/* Simple Load From File To Table With Non 1-1 Mapping */
/*-----------------------------------------------------*/
/* DDL to be used to create target table:
CREATE TABLE NVT.
  password = "dbt",
  datasource = "Admin_Load_DataSource",
  url = "jdbc:hpt4jdbc://
Example A-17 Control File: Simple Virtual Partitioned Load

/*---------------------------------------------------*/
/* Simple Virtual Partitioned Load                   */
/*---------------------------------------------------*/
version 1.0.0;
options
{
  errors = 100
  ,discards = 100
  ,rowsetsize = 10
  ,commitsize = 20
  ,parallelstreams = 2
  ,system = "tra0101"
  ,user = "SUPERUSER"
  ,password = "HPNe@v1ew"
  ,datasource = "Admin_Load_DataSource"
  ,url = "jdbc:hpt4jdbc://tra0101.cup.hp.
}
{
  load1 load options ( parallel = "false" )
  (
    source src1 target tgt1 map map1
    source src1 target tgt2 map map1
  )
}
B Enhanced Initial Load Control File Examples

These are control file examples for the Enhanced Initial Load (EIL) feature:
• Example B-1
• Example B-2
• Example B-3
• Example B-4 (page 117)
Examples

Example B-1 EIL: Specifying the inlinesort Control File Option

src01 file "./data_files/sn_nvt_01.dat" options ( inlinesort = "true", … )

Example B-2 EIL: Specifying the loadid Control File Option

load02 load options ( loadid = 20000 )
( source src01 target tgt01 map map01 )

Example B-3 EIL: Specifying an Exception Table

tgt01 jdbc table NEO.nvt.sn_nvt options ( exceptiontable = "myschema.
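The three EIL snippets above can appear together in one control file; the following sketch merely combines them. The exception-table name my_exceptions is hypothetical (the original example is truncated), and the other values are illustrative:

```
# Combined EIL sketch: inline sort on the source, a load ID on the job,
# and an exception table on the target (values are illustrative)
src01 file "./data_files/sn_nvt_01.dat" options ( inlinesort = "true" )
tgt01 jdbc table NEO.nvt.sn_nvt options ( exceptiontable = "myschema.my_exceptions" )

load02 load options ( loadid = 20000 )
( source src01 target tgt01 map map01 )
```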
Example B-4 Enhanced Initial Load Sample Control File version 1.
C Control File Editor and Control File Generator

This chapter addresses the following topics:
• The “Control File Editor” (page 119)
• The “Control File Generator” (page 121)

Control File Editor

The Control File Editor is a Java-based GUI editor for creating, editing, and managing the Transporter control file. You can use this tool either on the loader platform that hosts the Transporter client component or on your local workstation or PC to create, edit, and syntactically validate the control file.
If a control file is not syntactically correct, it opens in text mode only. Fix all syntax errors before viewing the file in GUI mode. See “Checking the Syntax of a Control File” (page 120).

Adding Content to a Control File

Click the appropriate tab to select the section of the control file you want to create or change. An Input Pane and an Entered Input Table appear for each section in the file.

Input Pane: the left pane, where you create the control file.
Switching Views in the Control File Editor

You can work in two views in the Control File Editor:
• GUI view (default view)
  The GUI view assists you in creating, editing, and viewing your control file. Select one of eight tabs to create the eight components of a control file.
• Text view
  Text view displays plain editable text where you can create, edit, and view your control file.

Including Another Control File

To include another control file, click Insert —> Include or press Ctrl+I.
-schema
If specified with -table, the schema information to put into the generated control file. If -schema is specified without -table, control files are generated for all tables within the schema that can be accessed without error. Delimited schema names must start and end with a double quote ("). Names containing double quotes must be escaped with single quotes (').

-table
Specifies the table name or names to be used in generating a control file.
-srcfile
Specifies the names of data files to be used in the sources section of a control file. The src-file-list is a comma-separated list of file names. The number of file names that you specify must match the number of tables that you specify for generation. The default file name is ./datafiles/nvt_datafile.dat.

-srcpipe
Specifies the list of named pipes to be used in the sources section of a control file. The src-pipe-list is a comma-separated list of file names.
D Global Options Table Global Transporter options that are specified in the Options section of the control file can be overridden at other levels or sections of the file. This table shows the level at which the global setting can be overridden.
Table D-1 Global Options Table (continued) This Option Can be overridden at these levels “system” (page 64) X “taskrowsetsize” (page 64) X “teamsize” (page 65) “tenacity” (page 65) “timeout” (page 65) X X X X X X “truncate” (page 65) 126 X “url” (page 65) X X “user” (page 66) X X Global Options Table
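As an illustration of overriding, a connection option such as user can be set globally in the options section and then overridden for one target, mirroring the target-level options blocks in Appendix A. This is a sketch with illustrative values, not an example from the guide:

```
options
{
  errors = 1000,
  user   = "loaduser"            # global setting
}
sources
{
  src1 file "./data_files/sn_nvt.dat"
  tgt1 jdbc table NEO.nvt.sn_nvt
  options ( user = "audituser" )   # overrides the global user for this target only
}
```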
E Reserved Keywords

Transporter Reserved Keywords

All control file options and section identifiers are reserved in a control file. All keywords are case-insensitive. Although some of these keywords are not explicitly disallowed, their use is discouraged because it can hamper upgrades to future releases of the Transporter product.
Table E-11 Reserved Transporter Keywords — M
map, maps, maxlen, maxparallelbigjobs, maxparalleljobs, maxparallelmediumjobs, maxparallelsmalljobs, multipart

Table E-12 Reserved Transporter Keywords — N
nl, noaudit, nodelist, norollback, null, nullstring

Table E-13 Reserved Transporter Keywords — O
odbc, offload, on, operation, option, optionally, options, output

Table E-14 Reserved Transporter Keywords — P
parallel, paralleljobs, parallelstreams, password, pipe, positions, postlaunch, prelaunch, program
F Transporter Exit Codes

Table F-1 Transporter Exit Codes

Code Number   Error
0             Success.
1             General failure.
2             NVTHOME environment variable is not defined or is invalid.
3             JAVA_HOME environment variable is not set for the JNI Bridge.
4             The control file passed does not exist or cannot be accessed.
5             Invalid parameter specified on command line.
12            The Transporter Release directory layout is invalid or missing objects.
20            Error during control file parsing.
Glossary

control file
Input to Transporter. The control file specifies how you want to move data from source to target.

Control File Generator
A standalone tool that helps you create control files for a single table or for all tables in a schema.

Control File GUI
A standalone GUI editor tool that creates, edits, and syntactically validates a control file.

data extractor
The feature that allows data movement from the Neoview platform to a file, named pipe, or other data source.
Index A autotran, description of, 53 B baddatafile, description of, 53 byte integral data type, 51 byteint integral data type, 51 C char[acter] character data type, 51 client on Linux, 68 on windows, 68 starting, 67 stopping, 68 client software, 16 column IDENTITY, 22 update with a constant, 22 update with an expression, 22 update with NULL, 22 comments control file section, 50 commitsize, description of, 53 compression of data, 22 compression, description of, 53 control file comments section, 50 creating
character, 51 date/time, 51 floating point, 51 integral, 51 interval, 51 supported, 51 dataformats delimited format, 37 fixed field definitions , 38 fixed length format, 38 SQL table format, 39 dataformats control file section, 37 datasource, description of, 54 date date/time data type, 51 datetime date/time data type, 51 decimal integral data type, 51 deleteonerror, description of, 54 delimited data, 75, 76, 77 discards, description of, 54 Documents, related information, 9 double floating point data type,
for extract jobs, 69 for load jobs, 69 performance considerations, 21 with multiple targets, 69 parallel, description of, 62 parallelstreams, description of, 62 partitions virtual, 23 password, 25 encryption, 25 password, description of, 63 R real floating point data type, 51 reflexive update, 21 considerations, 22 mapping, 43 reserved keywords, 127 retries, description of, 63 rowsetsize, description of, 63 typeformats control file section, 31 U update column with a constant, 22, 43 with an expression, 2