User's Manual


Chapter 5

reduce network traffic and speed stream operations. Note that the Generate SQL check box must be selected for SQL optimization to have any effect.



Optimize syntax execution. This method of stream rewriting increases the efficiency of operations that incorporate more than one node containing IBM® SPSS® Statistics syntax. Optimization is achieved by combining the syntax commands into a single operation, instead of running each as a separate operation.



Optimize other execution. This method of stream rewriting increases the efficiency of operations that cannot be delegated to the database. Optimization is achieved by reducing the amount of data in the stream as early as possible. While maintaining data integrity, the stream is rewritten to push operations closer to the data source, thus reducing data downstream for costly operations, such as joins.
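The idea of reducing data before a costly join can be illustrated with a minimal Python sketch. This is not how SPSS Modeler rewrites streams internally; the tables, field names, and the join() and is_north() helpers are hypothetical and exist only for illustration. The sketch shows that selecting records before the join gives the same result as selecting after it, while feeding fewer rows into the join.

```python
# Hypothetical in-memory tables; all names are illustrative assumptions.
customers = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "north"},
]
orders = [
    {"customer_id": 1, "amount": 40},
    {"customer_id": 2, "amount": 15},
    {"customer_id": 3, "amount": 70},
    {"customer_id": 3, "amount": 5},
]


def join(left, right):
    """Inner-join customers to orders on id = customer_id.

    Also returns the number of rows fed into the join.
    """
    by_id = {}
    for row in left:
        by_id.setdefault(row["id"], []).append(row)
    joined = [
        {**c, **o}
        for o in right
        for c in by_id.get(o["customer_id"], [])
    ]
    return joined, len(left) + len(right)


def is_north(row):
    return row["region"] == "north"


# Original order: join first, then select.
joined_all, rows_in_late = join(customers, orders)
late = [r for r in joined_all if is_north(r)]

# Rewritten order: select first, then join.
early, rows_in_early = join([c for c in customers if is_north(c)], orders)

print(late == early, rows_in_late, rows_in_early)
```

Both orderings return the same records, but the rewritten one passes fewer rows to the join, which is where the saving comes from on large data.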

Enable parallel processing. When running on a computer with multiple processors, this option allows the system to balance the load across those processors, which may result in faster performance. Use of multiple nodes or use of the following individual nodes may benefit from parallel processing: C5.0, Merge (by key), Sort, Bin (rank and tile methods), and Aggregate (using one or more key fields).
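To see why a keyed aggregate lends itself to parallel processing, consider the following rough Python sketch. It is an assumption-laden illustration, not the product's implementation: the data, the partition_by_key(), aggregate(), and parallel_aggregate() helpers, and the worker count are all hypothetical. Records are partitioned by key so that each key lands in exactly one partition, each partition is aggregated independently, and the partial results are merged. Threads are used only to show the structure; in CPython they would not speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical (key, value) records.
rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]


def partition_by_key(rows, n):
    """Split rows into n partitions so that each key lands in exactly one."""
    parts = [[] for _ in range(n)]
    for key, value in rows:
        parts[hash(key) % n].append((key, value))
    return parts


def aggregate(part):
    """Sum the values per key within a single partition."""
    totals = {}
    for key, value in part:
        totals[key] = totals.get(key, 0) + value
    return totals


def parallel_aggregate(rows, workers=2):
    parts = partition_by_key(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate, parts))
    merged = {}
    for partial in partials:
        # Keys never overlap across partitions, so a plain update is safe.
        merged.update(partial)
    return merged


result = parallel_aggregate(rows)
print(result)
```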

Generate SQL. Select this option to enable SQL generation, allowing stream operations to be pushed back to the database by using SQL code to generate execution processes, which may improve performance. To further improve performance, Optimize SQL generation can also be selected to maximize the number of operations pushed back to the database. When operations for a node have been pushed back to the database, the node will be highlighted in purple when the stream is run.
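As a rough illustration of what pushback means, the following Python sketch uses the standard sqlite3 module as a stand-in database. The sales table, its fields, and the threshold are hypothetical, and the SQL shown is hand-written, not output generated by SPSS Modeler. The sketch compares fetching every row and processing it locally with expressing the same select and aggregate steps as a single SQL statement that the database executes.

```python
import sqlite3

# In-memory database standing in for the data source (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("south", 5.0), ("north", 2.5), ("east", 7.0)],
)

# Without pushback: fetch every row, then select and aggregate locally.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
local = {}
for region, amount in rows:
    if amount > 3:
        local[region] = local.get(region, 0.0) + amount

# With pushback: the same select and aggregate run inside the database,
# so only the aggregated rows are returned to the client.
pushed = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM sales "
        "WHERE amount > 3 GROUP BY region"
    ).fetchall()
)

print(local == pushed, len(rows), len(pushed))
```

In this toy case the pushed-back query returns three aggregated rows instead of four raw rows; on real tables the reduction in transferred data is typically what matters.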



Database caching. For streams that generate SQL to be executed in the database, data can be cached midstream to a temporary table in the database rather than to the file system. When combined with SQL optimization, this may result in significant gains in performance. For example, the output from a stream that merges multiple tables to create a data mining view may be cached and reused as needed. With database caching enabled, simply right-click any nonterminal node to cache data at that point, and the cache is automatically created directly in the database the next time the stream is run. This allows SQL to be generated for downstream nodes, further improving performance. Alternatively, this option can be disabled if needed, such as when policies or permissions preclude data being written to the database. If database caching or SQL optimization is not enabled, the cache will be written to the file system instead. For more information, see the topic Caching Options for Nodes on p. 50.
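The effect of caching midstream to a temporary table can be sketched with Python's sqlite3 module. This is a simplified analogy rather than the mechanism SPSS Modeler uses: the customers and orders tables, the cache_view name, and the queries are assumptions made for illustration. The merged view is materialized once as a temporary table, and later steps query that table with SQL instead of repeating the merge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER, region TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'north'), (2, 'south');
    INSERT INTO orders VALUES (1, 10.0), (1, 4.0), (2, 6.0);
    """
)

# Cache the merged "data mining view" as a temporary table in the database.
# The name cache_view is hypothetical.
conn.execute(
    """
    CREATE TEMP TABLE cache_view AS
    SELECT c.id, c.region, o.amount
    FROM customers c JOIN orders o ON o.customer_id = c.id
    """
)

# Downstream steps reuse the cache with further SQL; the merge is not rerun.
totals = dict(
    conn.execute(
        "SELECT region, SUM(amount) FROM cache_view GROUP BY region"
    ).fetchall()
)
cached_rows = conn.execute("SELECT COUNT(*) FROM cache_view").fetchone()[0]

print(totals, cached_rows)
```

Because the cache lives in the database, the downstream aggregate can still be expressed as SQL, which mirrors the point made in the paragraph about SQL generation for downstream nodes.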



Use relaxed conversion. This option enables the conversion of data from either strings to numbers, or numbers to strings, if stored in a suitable format. For example, if the data is kept in the database as a string, but actually contains a meaningful number, the data can be converted for use when the pushback occurs.
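A small Python sketch with sqlite3 shows the kind of conversion involved. The readings table and its value column are hypothetical, and the explicit CAST is only one way a database can convert a numeric string; it is not necessarily the SQL that SPSS Modeler generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Numbers stored as strings in a hypothetical table.
conn.execute("CREATE TABLE readings (value TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?)",
    [("12.5",), ("7",), ("3.25",)],
)

# Convert the strings to numbers inside the database before summing.
total = conn.execute(
    "SELECT SUM(CAST(value AS REAL)) FROM readings"
).fetchone()[0]

# The reverse direction: a number converted to a string.
as_text = conn.execute("SELECT CAST(42 AS TEXT)").fetchone()[0]

print(total, as_text)
```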

Note: Due to minor differences in SQL implementation, streams run in a database may return slightly different results from those returned when run in SPSS Modeler. For similar reasons, these differences may also vary depending on the database vendor.

Save As Default. The options specified apply only to the current stream. Click this button to set these options as the default for all streams.