Neoview SQL Reference Manual (R2.5)
EVERY KEY
The EVERY KEY keyword indicates that histogram statistics are to be generated for
columns that make up the primary key and indexes. For example, table has columns
A, B, C, D defined. If the primary key comprises columns A, B, statistics are generated
for (A, B), A and B. If the primary key comprises columns A, B, C, statistics are generated
for (A,B,C), (A,B), A, B, C. If the primary key comprises columns A, B, C, D, statistics are
generated for (A, B, C, D), (A, B, C), (A, B), and A, B, C, D.
EXISTING COLUMN[S]
The EXISTING COLUMN keyword indicates that all existing histograms of the table are
to be updated. Statistics must be previously captured to establish existing columns.
NECESSARY COLUMN[S]
The NECESSARY COLUMN[S] keyword causes UPDATE STATISTICS to determine
what existing histograms, if any, need to be updated, then updates them. If no histograms
are found to need regeneration, UPDATE STATISTICS issues a warning. The NECESSARY
COLUMN[S] keyword generates statistics for histograms the optimizer has requested,
but do not exist. In all cases, automation must be enabled for NECESSARY COLUMN[S]
to generate statistics.
The NECESSARY COLUMN[S] keyword ignores the SAMPLE clause and internally
determines the sample percent to use.
histogram-option
GENERATE n INTERVALS
The GENERATE n INTERVALS option for UPDATE STATISTICS now accepts values
between 1 and 10,000. Keep in mind that increasing the number of intervals per
histograms may have a negative impact on compile time.
Increasing the number of intervals can be used for columns with small set of possible
values and large variance of the frequency of these values. For example, consider a
column ‘CITY’ in table SALES, which stores the city code where the item was sold,
where number of cities in the sales data is 1538. Setting the number of intervals to a
number greater or equal to the number of cities (that is, setting the number of intervals
to 1600) guarantees that the generated histogram captures the number of rows for
each city. If the specified value n exceeds the number of unique values in the column,
the system generates only as many intervals as there are unique values.
SAMPLE [sample-option]
is a clause that specifies that sampling is to be used to gather a subset of the data from
the table. UPDATE STATISTICS stores the sample results and generates histograms.
If you specify the SAMPLE clause without additional options, a random row sample
is used to read 1 percent of the rows in the table, with a maximum of 1 million rows.
If you do not specify the SAMPLE clause, table has fewer rows than specified, or the
sample size is greater than the system limit. Neoview SQL reads all rows from table.
See “SAMPLE Clause” (page 318).
sample-option
r rows
A row sample is used to read r rows from the table. The value r must be an integer
that is greater than or equal to zero (r > 0).
RANDOM percent PERCENT
directs Neoview SQL to choose rows randomly from the table. The value percent
must be a value between zero and 100 (0 < percent < 100). In addition, only the
first four digits to the right of the decimal point are significant. For example, value
0.00001 is considered to be 0.0000, Value 1.23456 is considered to be 1.2345.
206 SQL Statements