User Guide
Chapter 17
Summarize
The Summarize procedure calculates subgroup statistics for variables within
categories of one or more grouping variables. All levels of the grouping variable
are crosstabulated. You can choose the order in which the statistics are displayed.
Summary statistics for each variable across all categories are also displayed. Data
values in each category can be listed or suppressed. With large data sets, you can
choose to list only the first n cases.
Example. What is the average product sales amount by region and customer industry?
You might discover that the average sales amount is slightly higher in the western
region than in other regions, with corporate customers in the western region yielding
the highest average sales amount.
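
The question in this example can be sketched outside the procedure in a few lines of Python. This is only an illustration using the standard library: the records, the category labels, and the function name subgroup_means are hypothetical and do not come from the guide, and the code is not how the Summarize procedure itself is implemented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (region, customer industry, sales amount).
records = [
    ("West", "Corporate", 120.0),
    ("West", "Corporate", 140.0),
    ("West", "Academic", 90.0),
    ("East", "Corporate", 100.0),
    ("East", "Academic", 80.0),
    ("East", "Academic", 100.0),
]


def subgroup_means(rows):
    """Return {(region, industry): mean sales} for every observed combination."""
    groups = defaultdict(list)
    for region, industry, sales in rows:
        groups[(region, industry)].append(sales)
    return {key: mean(values) for key, values in groups.items()}


# Mean for each region-by-industry cell of the crosstabulation.
cell_means = subgroup_means(records)

# Summary statistic across all categories.
overall_mean = mean(sales for _, _, sales in records)

# Cell with the highest average sales amount.
best_cell = max(cell_means, key=cell_means.get)
```

With these made-up records, best_cell identifies the region and industry combination with the highest mean, which is the kind of comparison the example describes.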
Statistics. Sum, number of cases, mean, median, grouped median, standard error
of the mean, minimum, maximum, range, variable value of the first category of
the grouping variable, variable value of the last category of the grouping variable,
standard deviation, variance, kurtosis, standard error of kurtosis, skewness, standard
error of skewness, percentage of total sum, percentage of total N, percentage of sum
in, percentage of N in, geometric mean, and harmonic mean.
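
To make several of these statistics concrete, the following Python sketch computes them for one category using the standard library. The data, the describe function, and the dictionary keys are hypothetical names chosen for illustration. The sketch assumes the usual sample (n - 1) definitions of standard deviation and variance; the procedure's exact formulas, in particular for the grouped median, kurtosis, and skewness, are not given here and are not reproduced.

```python
import math
from statistics import (
    geometric_mean, harmonic_mean, mean, median, stdev, variance,
)

# Hypothetical data: values of one variable in two categories.
west = [2.0, 4.0, 8.0]
east = [6.0]
everything = west + east


def describe(values, all_values):
    """Compute a few subgroup statistics for one category."""
    n = len(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "sum": sum(values),
        "n": n,
        "mean": mean(values),
        "median": median(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "std_dev": sd,
        "variance": variance(values),
        "se_mean": sd / math.sqrt(n),  # standard error of the mean
        "geometric_mean": geometric_mean(values),
        "harmonic_mean": harmonic_mean(values),
        # Share of the total across all categories, in percent.
        "pct_total_sum": 100.0 * sum(values) / sum(all_values),
        "pct_total_n": 100.0 * n / len(all_values),
    }


west_stats = describe(west, everything)
```

The geometric and harmonic means are defined only for positive values, so a sketch like this would need input checks before being applied to real data.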
Data. Grouping variables are categorical variables whose values can be numeric
or short string. The number of categories should be reasonably small. The other
variables should be able to be ranked.
Assumptions. Some of the optional subgroup statistics, such as the mean and standard
deviation, are based on normal theory and are appropriate for quantitative variables
with symmetric distributions. Robust statistics, such as the median and the range,
are appropriate for quantitative variables that may or may not meet the assumption
of normality.
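
A small Python sketch can show why the choice of statistic matters when a distribution is not symmetric. The two lists below are made-up values, not data from the guide; the second differs from the first only by one extreme value.

```python
from statistics import mean, median

# Hypothetical values: a symmetric sample and the same sample with one
# extreme value in place of the largest observation.
symmetric = [10, 11, 12, 13, 14]
skewed = [10, 11, 12, 13, 100]

mean_symmetric = mean(symmetric)
median_symmetric = median(symmetric)

mean_skewed = mean(skewed)      # pulled toward the extreme value
median_skewed = median(skewed)  # unchanged by the extreme value

# How far each statistic moved when the extreme value was introduced.
mean_shift = mean_skewed - mean_symmetric
median_shift = median_skewed - median_symmetric
```

In this sketch the mean moves substantially while the median stays where it was, which is why the median is the safer summary when symmetry is in doubt.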