User Guide

Chapter

Summarize

The Summarize procedure calculates subgroup statistics for variables within categories
of one or more grouping variables. All levels of the grouping variable are
crosstabulated. You can choose the order in which the statistics are displayed.
Summary statistics for each variable across all categories are also displayed. Data
values in each category can be listed or suppressed. With large data sets, you can
choose to list only the first n cases.

Example. What is the average product sales amount by region and customer industry?
You might discover that the average sales amount is slightly higher in the western
region than in other regions, with corporate customers in the western region yielding
the highest average sales amount.
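
The question in this example can be sketched outside the procedure itself. The following is a minimal Python illustration, not the procedure's own implementation: the data values, the industry labels, and the function names `subgroup_means` and `region_means` are all hypothetical. It groups cases by every combination of region and industry, then averages the sales amount within each cell and within each region.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical illustration data: (region, customer industry, sales amount).
cases = [
    ("West", "Corporate", 520.0),
    ("West", "Corporate", 480.0),
    ("West", "Academic", 310.0),
    ("East", "Corporate", 400.0),
    ("East", "Academic", 300.0),
    ("East", "Academic", 280.0),
]


def subgroup_means(rows):
    """Mean sales for every (region, industry) combination present in rows."""
    groups = defaultdict(list)
    for region, industry, sales in rows:
        groups[(region, industry)].append(sales)
    return {key: mean(values) for key, values in groups.items()}


def region_means(rows):
    """Mean sales for each region, ignoring industry."""
    groups = defaultdict(list)
    for region, _industry, sales in rows:
        groups[region].append(sales)
    return {region: mean(values) for region, values in groups.items()}


cell_means = subgroup_means(cases)
by_region = region_means(cases)
overall = mean(sales for _, _, sales in cases)  # summary across all categories

for key in sorted(cell_means):
    print(key, round(cell_means[key], 2))
print("By region:", {r: round(m, 2) for r, m in by_region.items()})
print("Overall:", round(overall, 2))
```

With this made-up data, the western region has the higher regional mean and its corporate customers form the cell with the highest mean, which mirrors the pattern described in the example.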

Statistics. Sum, number of cases, mean, median, grouped median, standard error
of the mean, minimum, maximum, range, variable value of the first category of
the grouping variable, variable value of the last category of the grouping variable,
standard deviation, variance, kurtosis, standard error of kurtosis, skewness, standard
error of skewness, percentage of total sum, percentage of total N, percentage of sum
in, percentage of N in, geometric mean, and harmonic mean.
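
To make several of these statistics concrete, here is a small Python sketch for a single subgroup. It rests on assumptions that this page does not state: the values and the overall total are hypothetical, and the standard deviation and variance use the sample (n - 1) denominator. Skewness, kurtosis, and the grouped median are left out because their exact definitions are not given here.

```python
import math
import statistics as st

# Hypothetical values for one subgroup, plus a hypothetical sum of the
# same variable across all categories (used for "percentage of total sum").
values = [2.0, 4.0, 4.0, 8.0]
total_sum_all_groups = 36.0

n = len(values)
std_dev = st.stdev(values)  # assumption: sample SD with n - 1 denominator

summary = {
    "sum": sum(values),
    "n": n,
    "mean": st.mean(values),
    "median": st.median(values),
    "std_dev": std_dev,
    "variance": st.variance(values),
    "se_mean": std_dev / math.sqrt(n),  # standard error of the mean
    "minimum": min(values),
    "maximum": max(values),
    "range": max(values) - min(values),
    "geometric_mean": st.geometric_mean(values),
    "harmonic_mean": st.harmonic_mean(values),
    "pct_of_total_sum": 100.0 * sum(values) / total_sum_all_groups,
}

for name, value in summary.items():
    print(f"{name:>18}: {value:.4f}")
```

The geometric and harmonic means require positive values, so a sketch like this would need a guard before being applied to variables that can be zero or negative.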

Data. Grouping variables are categorical variables whose values can be numeric
or short string. The number of categories should be reasonably small. The other
variables should have values that can be ranked.

Assumptions. Some of the optional subgroup statistics, such as the mean and standard
deviation, are based on normal theory and are appropriate for quantitative variables
with symmetric distributions. Robust statistics, such as the median and the range,
are appropriate for quantitative variables that may or may not meet the assumption
of normality.
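
The practical effect of this assumption can be seen in a short Python sketch. The two data sets below are hypothetical and differ only in their largest value; the sketch compares how far the mean and the median move when the distribution stops being symmetric.

```python
from statistics import mean, median

# Hypothetical data: a symmetric sample and a copy with one extreme value.
symmetric = [10.0, 11.0, 12.0, 13.0, 14.0]
skewed = [10.0, 11.0, 12.0, 13.0, 104.0]

mean_shift = mean(skewed) - mean(symmetric)        # how far the mean moves
median_shift = median(skewed) - median(symmetric)  # how far the median moves

print("mean:  ", mean(symmetric), "->", mean(skewed))
print("median:", median(symmetric), "->", median(skewed))
```

In this made-up case the mean moves from 12 to 30 while the median stays at 12, which is why the mean is best reserved for variables with roughly symmetric distributions.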
