User Guide

Chapter 30
Choosing a Procedure for Clustering
Cluster analyses can be performed using the TwoStep, Hierarchical, or K-Means
Cluster Analysis procedures. Each procedure employs a different algorithm for
creating clusters, and each has options not available in the others.
TwoStep Cluster Analysis. For many applications, the TwoStep Cluster Analysis
procedure will be the method of choice. It provides the following unique features:
Automatic selection of the best number of clusters, in addition to measures
for choosing between cluster models.
Ability to create cluster models simultaneously based on categorical and
continuous variables.
Ability to save the cluster model to an external XML file, then read that file and
update the cluster model using newer data.
Additionally, the TwoStep Cluster Analysis procedure can analyze large data files.
Hierarchical Cluster Analysis. The Hierarchical Cluster Analysis procedure is limited
to smaller data files (hundreds of objects to be clustered) but has the following
unique features:
Ability to cluster cases or variables.
Ability to compute a range of possible solutions and save cluster memberships
for each of those solutions.
Several methods for cluster formation, variable transformation, and measuring the
dissimilarity between clusters.
As long as all the variables are of the same type, the Hierarchical Cluster Analysis
procedure can analyze interval (continuous), count, or binary variables.
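
The idea of computing a range of solutions and saving cluster memberships for each can be illustrated with a short Python sketch. This is not the procedure's own implementation. It assumes single linkage and Euclidean distance purely for brevity, and the function name hierarchical_memberships and the sample data are hypothetical.

```python
from itertools import combinations
from math import dist


def hierarchical_memberships(points, min_clusters, max_clusters):
    """Return {k: [cluster label per case]} for each k in the requested range.

    Assumption: single linkage with Euclidean distance, chosen only to keep
    the sketch short. Other linkage methods and measures are possible.
    """
    # Start with every case in its own cluster (each cluster is a list of case indices).
    clusters = [[i] for i in range(len(points))]
    solutions = {}

    def linkage(a, b):
        # Single linkage: the smallest distance between any two members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while True:
        k = len(clusters)
        if min_clusters <= k <= max_clusters:
            # Record the membership of every case for the k-cluster solution.
            labels = [0] * len(points)
            for label, members in enumerate(clusters, start=1):
                for i in members:
                    labels[i] = label
            solutions[k] = labels
        if k <= max(min_clusters, 1):
            break
        # Merge the two closest clusters; combinations() guarantees a < b.
        a, b = min(
            combinations(range(k), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return solutions


# Hypothetical data: five cases measured on two continuous variables.
cases = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (5.0, 9.0)]
solutions = hierarchical_memberships(cases, 2, 4)
for k in sorted(solutions):
    print(k, solutions[k])
```

Because every pairwise distance is examined at each merge, the cost grows quickly with the number of cases, which is consistent with restricting this kind of analysis to smaller data files.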