User Guide

K-Means Cluster Analysis
Figure 33-2
K-Means Cluster Analysis dialog box
► Select the variables to be used in the cluster analysis.
► Specify the number of clusters. The number of clusters must be at least two and must
not be greater than the number of cases in the data file.
► Select either Iterate and classify or Classify only.
Optionally, you can select an identification variable to label cases.
K-Means Cluster Analysis Efficiency
The k-means cluster analysis command is efficient primarily because it does not
compute the distances between all pairs of cases, as do many clustering algorithms,
including that used by the hierarchical clustering command.
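To give a rough sense of scale, the sketch below compares the number of distances needed for all pairs of cases with the number needed to compare each case against each cluster center. The function names and the iteration count are illustrative assumptions, not a description of how either command is implemented.

```python
def pairwise_distance_count(n_cases):
    """Distances needed to compare every pair of cases once."""
    return n_cases * (n_cases - 1) // 2


def kmeans_distance_count(n_cases, n_clusters, n_iterations=1):
    """Distances needed to compare every case with every center,
    repeated for an assumed number of iterations."""
    return n_cases * n_clusters * n_iterations


# Hypothetical file with 10,000 cases, 3 clusters, 10 iterations.
all_pairs = pairwise_distance_count(10_000)
case_to_center = kmeans_distance_count(10_000, 3, 10)
print(all_pairs, case_to_center)
```

The all-pairs count grows with the square of the number of cases, while the case-to-center count grows only linearly, which is why the difference matters most for large data files.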
For maximum efficiency, take a sample of cases and use the Iterate and classify
method to determine cluster centers. Select Write final as File. Then restore the entire
data file and select Classify only as the method. Click Centers and click Read initial
from File to classify the entire file using the centers estimated from the sample.
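The sample-then-classify workflow can be sketched outside the dialog box as follows. This is a minimal pure-Python illustration, not the command's actual implementation: the function names, the farthest-point choice of initial centers, the iteration limit, and the synthetic two-group data are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(case, centers):
    """Index of the center closest to the case."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(case, centers[j]))


def initial_centers(cases, k):
    """Assumed initialization: start from the first case, then repeatedly
    add the case farthest from the centers chosen so far."""
    centers = [list(cases[0])]
    while len(centers) < k:
        farthest = max(
            cases,
            key=lambda c: min(squared_distance(c, m) for m in centers))
        centers.append(list(farthest))
    return centers


def iterate_and_classify(cases, k, max_iter=10):
    """Estimate cluster centers by alternating assignment and averaging."""
    centers = initial_centers(cases, k)
    for _ in range(max_iter):
        labels = [nearest_center(c, centers) for c in cases]
        new_centers = []
        for j in range(k):
            members = [c for c, lab in zip(cases, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers
    return centers


def classify_only(cases, centers):
    """Assign each case to its nearest center without updating the centers."""
    return [nearest_center(c, centers) for c in cases]


# Synthetic data: two well-separated groups of 500 cases each.
rng = random.Random(42)
group_a = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(500)]
group_b = [[10 + rng.uniform(-1, 1), 10 + rng.uniform(-1, 1)]
           for _ in range(500)]
all_cases = group_a + group_b

sample = rng.sample(all_cases, 50)               # take a sample of cases
centers = iterate_and_classify(sample, k=2)      # estimate centers on the sample
labels = classify_only(all_cases, centers)       # classify the entire file
```

Only the 50 sampled cases take part in the iterative step; the full set of 1,000 cases is passed over once, which mirrors the saving described above.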