User Guide


K-Means Cluster Analysis

Figure 33-2

K-Means Cluster Analysis dialog box

► Select the variables to be used in the cluster analysis.

► Specify the number of clusters. The number of clusters must be at least two and must not be greater than the number of cases in the data file.

► Select either Iterate and classify or Classify only.

Optionally, you can select an identification variable to label cases.

K-Means Cluster Analysis Efficiency

The k-means cluster analysis command is efficient primarily because it does not compute the distances between all pairs of cases, as do many clustering algorithms, including that used by the hierarchical clustering command.
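To give a rough sense of the difference, the sketch below compares the number of distance computations. It assumes that a pairwise method evaluates every pair of cases once, and that a k-means pass compares each case with each cluster center once per iteration. These counts are a textbook approximation and not a description of how the commands are implemented; the function names are illustrative only.

```python
def pairwise_distance_count(n_cases: int) -> int:
    """Distances needed when every pair of cases is compared once."""
    return n_cases * (n_cases - 1) // 2


def kmeans_distance_count(n_cases: int, n_clusters: int, n_iterations: int) -> int:
    """Distances needed when each case is compared only with each cluster
    center, once per iteration (assumed cost model, not a measured one)."""
    return n_cases * n_clusters * n_iterations


if __name__ == "__main__":
    n, k, iterations = 100_000, 5, 10
    print(pairwise_distance_count(n))               # 4999950000
    print(kmeans_distance_count(n, k, iterations))  # 5000000
```

Under these assumptions, the case-to-center count grows linearly with the number of cases, while the pairwise count grows with its square.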

For maximum efficiency, take a sample of cases and use the Iterate and classify method to determine cluster centers. Select Write final as File. Then restore the entire data file and select Classify only as the method. Click Centers and click Read initial from File to classify the entire file using the centers estimated from the sample.
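The two-stage procedure described above can be sketched outside the dialog box as follows. This is a minimal illustration in plain Python, assuming a standard iterative k-means (assign each case to its nearest center, then recompute the centers as cluster means) with randomly chosen initial centers; the product's own algorithm and its choice of initial centers may differ. All function names here are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify_only(cases, centers):
    """Assign each case to its nearest center without updating the centers."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(case, centers[j]))
        for case in cases
    ]


def iterate_and_classify(cases, k, max_iter=10, seed=0):
    """Estimate k cluster centers by repeated assignment and mean updates."""
    rng = random.Random(seed)
    # Assumption: initial centers are k randomly chosen cases.
    centers = [list(c) for c in rng.sample(cases, k)]
    for _ in range(max_iter):
        labels = classify_only(cases, centers)
        new_centers = []
        for j in range(k):
            members = [c for c, lab in zip(cases, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # centers stopped moving
        centers = new_centers
    return centers


def cluster_via_sample(cases, k, sample_size, seed=0):
    """Estimate centers on a sample, then classify the entire data set."""
    rng = random.Random(seed)
    sample = rng.sample(cases, sample_size)
    centers = iterate_and_classify(sample, k, seed=seed)  # stage 1: sample only
    return centers, classify_only(cases, centers)         # stage 2: all cases


if __name__ == "__main__":
    data = [[i * 0.01, 0.0] for i in range(50)] + [[100 + i * 0.01, 0.0] for i in range(50)]
    centers, labels = cluster_via_sample(data, k=2, sample_size=20)
    print(centers)
    print(labels[:5], labels[-5:])
```

Only the sample goes through the iterative stage; the full data set is processed once, which is where the saving comes from.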