User Guide

Chapter 33
initial cluster centers and not using the Use running means option will avoid issues
related to case order. However, ordering of the initial cluster centers may affect the
solution if there are tied distances from cases to cluster centers. Comparing results
from analyses with different permutations of the initial center values may be used to
assess the stability of a given solution.
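The stability check described above can be sketched outside the procedure. The following Python example is an illustration only, not the procedure's own implementation: the function names, the toy data, and the rule that a tie goes to the lowest-numbered center are all assumptions made for the sketch. It runs a basic k-means from supplied initial centers (without running means) and compares the resulting groupings for two orderings of the same initial centers.

```python
import numpy as np

def kmeans_fixed_init(X, centers, max_iter=100):
    """Basic k-means from supplied initial centers (illustrative, no running means)."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        # Euclidean distance from every case (row) to every center (column).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        # Assumption for this sketch: a tie goes to the lowest-numbered center.
        labels = d.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            if np.any(labels == k):
                new_centers[k] = X[labels == k].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def partition(labels):
    """Grouping of case indices, ignoring how the clusters are numbered."""
    return {frozenset(np.flatnonzero(labels == k)) for k in np.unique(labels)}

# Hypothetical data with two well-separated groups and no tied distances.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [10., 10.], [10., 11.], [11., 10.]])
init = np.array([[0., 0.], [10., 10.]])
labels_a, _ = kmeans_fixed_init(X, init)
labels_b, _ = kmeans_fixed_init(X, init[::-1])  # same centers, reversed order

# Hypothetical data where the case at 1.0 is equally far from both centers.
X_tie = np.array([[0.], [1.], [2.]])
init_tie = np.array([[0.], [2.]])
tie_a, _ = kmeans_fixed_init(X_tie, init_tie)
tie_b, _ = kmeans_fixed_init(X_tie, init_tie[::-1])

print(partition(labels_a) == partition(labels_b))  # groupings agree
print(partition(tie_a) == partition(tie_b))        # groupings differ
```

In the first data set the two orderings give the same grouping of cases. In the second, the tied case follows whichever center is listed first, so the grouping changes with the ordering.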
Assumptions. Distances are computed using simple Euclidean distance. If you want
to use another distance or similarity measure, use the Hierarchical Cluster Analysis
procedure. Scaling of variables is an important consideration: if your variables are
measured on different scales (for example, one variable is expressed in dollars and
another is expressed in years), your results may be misleading. In such cases, you
should consider standardizing your variables before you perform the k-means cluster
analysis (this can be done in the Descriptives procedure). The procedure assumes that
you have selected the appropriate number of clusters and that you have included all
relevant variables. If you have chosen an inappropriate number of clusters or omitted
important variables, your results may be misleading.
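To see why scale matters for Euclidean distance, consider the following Python sketch. The data values and function names are hypothetical, and the use of the sample standard deviation (ddof=1) for the z-scores is an assumption of the sketch. One column is in dollars and the other in years; before standardizing, the dollar column dominates the distances almost entirely.

```python
import numpy as np

# Hypothetical cases: column 0 is income in dollars, column 1 is age in years.
X = np.array([
    [30000.0, 25.0],
    [30500.0, 60.0],
    [60000.0, 26.0],
])

def zscore(X):
    """Standardize each column to mean 0 and standard deviation 1.

    Assumption: the sample standard deviation (ddof=1) is used.
    """
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def euclidean(a, b):
    """Simple Euclidean distance between two cases."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Cases 0 and 1 differ by 500 dollars and 35 years.
# Cases 0 and 2 differ by 30000 dollars and 1 year.
raw_01 = euclidean(X[0], X[1])
raw_02 = euclidean(X[0], X[2])

Z = zscore(X)
z_01 = euclidean(Z[0], Z[1])
z_02 = euclidean(Z[0], Z[2])

print("raw distances:         ", raw_01, raw_02)
print("standardized distances:", z_01, z_02)
```

On the raw values, the 35-year age difference contributes almost nothing compared with the income difference. After standardizing, both variables contribute on a comparable footing, and the two distances are of similar size.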
Figure 33-1
K-means cluster analysis output
Initial Cluster Centers

                            Cluster
                 1           2           3           4
ZURBAN       -1.88606    -1.54314     1.45741      .55724
ZLIFEEXP     -3.52581    -1.69358      .62725      .99370
ZLITERAC     -2.89320    -1.65146     -.51770      .88601
ZPOP_INC       .93737      .16291     3.03701    -1.12785
ZBABYMOR      4.16813     1.38422     -.69589     -.88983
ZBIRTH_R      2.68796      .42699      .33278    -1.08033
ZDEATH_R      4.41517      .63185    -1.89037      .63185
ZLOG_GDP     -1.99641    -1.78455      .53091     1.22118
ZB_TO_D       -.52182     -.31333     4.40082     -.99285
ZFERTILT      2.24070      .75481      .46008     -.76793
ZLOG_POP       .24626     2.65246    -1.29624     -.74406