User Guide

Chapter 1

Hochberg’s GT2 is similar to Tukey’s honestly significant difference test, but it uses
the Studentized maximum modulus. Tukey’s test is usually more powerful.

Gabriel’s pairwise comparisons test also uses the Studentized maximum modulus
and is generally more powerful than Hochberg’s GT2 when the cell sizes are unequal.
Gabriel’s test may become liberal when the cell sizes vary greatly.

Dunnett’s pairwise multiple comparison t test compares a set of treatments
against a single control mean. The last category is the default control category;
alternatively, you can choose the first category. You can also choose a two-sided or
one-sided test. To test whether the mean at any level of the factor (except the control
category) differs from that of the control category, use a two-sided test. To test whether
the mean at any level of the factor is smaller than that of the control category, select
< Control. Likewise, to test whether the mean at any level of the factor is larger than
that of the control category, select > Control.
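
The comparison against a control can be illustrated with a minimal sketch. It assumes
the usual form of the statistic, in which each treatment mean is compared with the
control mean using the pooled within-group variance. The data, the group names, and the
function dunnett_t_statistics are hypothetical, and the sketch stops at the statistic:
judging significance requires critical values from Dunnett’s distribution, which are
not computed here.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: one control group and two treatment groups.
groups = {
    "control": [10, 12, 14],
    "a": [13, 15, 17],
    "b": [9, 11, 13],
}


def dunnett_t_statistics(groups, control="control"):
    """Return the t statistic of each treatment mean against the control mean."""
    n_total = sum(len(values) for values in groups.values())
    k = len(groups)
    # Pooled within-group variance (mean square error of the one-way ANOVA).
    ss_within = sum(
        sum((x - mean(values)) ** 2 for x in values) for values in groups.values()
    )
    mse = ss_within / (n_total - k)
    ctrl = groups[control]
    stats = {}
    for name, values in groups.items():
        if name == control:
            continue
        # Standard error of the difference between this mean and the control mean.
        se = sqrt(mse * (1 / len(values) + 1 / len(ctrl)))
        stats[name] = (mean(values) - mean(ctrl)) / se
    return stats


t_stats = dunnett_t_statistics(groups)
for name, t in t_stats.items():
    print(f"{name} vs control: t = {t:.3f}")
```

The sign of each statistic shows the direction of the difference: a positive value is
evidence in the direction tested by > Control, a negative value in the direction tested
by < Control, and a two-sided test looks at the absolute value.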

Ryan, Einot, Gabriel, and Welsch (R-E-G-W) developed two multiple step-down
range tests. Multiple step-down procedures first test whether all means are equal. If the
means are not all equal, subsets of means are then tested for equality. R-E-G-W F is based
on an F test and R-E-G-W Q is based on the Studentized range. These tests are more powerful
than Duncan’s multiple range test and Student-Newman-Keuls (which are also multiple
step-down procedures), but they are not recommended for unequal cell sizes.
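
The step-down idea can be sketched by looking at the significance level applied to a
subset of p means out of k. The text above does not state these levels; the formula
below is the one commonly documented for the R-E-G-W procedures and is an assumption
here, as is the function name regw_levels.

```python
def regw_levels(k, alpha=0.05):
    """Significance level applied to a subset of p means, for p = 2..k.

    Assumed R-E-G-W rule: alpha for p >= k - 1,
    otherwise 1 - (1 - alpha) ** (p / k).
    """
    levels = {}
    for p in range(2, k + 1):
        if p >= k - 1:
            levels[p] = alpha
        else:
            levels[p] = 1 - (1 - alpha) ** (p / k)
    return levels


levels = regw_levels(k=5)
for p, level in levels.items():
    print(f"subset of {p} means: level = {level:.4f}")
```

Under this rule, smaller subsets are tested at stricter levels, which keeps the overall
error rate under control as the procedure steps down from the full set of means.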

When the variances are unequal, use Tamhane’s T2 (a conservative pairwise
comparisons test based on a t test), Dunnett’s T3 (a pairwise comparison test based on
the Studentized maximum modulus), the Games-Howell pairwise comparison test
(sometimes liberal), or Dunnett’s C (a pairwise comparison test based on the
Studentized range).

Duncan’s multiple range test, Student-Newman-Keuls (S-N-K), and Tukey’s-b
are range tests that rank group means and compute a range value. These tests are
used less frequently than the tests discussed previously.

The Waller-Duncan t test uses a Bayesian approach. This range test uses the
harmonic mean of the sample sizes when the sample sizes are unequal.
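
As a small illustration of the harmonic mean of unequal sample sizes, the sketch below
uses Python’s standard library. The group sizes are hypothetical.

```python
from statistics import harmonic_mean

# Hypothetical group sizes for three unequal groups.
sizes = [10, 15, 30]

# Harmonic mean: k divided by the sum of the reciprocals 1/n_i.
n_h = harmonic_mean(sizes)
print(f"harmonic mean of sample sizes: {n_h:.2f}")
```

The harmonic mean is pulled toward the smallest group, so it is lower than the
arithmetic mean of the same sizes.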

The significance level of the Scheffé test is designed to allow all possible linear
combinations of group means to be tested, not just the pairwise comparisons available in
this feature. As a result, the Scheffé test is often more conservative than other
tests, which means that a larger difference between means is required for significance.

The least significant difference (LSD) pairwise multiple comparison test is
equivalent to multiple individual t tests between all pairs of groups. The disadvantage
of this test is that it makes no attempt to adjust the observed significance level for
multiple comparisons.
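
To see why the lack of adjustment matters, the sketch below counts the pairwise
comparisons among k groups and estimates the chance of at least one false positive.
It assumes, for simplicity, that the comparisons are independent, which is not exactly
true when they share groups, so the result is a rough illustration only. The function
name familywise_error is hypothetical.

```python
from math import comb


def familywise_error(k, alpha=0.05):
    """Return the number of pairwise comparisons and the approximate chance
    of at least one false positive, assuming independent comparisons."""
    m = comb(k, 2)  # number of distinct pairs among k groups
    return m, 1 - (1 - alpha) ** m


m, fwer = familywise_error(5)
print(f"{m} comparisons, approximate family-wise error rate = {fwer:.3f}")
```

With five groups there are ten unadjusted t tests, and under this simplification the
chance of at least one spurious result is far above the nominal 0.05 of each test.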