Abstract

Three alternative procedures to adjust significance levels for multiplicity are the traditional Bonferroni technique, a sequential Bonferroni technique developed by Hochberg (1988), and a sequential approach for controlling the false discovery rate proposed by Benjamini and Hochberg (1995). These procedures are illustrated and compared using examples from the National Assessment of Educational Progress (NAEP). A prominent advantage of the Benjamini and Hochberg (B-H) procedure, as demonstrated in these examples, is the greater invariance of statistical significance for given comparisons over alternative family sizes. Simulation studies show that all three procedures maintain a false discovery rate bounded above, often grossly, by α (or α/2). For both uncorrelated and pairwise families of comparisons, the B-H technique is shown to have greater power than the Hochberg or Bonferroni procedures, and its power remains relatively stable as the number of comparisons becomes large, giving it an increasing advantage when many comparisons are involved. We recommend that results from NAEP State Assessments be reported using the B-H technique rather than the Bonferroni procedure.
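The three procedures named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch based on the standard textbook definitions of the Bonferroni, Hochberg (1988) step-up, and Benjamini and Hochberg (1995) procedures, not code from the article. The function names and the example p-values are hypothetical, and a plain α = 0.05 is assumed rather than the α/2 variant the abstract mentions.

```python
from typing import List


def bonferroni(pvals: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject H_i when p_i <= alpha / m (single fixed threshold)."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def _step_up(pvals: List[float], thresholds: List[float]) -> List[bool]:
    """Generic step-up rule: find the largest rank k (p-values sorted
    ascending, 1-indexed) with p_(k) <= thresholds[k-1], then reject the
    hypotheses holding ranks 1..k."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= thresholds[rank - 1]:
            cutoff = rank
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


def hochberg(pvals: List[float], alpha: float = 0.05) -> List[bool]:
    """Hochberg step-up: threshold for rank k is alpha / (m - k + 1)."""
    m = len(pvals)
    return _step_up(pvals, [alpha / (m - k + 1) for k in range(1, m + 1)])


def benjamini_hochberg(pvals: List[float], alpha: float = 0.05) -> List[bool]:
    """B-H step-up: threshold for rank k is k * alpha / m."""
    m = len(pvals)
    return _step_up(pvals, [k * alpha / m for k in range(1, m + 1)])


# Hypothetical p-values for a family of five comparisons.
example_pvals = [0.001, 0.012, 0.02, 0.03, 0.2]
n_bonferroni = sum(bonferroni(example_pvals))        # 1 rejection
n_hochberg = sum(hochberg(example_pvals))            # 2 rejections
n_bh = sum(benjamini_hochberg(example_pvals))        # 4 rejections
```

On this made-up family the rejection counts grow from Bonferroni to Hochberg to B-H, which is consistent with the power ordering the abstract reports, although a single example is only an illustration and not evidence of that ordering in general.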

Keywords

Bonferroni correction; False discovery rate; Multiple comparisons problem; Pairwise comparison; Statistics; Mathematics; Statistical hypothesis testing; Word error rate; Uncorrelated; Type I and type II errors; Computer science; Artificial intelligence; Biology

Related Publications

A Direct Approach to False Discovery Rates

Summary: Multiple-hypothesis testing involves guarding against much more complicated errors than single-hypothesis testing. Whereas we typically control the type I error rate for...

2002 Journal of the Royal Statistical Soci... 5607 citations

Publication Info

Year: 1999
Type: article
Volume: 24
Issue: 1
Pages: 42-69
Citations: 182
Access: Closed

Citation Metrics

182 (OpenAlex)

Cite This

Valerie Williams, Lyle V. Jones, John W. Tukey (1999). Controlling Error in Multiple Comparisons, with Examples from State-to-State Differences in Educational Achievement. Journal of Educational and Behavioral Statistics, 24(1), 42-69. https://doi.org/10.3102/10769986024001042

Identifiers

DOI
10.3102/10769986024001042