Abstract

The problem of testing a point null hypothesis (or a "small interval" null hypothesis) is considered. Of interest is the relationship between the P value (or observed significance level) and conditional and Bayesian measures of evidence against the null hypothesis. Although one might presume that a small P value indicates the presence of strong evidence against the null, such is not necessarily the case. Expanding on earlier work [especially Edwards, Lindman, and Savage (1963) and Dickey (1977)], it is shown that actual evidence against a null (as measured, say, by posterior probability or comparative likelihood) can differ by an order of magnitude from the P value. For instance, data that yield a P value of .05, when testing a normal mean, result in a posterior probability of the null of at least .30 for any objective prior distribution. ("Objective" here means that equal prior weight is given the two hypotheses and that the prior is symmetric and nonincreasing away from the null; other definitions of "objective" will be seen to yield qualitatively similar results.) The overall conclusion is that P values can be highly misleading measures of the evidence provided by the data against the null hypothesis.
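The normal-mean example in the abstract can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' own computation: it assumes a standardized test statistic t with known variance, and it relies on the standard fact that any density that is symmetric and nonincreasing away from the null is a mixture of uniform densities on (-K, K), so the supremum of the marginal likelihood under the alternative can be searched over K alone. All function names are hypothetical, and the grid search is a crude stand-in for a proper optimizer.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_p_value(t):
    """P value for a two-sided test of a normal mean, standardized statistic t."""
    return 2.0 * (1.0 - normal_cdf(abs(t)))


def marginal_under_uniform(t, k):
    """Marginal density of t when the mean is uniform on (-k, k) (assumed prior form)."""
    return (normal_cdf(k - t) - normal_cdf(-k - t)) / (2.0 * k)


def min_bayes_factor(t, k_max=20.0, steps=20000):
    """Smallest Bayes factor for the null over symmetric uniform priors.

    Grid search over the half-width k; k_max and steps are arbitrary
    illustrative choices, not values taken from the paper.
    """
    # As k -> 0 the uniform prior collapses onto the null, so the
    # marginal tends to the null density itself.
    best = normal_pdf(t)
    for i in range(1, steps + 1):
        k = k_max * i / steps
        best = max(best, marginal_under_uniform(t, k))
    return normal_pdf(t) / best


def posterior_null_lower_bound(t, prior_null=0.5):
    """Lower bound on P(null | data) given prior weight prior_null on the null."""
    prior_odds_alt = (1.0 - prior_null) / prior_null
    return 1.0 / (1.0 + prior_odds_alt / min_bayes_factor(t))


if __name__ == "__main__":
    t = 1.96  # two-sided P value of about .05
    print(round(two_sided_p_value(t), 3))
    print(round(posterior_null_lower_bound(t), 2))
```

With equal prior weight on the two hypotheses, t = 1.96 gives a bound of roughly .29 under these assumptions, close to the abstract's figure of .30 and several times larger than the P value of .05.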

Keywords

Null hypothesis, Mathematics, Null (SQL), Statistical hypothesis testing, Statistics, Econometrics, Computer science, Data mining

Related Publications

Testing Precise Hypotheses

Testing of precise (point or small interval) hypotheses is reviewed, with special emphasis placed on exploring the dramatic conflict between conditional measures (Bayes factors ...

1987 Statistical Science 676 citations

Bayes Factors

Abstract In a 1935 paper and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece wa...

1995 Journal of the American Statistical A... 11631 citations

A Direct Approach to False Discovery Rates

Summary Multiple-hypothesis testing involves guarding against much more complicated errors than single-hypothesis testing. Whereas we typically control the type I error rate for...

2002 Journal of the Royal Statistical Soci... 5607 citations

Publication Info

Year
1987
Type
article
Volume
82
Issue
397
Pages
112-122
Citations
777
Access
Closed

Cite This

James O. Berger, Thomas Sellke (1987). Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence. Journal of the American Statistical Association, 82(397), 112-122. https://doi.org/10.1080/01621459.1987.10478397

Identifiers

DOI
10.1080/01621459.1987.10478397