Abstract

When making sampling distribution inferences about the parameter of the data, θ, it is appropriate to ignore the process that causes missing data if the missing data are 'missing at random' and the observed data are 'observed at random', but these inferences are generally conditional on the observed pattern of missing data. When making direct-likelihood or Bayesian inferences about θ, it is appropriate to ignore the process that causes missing data if the missing data are missing at random and the parameter of the missing data process is 'distinct' from θ. These conditions are the weakest general conditions under which ignoring the process that causes missing data always leads to correct inferences.
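
The likelihood half of this statement can be sketched as a short factorization. The notation below follows later textbook treatments rather than the paper's own symbols, so every name is an assumption introduced here: $y = (y_{\mathrm{obs}}, y_{\mathrm{mis}})$ is the complete data with density $f(y \mid \theta)$, $m$ is the observed pattern of missing data, and $g(m \mid y, \phi)$ is the missing-data process with parameter $\phi$.

```latex
\begin{align*}
% Joint density of what is actually observed: the observed values and the pattern m
f(y_{\mathrm{obs}}, m \mid \theta, \phi)
  &= \int f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)\,
          g(m \mid y_{\mathrm{obs}}, y_{\mathrm{mis}}, \phi)\, dy_{\mathrm{mis}} \\[4pt]
% Missing at random: at the observed (m, y_obs), g does not depend on y_mis
g(m \mid y_{\mathrm{obs}}, y_{\mathrm{mis}}, \phi)
  &= g(m \mid y_{\mathrm{obs}}, \phi)
  \quad \text{for all } y_{\mathrm{mis}} \text{ and each } \phi \\[4pt]
% Under MAR the process factor leaves the integral
f(y_{\mathrm{obs}}, m \mid \theta, \phi)
  &= g(m \mid y_{\mathrm{obs}}, \phi)
     \int f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)\, dy_{\mathrm{mis}}
   = g(m \mid y_{\mathrm{obs}}, \phi)\, f(y_{\mathrm{obs}} \mid \theta)
\end{align*}
```

If $\phi$ is distinct from $\theta$, the factor $g(m \mid y_{\mathrm{obs}}, \phi)$ carries no information about $\theta$, so the likelihood for $\theta$ is proportional to $f(y_{\mathrm{obs}} \mid \theta)$, which is the likelihood obtained by ignoring the process that causes missing data.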

Keywords

Missing data, Imputation (statistics), Inference, Mathematics, Statistics, Conditional probability distribution, Process (computing), Data mining, Bayesian probability, Econometrics, Computer science, Artificial intelligence

Publication Info

Year: 1976
Type: article
Volume: 63
Issue: 3
Pages: 581-592
Citations: 9337
Access: Closed

Cite This

Donald B. Rubin (1976). Inference and missing data. Biometrika, 63(3), 581-592. https://doi.org/10.1093/biomet/63.3.581

Identifiers

DOI: 10.1093/biomet/63.3.581