Abstract

Missing data problems have been a thorn in the side of prevention researchers for years. Although some solutions to these problems have been available in the statistical literature, they have not found their way into mainstream prevention research. This chapter is meant to serve as an introduction to the systematic application of the missing data analysis solutions presented recently by Little and Rubin (1987) and others. The chapter does not describe a complete strategy, but it is relevant for (1) missing data analysis with continuous (but not categorical) data, (2) data that are reasonably normally distributed, and (3) solutions for missing data problems in analyses related to the general linear model, in particular analyses that use (or can use) a covariance matrix as input. The examples in the chapter come from drug prevention research. The chapter discusses (1) the problem of wanting to ask respondents more questions than most individuals can answer; (2) the problem of attrition and some solutions; and (3) the problem of special measurement procedures that are too expensive or time consuming to obtain for all subjects.
The authors end with several conclusions: (1) whenever possible, researchers should use the Expectation-Maximization (EM) algorithm (or another maximum likelihood procedure, including the multiple-group structural equation modeling procedure) or, where appropriate, multiple imputation for analyses involving missing data (the chapter provides concrete examples); (2) if researchers must use other analyses, they should keep in mind that these produce biased results and should not be relied upon for final analyses; (3) when data are missing, the appropriate missing data analysis procedures do not generate something out of nothing but do make the most of the data available; (4) when data are missing, researchers should work hard (especially when planning a study) to find the cause of missingness and include that cause in the analysis models; and (5) researchers should sample the cases originally missing (whenever possible) and adjust EM algorithm parameter estimates accordingly.
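
As a rough illustration of the first conclusion, the sketch below shows one common way an EM algorithm can estimate a mean vector and covariance matrix from continuous, approximately normal data with missing values. It is not taken from the chapter: the function name `em_mean_cov`, its parameters, and the use of NumPy with `NaN` as the missing-value code are all assumptions made for this example, and it presumes that every variable is observed for at least some cases and has nonzero variance.

```python
import numpy as np

def em_mean_cov(X, n_iter=200, tol=1e-8):
    """Hypothetical helper: EM estimates of the mean and covariance of
    multivariate normal data, with missing entries coded as NaN."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Starting values: available-case means and a diagonal covariance.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            extra = np.zeros((p, p))  # conditional covariance of missing part
            if m.any():
                if o.any():
                    # E-step: regress the missing variables on the observed ones.
                    B = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)]).T
                    x[m] = mu[m] + B @ (x[o] - mu[o])
                    extra[np.ix_(m, m)] = sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
                else:
                    # Nothing observed for this case: fall back on current estimates.
                    x[m] = mu[m]
                    extra[np.ix_(m, m)] = sigma[np.ix_(m, m)]
            sum_x += x
            sum_xx += np.outer(x, x) + extra
        # M-step: update the estimates from the expected sufficient statistics.
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma
```

The returned covariance matrix is a maximum likelihood estimate (divisor n), and it could in principle serve as input to covariance-based analyses of the kind the abstract mentions. Standard errors from such analyses would still need separate care, since the effective sample size is not simply the number of cases.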

Keywords

Missing data, Imputation (statistics), Categorical variable, Computer science, Data mining, Data science, Machine learning

Related Publications

Applied Missing Data Analysis

Part 1. An Introduction to Missing Data. 1.1 Introduction. 1.2 Chapter Overview. 1.3 Missing Data Patterns. 1.4 A Conceptual Overview of Missing Data Theory. 1.5 A More Formal De...

2010 6888 citations

Much Ado About Nothing

Missing data are a recurring problem that can cause bias or lead to inefficient analyses. Development of statistical methods to address missingness has been actively pursued in...

2007 The American Statistician 759 citations

Publication Info

Year
1997
Type
article
Volume
142
Pages
325-366
Citations
269
Access
Closed

Citation Metrics

269 (OpenAlex)

Cite This

John W. Graham, Scott M. Hofer, Stewart I. Donaldson et al. (1997). Analysis with missing data in prevention research. American Psychological Association eBooks, 142, 325-366. https://doi.org/10.1037/10222-010

Identifiers

DOI
10.1037/10222-010