Abstract

This article describes and evaluates a procedure for imputing missing values for a relatively complex data structure when the data are missing at random. The imputations are obtained by fitting a sequence of regression models and drawing values from the corresponding predictive distributions. The types of regression models used are linear, logistic, Poisson, generalized logit or a mixture of these depending on the type of variable being imputed. Two additional common features in the imputation process are incorporated: restriction to a relevant subpopulation for some variables and logical bounds or constraints for the imputed values. The restrictions involve subsetting the sample individuals that satisfy certain criteria while fitting the regression models. The bounds involve drawing values from a truncated predictive distribution. The development of this method was partly motivated by the analysis of two data sets which are used as illustrations. The sequential regression procedure is applied to perform multiple imputation analysis for the two applied problems. The sampling properties of inferences from multiply imputed data sets created using the sequential regression method are evaluated through simulated data sets.
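The core loop described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes every variable is continuous and imputed with a normal linear model, where the article also uses logistic, Poisson, generalized logit, and mixed models. It starts from column means and visits variables from least to most missing, both of which are choices made for this sketch. Bounds are handled by redrawing out-of-range values, and the subpopulation restrictions are left out. The function names `draw_linear_imputations` and `sequential_regression_impute`, and the `bounds` argument, are hypothetical.

```python
import numpy as np


def draw_linear_imputations(X_obs, y_obs, X_mis, rng, lower=-np.inf, upper=np.inf):
    """Fit a linear regression on the observed rows, then draw the missing
    values from an approximate predictive distribution truncated to
    [lower, upper]."""
    n, p = X_obs.shape
    xtx_inv = np.linalg.pinv(X_obs.T @ X_obs)
    beta_hat = xtx_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    df = max(n - p, 1)
    # Draw the residual variance, then the coefficients given that variance,
    # so the imputations reflect parameter uncertainty.
    sigma2 = (resid @ resid) / rng.chisquare(df)
    cov = sigma2 * (xtx_inv + xtx_inv.T) / 2.0  # symmetrize for numerical safety
    beta = rng.multivariate_normal(beta_hat, cov)
    mean = X_mis @ beta
    sd = np.sqrt(sigma2)
    draws = mean + sd * rng.standard_normal(len(mean))
    # Truncation by rejection: redraw any value outside the bounds.
    for _ in range(1000):
        bad = (draws < lower) | (draws > upper)
        if not bad.any():
            break
        draws[bad] = mean[bad] + sd * rng.standard_normal(int(bad.sum()))
    # Fallback for values that never landed inside the bounds.
    return np.clip(draws, lower, upper)


def sequential_regression_impute(data, n_iter=10, bounds=None, seed=0):
    """Return one completed copy of `data` (2-D array, NaN marks missing)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    # Starting values: observed column means (an assumption of this sketch).
    col_means = np.nanmean(data, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    bounds = bounds or {}
    order = np.argsort(missing.sum(axis=0))  # least missing first
    for _ in range(n_iter):
        for j in order:
            mis = missing[:, j]
            if not mis.any():
                continue
            obs = ~mis
            # Regress variable j on all other variables, using their
            # current (observed or imputed) values as predictors.
            others = np.delete(filled, j, axis=1)
            X = np.column_stack([np.ones(len(filled)), others])
            lo, hi = bounds.get(j, (-np.inf, np.inf))
            filled[mis, j] = draw_linear_imputations(
                X[obs], filled[obs, j], X[mis], rng, lo, hi
            )
    return filled
```

Calling `sequential_regression_impute` several times with different `seed` values yields several completed data sets, which is the input a multiple imputation analysis needs. Each data set would then be analyzed separately and the results combined.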

Keywords

Missing data, Imputation (statistics), Statistics, Mathematics, Logistic regression, Multivariate statistics, Regression analysis, Regression, Computer science

Publication Info

Year: 2001
Type: article
Volume: 27
Pages: 85-95
Citations: 1994
Access: Closed

Citation Metrics

1994 citations (OpenAlex)

Cite This

Trivellore E. Raghunathan, James M. Lepkowski, John Van Hoewyk et al. (2001). A multivariate technique for multiply imputing missing values using a sequence of regression models. Survey Methodology, 27, 85-95.