Abstract

Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments. Copyright © 2010 John Wiley & Sons, Ltd.

Keywords

Imputation (statistics), Categorical variable, Computer science, Missing data, Data mining, Machine learning

Publication Info

Year: 2010
Type: article
Volume: 30
Issue: 4
Pages: 377-399
Citations: 8772
Access: Closed

Citation Metrics

8772 (OpenAlex)

Cite This

Ian R. White, Patrick Royston, Angela Wood (2010). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(4), 377-399. https://doi.org/10.1002/sim.4067

Identifiers

DOI: 10.1002/sim.4067