Abstract
Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be unbiasedly validated using bootstrapping or cross-validation, before using predictions in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods are illustrated with a survival analysis in prostate cancer using Cox regression.
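The abstract does not name its index of predictive discrimination. As an illustration only, the sketch below assumes a concordance-type index: among pairs of subjects whose ordering of survival times can be determined despite censoring, it is the fraction in which the subject with the higher predicted risk fails first. The function name, argument names, and the handling of tied times are assumptions made for this example, not details taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Concordance for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time for each subject
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            # Assumption: pairs with tied times are skipped for simplicity.
            continue
        # a is the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            # The shorter time is censored, so the true ordering is unknown.
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs: cannot compute concordance")
    return concordant / usable


# Hypothetical toy data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.4, 0.2]))
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect discrimination. A value computed on the data used to fit the model is optimistic, which is why the abstract calls for validation by bootstrapping or cross-validation.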
Related Publications
R-Squared Measures for Count Data Regression Models with Applications to Health-Care Utilization
For regression models other than the linear model, R-squared type goodness-of-fit summary statistics have been constructed for particular models using a variety of methods. The ...
MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package
Generalized linear mixed models provide a flexible framework for modeling a range of data, although with non-Gaussian response variables the likelihood cannot be obtained in clo...
Structural equation modeling in practice: A review and recommended two-step approach.
In this article, we provide guidance for substantive researchers on the use of structural equation modeling in practice for theory testing and development. We present a comprehe...
Model Uncertainty, Data Mining and Statistical Inference
This paper takes a broad, pragmatic view of statistical inference to include all aspects of model formulation. The estimation of model parameters traditionally assumes that a mo...
Analysis of Longitudinal Data
1. Introduction 2. Design considerations 3. Exploring longitudinal data 4. General linear models 5. Parametric models for covariance structure 6. Analysis of variance methods 7....
Publication Info
- Year: 1996
- Type: review
- Volume: 15
- Issue: 4
- Pages: 361-387
- Citations: 9497
- Access: Closed
Identifiers
- DOI: 10.1002/(sici)1097-0258(19960229)15:4<361::aid-sim168>3.0.co;2-4