A SIMULATION STUDY OF CROSS-VALIDATION FOR SELECTING AN OPTIMAL CUTPOINT IN UNIVARIATE SURVIVAL ANALYSIS

David Faraggi; Richard Simon

doi:10.1002/(sici)1097-0258(19961030)15:20<2203::aid-sim357>3.0.co;2-g

Abstract

Continuous measurements are often dichotomized to classify subjects. This paper evaluates two procedures for determining a best cutpoint for a continuous prognostic factor with right-censored outcome data. One procedure selects the cutpoint that minimizes the significance level of a logrank test comparing the two groups defined by the cutpoint, and then adjusts that significance level for maximal selection. The other procedure uses a cross-validation approach, which extends easily to accommodate multiple other prognostic factors. We compare the methods in terms of statistical power and bias in estimation of the true relative risk associated with the prognostic factor. Both procedures produce approximately the correct type I error rate. Use of a maximally selected cutpoint without adjustment of the significance level, however, results in a substantially elevated type I error rate. Under the null hypothesis, the cross-validation procedure gave an unbiased estimate of the relative risk, while the procedure based on the maximally selected test gave an upwardly biased estimate. When the relative risk for the two groups defined by the covariate and true changepoint was small, the cross-validation procedure provided greater power than the maximally selected test. The cross-validation based estimate of relative risk was unbiased, while the estimate from the maximally selected test was biased. As the true relative risk increased, the power of the maximally selected test was about 10 per cent greater than the power obtained using cross-validation. The maximally selected test overestimated the relative risk by about 10 per cent. The cross-validation procedure underestimated the true relative risk by at most 5 per cent. Finally, we report the effect of dichotomizing a continuous non-linear relationship between covariate and risk. We compare a linear proportional hazards model with models based on optimally selected cutpoints. Our simulation study indicates that cutpoint models can lose substantial statistical power when the relationship between covariate and risk is continuous.
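The two procedures compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 10th to 90th percentile search range, the default of two folds, and the random fold assignment are all assumptions made for illustration, and the significance-level adjustment for maximal selection is omitted. Maximizing the one-degree-of-freedom logrank chi-square is equivalent to minimizing its significance level.

```python
import random

def logrank_chi2(time, event, high):
    """Two-sample logrank chi-square (1 d.f.); `high` flags group membership."""
    o_minus_e, var = 0.0, 0.0
    # Loop over the distinct times at which at least one event occurred.
    for t in sorted({ti for ti, ei in zip(time, event) if ei}):
        n = sum(ti >= t for ti in time)                        # at risk overall
        n1 = sum(ti >= t and h for ti, h in zip(time, high))   # at risk, high group
        d = sum(ti == t and ei for ti, ei in zip(time, event)) # events at t
        d1 = sum(ti == t and ei and h
                 for ti, ei, h in zip(time, event, high))      # events at t, high group
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0

def best_cutpoint(time, event, x, lo=0.1, hi=0.9):
    """Maximally selected cutpoint: the candidate with the largest logrank chi-square.

    Candidates are observed covariate values between the `lo` and `hi`
    sample quantiles (an assumed search range). Needs enough data for at
    least one candidate. Returns (cutpoint, statistic).
    """
    xs = sorted(x)
    n = len(xs)
    candidates = sorted(set(xs[int(lo * n):int(hi * n)]))
    scored = [(logrank_chi2(time, event, [xi > c for xi in x]), c)
              for c in candidates]
    stat, c = max(scored)
    return c, stat

def cv_groups(time, event, x, k=2, seed=0):
    """Assign each subject to the high or low group using a cutpoint
    selected only from the folds that do not contain that subject."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    high = [False] * len(x)
    for f in range(k):
        test = set(idx[f::k])
        train = [i for i in idx if i not in test]
        c, _ = best_cutpoint([time[i] for i in train],
                             [event[i] for i in train],
                             [x[i] for i in train])
        for i in test:
            high[i] = x[i] > c
    return high
```

In this sketch, `logrank_chi2(time, event, cv_groups(time, event, x))` gives a single logrank statistic for the whole sample in which no subject's group was chosen using that subject's own outcome, which is the intuition behind avoiding the optimistic bias of the unadjusted maximally selected test. By contrast, the statistic returned by `best_cutpoint` on the full sample is the maximally selected one and would need the adjustment mentioned in the abstract before being compared with a chi-square reference distribution.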

Keywords

Univariate, Statistics, Univariate analysis, Computer science, Survival analysis, Cross-validation, Multivariate analysis, Mathematics, Multivariate statistics

Affiliated Institutions

National Cancer Institute US

Publication Info

Year: 1996
Type: article
Volume: 15
Issue: 20
Pages: 2203-2213
Citations: 143
Access: Closed

Cite This

APA Style

David Faraggi, Richard Simon (1996). A SIMULATION STUDY OF CROSS-VALIDATION FOR SELECTING AN OPTIMAL CUTPOINT IN UNIVARIATE SURVIVAL ANALYSIS. Statistics in Medicine, 15(20), 2203-2213. https://doi.org/10.1002/(sici)1097-0258(19961030)15:20<2203::aid-sim357>3.0.co;2-g

Identifiers

DOI: 10.1002/(sici)1097-0258(19961030)15:20<2203::aid-sim357>3.0.co;2-g