A comparison of internal validation methods for validating predictive models for binary data with rare events

In clinical research, prediction models for binary data are frequently developed
in a logistic regression framework to predict the risk of a patient’s health
outcome, such as death or illness. However, when the outcome is rare, standard
logistic regression based on maximum likelihood (ML) has been reported to show
poor predictive performance because it yields an overfitted model. To overcome
this, penalized maximum likelihood (PML) based logistic models are widely used in
risk prediction; however, their predictive performance in validation settings is
not well documented. Several validation approaches, namely split-sample,
cross-validation, bootstrap validation, and its two variants 0.632 and 0.632+,
have been widely used to validate the performance of a prediction model; however,
it is also unclear which of these approaches best estimates the true predictive
performance of a rare-outcome model. This paper focused on evaluating the
predictive performance of the PML-based logistic model in such validation
settings, in comparison with the ML-based standard model, and on identifying the
most effective validation method. An extensive simulation study was performed by
creating several
scenarios to reflect modeling in datasets with few events. The results revealed
that the PML-based model performed better than the ML-based model, reducing
overfitting to some extent and increasing discriminatory ability, irrespective of
validation methods under study. Of the validation methods, the regular bootstrap
and its variants 0.632 and 0.632+, particularly 0.632+, performed well, providing
nearly accurate and stable estimates of the true predictive performance. We
also illustrated the methods by applying them to a cardiac dataset with few events.
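
The abstract names the 0.632 and 0.632+ bootstrap variants without stating their formulas. As a minimal sketch, the following follows the commonly cited Efron and Tibshirani definitions, assuming the performance measure is a misclassification error rate (lower is better) and that the apparent error and the out-of-bag bootstrap error have already been computed. The function names are hypothetical, and the paper's own performance measures and implementation may differ.

```python
def no_information_error(y_true, y_pred):
    """No-information error rate for 0/1 labels and 0/1 predictions.

    gamma = p * (1 - q) + (1 - p) * q, where p is the observed event
    proportion and q is the proportion of predicted events.
    """
    n = len(y_true)
    p = sum(y_true) / n
    q = sum(y_pred) / n
    return p * (1 - q) + (1 - p) * q


def err_632(err_app, err_oob):
    """0.632 estimator: fixed-weight blend of apparent and out-of-bag error."""
    return 0.368 * err_app + 0.632 * err_oob


def err_632_plus(err_app, err_oob, gamma):
    """0.632+ estimator: the weight on the out-of-bag error grows with the
    relative overfitting rate R, which lies between 0 and 1."""
    # Cap the out-of-bag error at the no-information rate.
    err_oob = min(err_oob, gamma)
    if err_oob > err_app and gamma > err_app:
        r = (err_oob - err_app) / (gamma - err_app)
    else:
        r = 0.0  # no measurable overfitting: reduces to the 0.632 estimator
    w = 0.632 / (1 - 0.368 * r)
    return (1 - w) * err_app + w * err_oob


if __name__ == "__main__":
    # Illustrative inputs only; these are not results from the paper.
    gamma = no_information_error([1, 0, 0, 0], [1, 1, 0, 0])
    print(err_632(0.1, 0.3), err_632_plus(0.1, 0.3, gamma))
```

With no overfitting (R = 0) the 0.632+ weight equals 0.632, so the two estimators coincide. With maximal overfitting (R = 1) the weight becomes 1 and the estimate is the out-of-bag error alone.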