Aggregating published prediction models with individual patient data: A comparison of different approaches

2011 Madrid

Debray T¹, Koffijberg H¹, Vergouwe Y², Steyerberg E³, Moons K¹

¹Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Netherlands

²Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Netherlands; Department of Public Health, Erasmus MC, Netherlands

³Department of Public Health, Erasmus MC, Netherlands

Background: During recent decades, interest in prediction models has substantially increased, but approaches to synthesize evidence from previously developed models have failed to keep pace. This causes researchers to ignore potentially useful past evidence when developing a novel prediction model with individual patient data (IPD) from their population of interest.

Objectives: We aimed to evaluate approaches to aggregate previously published prediction models. We considered the situation in which models reported in the literature have predictors similar to those available in an IPD dataset. Two approaches calculate an overall summary model, as in a typical meta-analysis. A third approach takes a Bayesian perspective that focuses on the IPD, using the principles of random-effects meta-analysis with penalization to stabilize the between-study covariance of the previously published regression coefficients.
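
As a rough illustration of the ideas named above, the sketch below pools a single regression coefficient across published models with a random-effects meta-analysis, then combines that summary with an estimate from the IPD. It is a simplified, univariate stand-in and not the method evaluated in this abstract: the DerSimonian-Laird estimator, the normal-normal update, and the function names `pool_coefficient` and `combine_with_ipd` are all assumptions made for illustration, and the penalized between-study covariance of the multivariate case is not shown.

```python
from math import sqrt


def pool_coefficient(betas, ses):
    """Random-effects pooling of one coefficient (DerSimonian-Laird, assumed here)."""
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching betas and ses")
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    beta_fe = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q measures dispersion of the study estimates around beta_fe.
    q = sum(wi * (b - beta_fe) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(betas) - 1)) / c)
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se_re = sqrt(1.0 / sum(w_re))
    return beta_re, se_re, tau2


def combine_with_ipd(beta_lit, se_lit, tau2, beta_ipd, se_ipd):
    """Precision-weighted (normal-normal) update of the IPD estimate.

    The literature acts as a prior whose variance includes tau2, so strong
    heterogeneity weakens its influence on the combined estimate.
    """
    prior_var = se_lit ** 2 + tau2
    ipd_var = se_ipd ** 2
    post_var = 1.0 / (1.0 / prior_var + 1.0 / ipd_var)
    post_mean = post_var * (beta_lit / prior_var + beta_ipd / ipd_var)
    return post_mean, sqrt(post_var)
```

In this sketch, a large between-study variance or a small IPD standard error shifts the combined estimate toward the IPD, which matches the intuition that published evidence helps most when heterogeneity is low and the IPD sample is small.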

Methods: The performance of the approaches was examined in a simulation study with different degrees of between-study heterogeneity. The approaches were further applied to a collection of 15 datasets of patients with traumatic brain injury, where we aimed to predict 6-month outcome.

Results: With no or weak heterogeneity, each of the 3 aggregation methods led to substantially better performance than ignoring previously developed models. With moderate heterogeneity, the Bayesian approach was preferable. Using the IPD alone was optimal only under strong heterogeneity or when the IPD sample size was large relative to the literature.

Conclusions: Incorporating previously published prediction models into the development of a novel prediction model with a similar set of predictors is both feasible and beneficial when IPD are available. However, it remains paramount that researchers assess the extent to which the previously published prediction models are comparable with the available IPD, as the justification of the considered approaches depends on the clinical relevance of the aggregated model. Future research may therefore focus on the quantification of heterogeneity across prediction models.