Imputation of systematically missing predictors in an individual participant data meta-analysis: a generalized approach using MICE

2015 Vienna

Debray T¹, Jolani S², Koffijberg H¹, van Buuren S², Moons K¹

¹Julius Center for Health Sciences and Primary Care, The Netherlands

²Utrecht University, The Netherlands

Background: Individual participant data meta-analyses (IPD-MA) are increasingly used for developing and validating multivariable (diagnostic or prognostic) risk prediction models. Unfortunately, some predictors or even outcomes may not have been measured in every study and are thus systematically missing in some individual studies of the IPD-MA. As a consequence, it is no longer possible to include all individual studies, to evaluate between-study heterogeneity, or to estimate study-specific predictor effects, which severely hampers the development and validation of prediction models.
Objectives: To describe a novel approach for imputing systematically missing data, which adopts a generalized linear mixed model to allow for between-study heterogeneity.
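For orientation, a generalized linear mixed model for an incomplete predictor can be written in the following generic form. This is a textbook sketch, not the authors' exact specification: the symbols $x_{ij}$, $z_{ij}$, $w_{ij}$, $\beta$, $b_j$, $\Psi$ and the link function $g$ are notation assumed here for illustration.

```latex
g\bigl(\mathrm{E}[x_{ij} \mid b_j]\bigr) = z_{ij}^{\top}\beta + w_{ij}^{\top} b_j,
\qquad b_j \sim \mathcal{N}(0, \Psi)
```

Here $x_{ij}$ is the incomplete predictor for participant $i$ in study $j$, $z_{ij}$ collects the other variables in the imputation model, $w_{ij}$ is the subset of those variables given study-specific (random) effects $b_j$, and $\Psi$ describes the between-study heterogeneity. Under a model of this kind, a study in which $x$ was never measured contributes no direct information about its own $b_j$, so its imputations would have to rely on the estimated between-study distribution.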
Methods: We illustrate our approach using a case study: an IPD-MA of 13 studies to develop and validate a diagnostic prediction model for the presence of deep venous thrombosis. We compare the results of four methods for dealing with predictors that are systematically missing in one or more individual studies: complete case analysis (CCA), in which studies with systematically missing predictors are removed; traditional multiple imputation (TMI), which ignores heterogeneity across studies; stratified multiple imputation (SMI), which accounts for heterogeneity in predictor prevalence; and multilevel multiple imputation (MLMI), which fully accounts for between-study heterogeneity.
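To make the distinction between systematically and sporadically missing predictors concrete, the following minimal Python sketch flags studies in which a predictor was never recorded and then applies a study-level complete case analysis. The data layout, the function names, the study labels and the predictor names `x1` and `x2` are all hypothetical and chosen only for illustration; they are not taken from the case study.

```python
from typing import Dict, List, Optional

# One record per participant; None marks a missing value (assumed layout).
Record = Dict[str, Optional[float]]


def systematically_missing(ipd: Dict[str, List[Record]], predictor: str) -> List[str]:
    """Return the studies in which `predictor` is missing for every participant."""
    return [
        study
        for study, rows in ipd.items()
        if all(row.get(predictor) is None for row in rows)
    ]


def complete_case_studies(
    ipd: Dict[str, List[Record]], predictors: List[str]
) -> Dict[str, List[Record]]:
    """Study-level CCA: drop every study that lacks at least one predictor entirely."""
    dropped = {s for p in predictors for s in systematically_missing(ipd, p)}
    return {s: rows for s, rows in ipd.items() if s not in dropped}


# Hypothetical toy data: three studies, two predictors.
ipd = {
    "study_A": [{"x1": 1.0, "x2": 0.0}, {"x1": 0.0, "x2": 1.0}],
    # x1 was never measured in study_B: systematically missing.
    "study_B": [{"x1": None, "x2": 0.0}, {"x1": None, "x2": 0.0}],
    # study_C has gaps in individual records only: sporadically missing.
    "study_C": [{"x1": 1.0, "x2": None}, {"x1": None, "x2": 1.0}],
}

missing_x1 = systematically_missing(ipd, "x1")
retained = complete_case_studies(ipd, ["x1", "x2"])
print(missing_x1)
print(sorted(retained))
```

In this toy example only `study_B` is removed by the study-level CCA, which illustrates how that approach discards every participant of a study because of a single unmeasured predictor.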
Results: CCA gave suboptimal results, which became completely unreliable when the predictors were no longer missing completely at random. TMI and SMI tended to mask the actual degree of between-study heterogeneity and often led to overoptimistic standard errors of predictor effects. MLMI was the optimal approach in terms of coverage and bias, and the only approach that was able to ensure compatibility of imputation and analysis models.
Conclusions: MLMI may substantially improve the estimation of between-study heterogeneity parameters and allow for imputation of systematically missing predictors in IPD-MA aimed at the development and validation of prediction models.