Multiple imputation of systematically missing predictors in an individual participant data meta-analysis: a generalized approach using MICE

Authors
Debray T1, Jolani S2, Koffijberg H1, van Buuren S3, Moons K1
1University Medical Center Utrecht, The Netherlands
2Utrecht University, The Netherlands
3TNO Quality of Life, The Netherlands
Abstract
Background:
Individual participant data meta-analyses (IPD-MA) are increasingly used to develop and validate multivariable (diagnostic or prognostic) risk prediction models. Unfortunately, some predictors, or even outcomes, may not have been measured in every study and are thus systematically missing in the IPD-MA. For these variables it is then no longer possible to evaluate between-study heterogeneity or to estimate study-specific predictor effects, which severely hampers the development and validation of novel prediction models.

Methods:
Here we describe a novel approach for imputing systematically missing data that adopts a generalized linear mixed model to allow for between-study heterogeneity. The approach can be viewed as an extension of Resche-Rigon's method (Stat Med 2012) that relaxes assumptions regarding variance components and allows imputation of both linear (e.g. continuous) and non-linear (e.g. categorical) predictors.
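
To make the idea concrete, the following is a minimal sketch, not the procedure proposed here, of how a continuous predictor that is entirely missing in some studies might be imputed from a random-intercept linear model fitted to the studies where it was observed. The function name, the single covariate z, and the moment-based estimation of the variance components are assumptions made for illustration only. The sketch also ignores uncertainty in the estimated parameters, which a proper multiple imputation procedure would propagate.

```python
import numpy as np


def impute_systematically_missing(x, z, study, rng):
    """Impute a continuous predictor x that is entirely missing in some studies.

    Illustrative model (an assumption, not the authors' specification):
        x_ij = b0 + u_j + b1 * z_ij + e_ij,
        u_j ~ N(0, tau2),  e_ij ~ N(0, sigma2).
    Requires at least two studies in which x is fully observed.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    study = np.asarray(study)
    ids = np.unique(study)
    obs_ids = [s for s in ids if not np.isnan(x[study == s]).any()]
    mis_ids = [s for s in ids if np.isnan(x[study == s]).all()]

    # Pooled within-study slope from the studies that measured x.
    in_obs = np.isin(study, obs_ids)
    xc = np.zeros_like(x)
    zc = np.zeros_like(z)
    for s in obs_ids:
        m = study == s
        xc[m] = x[m] - x[m].mean()
        zc[m] = z[m] - z[m].mean()
    b1 = (zc[in_obs] @ xc[in_obs]) / (zc[in_obs] @ zc[in_obs])

    # Residual (within-study) variance.
    resid = xc[in_obs] - b1 * zc[in_obs]
    sigma2 = (resid @ resid) / (in_obs.sum() - len(obs_ids) - 1)

    # Study-specific intercepts and a moment estimate of their variance.
    intercepts = np.array(
        [x[study == s].mean() - b1 * z[study == s].mean() for s in obs_ids]
    )
    n_j = np.array([(study == s).sum() for s in obs_ids])
    b0 = intercepts.mean()
    tau2 = max(intercepts.var(ddof=1) - sigma2 * np.mean(1.0 / n_j), 0.0)

    # For each study without x: draw a new study effect, then individual values.
    x_imp = x.copy()
    for s in mis_ids:
        m = study == s
        u = rng.normal(0.0, np.sqrt(tau2))
        x_imp[m] = b0 + u + b1 * z[m] + rng.normal(0.0, np.sqrt(sigma2), m.sum())
    return x_imp
```

Calling the function several times with the same random generator yields several completed data sets, each with a different draw of the study effect for the studies that did not measure x. Drawing a new study effect for each imputation, rather than reusing the pooled intercept, is what carries between-study heterogeneity into the imputed values.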

Results:
We illustrate our approach in a case study using IPD from 13 studies to predict the presence of deep venous thrombosis. We compare the results obtained with various imputation methods and make recommendations about their implementation.

Conclusions:
Our approach improves the estimation of predictor effects and between-study heterogeneity, thereby facilitating the development and validation of novel prediction models from an IPD-MA.