A framework for individual participant data meta-analysis in the presence of missing data

Debray T1, Koffijberg H1, Jolani S2, Van Buuren S2, Moons KGM1
1Julius Center for Health Sciences and Primary Care, The Netherlands
2Department of Methodology and Statistics, Utrecht University, The Netherlands
Background: Individual participant data meta-analyses (IPD-MA) are an increasingly popular approach for developing multivariable risk prediction models. Recently, a framework was proposed to develop, implement and validate such models when baseline risks are heterogeneous across studies. Because this framework requires complete data to identify a homogeneous set of predictor effects, its implementation may be problematic when some predictor variables are systematically missing in one or more studies.

Objectives: To describe a strategy for developing a prediction model from an IPD-MA when some predictors are (systematically) missing in one or more studies.

Methods: The proposed strategy imputes missing data using a multilevel imputation model, and subsequently searches for a homogeneous set of predictors to ensure model generalisability. We compare the strategy to exclusion of studies that are affected by systematic missingness for important predictors.
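
To make the notion of systematic missingness concrete, the following minimal Python sketch flags, for each study, the predictors that were never observed in that study, and lists the studies that the comparator (exclusion) approach would retain. It does not implement the multilevel imputation model. The function names, the predictor names and the toy records are hypothetical and serve only as an illustration.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def systematically_missing(records, study_key, predictors):
    """Map each study to the predictors missing for ALL of its participants."""
    observed = {}  # study -> set of predictors observed at least once
    for row in records:
        seen = observed.setdefault(row[study_key], set())
        for name in predictors:
            if not is_missing(row.get(name)):
                seen.add(name)
    return {
        study: [p for p in predictors if p not in seen]
        for study, seen in observed.items()
    }


# Hypothetical toy IPD set: "ddimer" was never measured in study C.
ipd = [
    {"study": "A", "age": 61.0, "ddimer": 0.4},
    {"study": "A", "age": None, "ddimer": 0.9},  # sporadically missing age
    {"study": "B", "age": 55.0, "ddimer": 1.2},
    {"study": "C", "age": 70.0, "ddimer": None},
    {"study": "C", "age": 48.0, "ddimer": float("nan")},
]

flags = systematically_missing(ipd, "study", ["age", "ddimer"])

# Studies kept under the exclusion approach: no systematically missing predictor.
retained_studies = [study for study, missing in flags.items() if not missing]
```

In this toy example, the sporadically missing age value in study A does not count as systematic missingness, so only study C would be dropped under the exclusion approach, whereas the imputation strategy would keep all three studies.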

Results: Results from a real-life example indicate that imputation leads to improved model performance and smaller standard errors for the estimated predictor effects. Excluding studies tends to identify different sets of predictors and to decrease model performance.

Conclusions: Our study demonstrates that an IPD-MA with systematically missing predictors does not need to discard studies or predictors. However, the resulting model is not guaranteed to generalise to studies in which some of the model predictors are systematically missing.