Article type
Year
Abstract
Background: Many prognostic studies use data-driven models: each independent factor is tested in a bi-variable analysis, and only those that show evidence of association (e.g. P value ≤ 0.05) are entered into an adjusted model. Others report data only for those independent factors in their adjusted model that show a significant association with the dependent variable. Pooling only the estimates of association from predictors that appear in adjusted regression models, and for which data are provided, systematically excludes non-significant results and risks overestimating the associations.
Objectives: To investigate whether imputing missing data for non-significant predictors in the final regression model avoids overestimation of the predictive power of risk factors, using predictors of persistent pain after breast cancer surgery as an example.
Methods: We pooled nine predictors to explore their association with the development of persistent pain following breast cancer surgery using random-effects models. We imputed an odds ratio (OR) of '1' for predictors that were excluded from adjusted analyses due to non-significant bi-variable analyses, or that were reported without data due to lack of significance in the final regression model. We acquired the associated variance for all such imputations using the hot deck approach. We performed a sensitivity analysis to examine the impact of imputing data for non-significant predictors excluded from adjusted analyses by re-running our analyses with the imputed data excluded.
Results: Fifty-nine (51.3%) of 115 study-sets for the nine predictors failed to report adjusted data for non-significant predictors. Our sensitivity analyses found no significant differences in results whether or not we incorporated missing data for non-significant predictors (Table 1). However, the associations of the nine predictors with persistent pain were consistently larger in meta-analyses based on the adjusted data only than in the full analyses including imputed missing data.
Conclusions: Imputation of missing data for non-significant predictors did not cause any significant associations to lose significance, but the magnitude of association was reduced.
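The imputation and sensitivity analysis described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the study values are invented for demonstration, the hot deck donor is assumed to be drawn at random from the studies that reported data for the same predictor, and the DerSimonian-Laird estimator is assumed as one common choice of random-effects model (the abstract does not name the estimator used).

```python
import math
import random


def hot_deck_impute(reported, n_missing, rng):
    """Impute (log OR, variance) pairs for studies that reported no data.

    Each imputed study gets log(OR) = 0, i.e. OR = 1, and a variance
    borrowed from a randomly chosen reporting study (the donor).
    Random donor selection is an assumption of this sketch.
    """
    donor_variances = [var for _, var in reported]
    return [(0.0, rng.choice(donor_variances)) for _ in range(n_missing)]


def pool_random_effects(studies):
    """Pool (log OR, variance) pairs with the DerSimonian-Laird method.

    Returns the pooled OR and its 95% confidence interval bounds.
    """
    y = [log_or for log_or, _ in studies]
    w = [1.0 / var for _, var in studies]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q and the between-study variance tau^2 (truncated at 0).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (var + tau2) for _, var in studies]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se))


# Hypothetical data for one predictor: three studies reported an adjusted
# OR (stored as log OR with its variance); three reported nothing because
# the predictor was non-significant.
reported = [(math.log(1.8), 0.04), (math.log(2.2), 0.09), (math.log(1.5), 0.06)]
rng = random.Random(0)  # fixed seed so the donor draw is reproducible
imputed = hot_deck_impute(reported, n_missing=3, rng=rng)

# Sensitivity analysis: pool with and without the imputed studies.
or_adj, lo_adj, hi_adj = pool_random_effects(reported)
or_full, lo_full, hi_full = pool_random_effects(reported + imputed)

print(f"Adjusted data only: OR {or_adj:.2f} ({lo_adj:.2f} to {hi_adj:.2f})")
print(f"With imputed data:  OR {or_full:.2f} ({lo_full:.2f} to {hi_full:.2f})")
```

In this toy example the pooled OR from the full analysis is pulled toward 1 relative to the analysis restricted to reported data, which is the direction of effect the Results describe. The size of the shift depends entirely on the invented inputs and should not be read as an estimate from the study.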