Introduction: Clinical heterogeneity causes problems when trying to draw practical inferences from broadly inclusive systematic overviews of complex interventions. Conventional subgroup analysis may not fully meet the needs of users of such overviews, who frequently want to know how to maximise their chances of replicating a broadly effective intervention; that is, to identify which subgroups have the most supporting evidence (weight of evidence; WoE), rather than to identify specific beneficial components.
Objective: To examine the implications of using the WoE supporting various intervention subgroups to infer the subsequent effectiveness of these subgroups when tested in an independent dataset.
Methods: Data were obtained from 51 RCTs (23,466 patients; 2,157 outcome events) from 3 systematic overviews (stroke unit care, antihypertensive treatment in the elderly, and cardiac rehabilitation). The WoE supporting a subgroup was expressed as the proportion (%) of the total number of adverse events prevented (-[O-E], where O-E is the observed minus expected number of events) within a statistical overview that was attributable to that subgroup. The WoE supporting various subgroups in overviews of early RCTs was then used to predict whether these subgroups would have a "significant" result (an effect exceeding 1.96 SD) in a separate test data set (overviews of RCTs published at a later date). A conventional approach to subgroup analysis, based on the statistical significance (z-score) of the subgroup results, was used for comparison.
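The WoE definition above can be illustrated with a minimal Python sketch. It assumes that each trial contributes a single observed minus expected (O-E) value and that a subgroup is simply a subset of the trials in the overview; the function name `weight_of_evidence` and the numbers are hypothetical and are not taken from the study data.

```python
def weight_of_evidence(trial_o_minus_e, in_subgroup):
    """Return the WoE (%) for a subgroup of trials within one overview.

    trial_o_minus_e: O-E (observed minus expected events) for each trial;
                     negative values mean adverse events were prevented.
    in_subgroup:     one boolean per trial, True if the trial belongs
                     to the subgroup of interest.
    """
    # Total events prevented in the whole overview is -(sum of O-E).
    total_prevented = -sum(trial_o_minus_e)
    if total_prevented <= 0:
        raise ValueError("overview shows no net events prevented")

    # Events prevented by the trials that fall inside the subgroup.
    subgroup_prevented = -sum(
        ome for ome, member in zip(trial_o_minus_e, in_subgroup) if member
    )
    return 100.0 * subgroup_prevented / total_prevented


# Hypothetical overview of four trials (illustrative values only).
o_minus_e = [-5.0, -3.0, -1.5, -0.5]
membership = [True, True, False, False]
woe = weight_of_evidence(o_minus_e, membership)
print(f"WoE = {woe:.0f}%")
```

In this invented example the two subgroup trials account for 8 of the 10 events prevented, so the subgroup would fall in the >75% WoE band.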
Results: A total of 27 clinical subgroups were examined, of which 13 were "significant" in the test data sets; 6 were previously supported by >75% WoE, 9 by >50% and only 1 by <25%. The receiver operating characteristics were similar for both the WoE and conventional approaches.
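As a rough illustration of how these counts relate to a receiver operating characteristic, the sketch below computes the sensitivity of each WoE cut-off. It assumes the counts are cumulative (the 9 subgroups above 50% include the 6 above 75%); specificity cannot be derived because the WoE distribution of the 14 non-significant subgroups is not reported here.

```python
# Counts taken from the Results above.
total_subgroups = 27
significant_in_test = 13

# Significant subgroups previously supported above each WoE cut-off (%).
# Assumption: counts are cumulative across cut-offs.
supported_above = {75: 6, 50: 9}

# Sensitivity = significant subgroups flagged by the cut-off / all significant.
sensitivity = {
    cutoff: n / significant_in_test for cutoff, n in supported_above.items()
}

non_significant = total_subgroups - significant_in_test

for cutoff, value in sorted(sensitivity.items()):
    print(f"WoE > {cutoff}%: sensitivity = {value:.2f}")
print(f"Non-significant subgroups (specificity not derivable): {non_significant}")
```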
Discussion: Although WoE is not a direct measure of effectiveness, it does appear to reflect the probability that a clinical subgroup will be effective.