Observation or evidence? Are non-randomized studies really more generalizable than randomized trials?

Authors
Lee N, McDonagh M, Chan B
Abstract
Background: Clinicians and advocacy groups have expressed concern that results from randomized controlled trials (RCTs) in a systematic review of atypical antipsychotics (AAPs), conducted by the Drug Effectiveness Review Project, have limited applicability or generalizability because of strict eligibility criteria. It was suggested that non-randomized studies (nonRCTs) be included in future updates on the grounds that these studies provide more information on effectiveness and applicability than RCTs. A discussion of the quality and methodological limitations of the nonRCTs included in a subsequent update was presented by McDonagh et al. at the 2006 Cochrane Colloquium. Whether nonRCTs provide more information on applicability than RCTs remains unclear.
Objectives: To assess whether nonRCTs and RCTs differ significantly in applicability characteristics, to characterize similarities and differences in applicability, and to further develop our checklist for applicability.
Methods: Trials and nonRCTs directly comparing AAPs for the treatment of schizophrenia were identified. We classified studies as effectiveness or efficacy designs. Efficacy trials, which include RCTs, try to determine whether an intervention works under controlled conditions, whereas effectiveness studies (which include both RCTs and nonRCTs) try to determine how well an intervention works under ordinary conditions over longer periods of time. A 31-item applicability checklist was created and applied. Descriptive statistics were used to make comparisons between and across study designs. Qualitative analysis of nonparametric data will also be conducted to further examine differences in applicability characteristics across the design types.
Results: Eighty-one publications were identified: 4 effectiveness RCTs, 33 nonRCTs, and 44 efficacy RCTs. Exploratory analysis of 3 effectiveness RCTs, 11 efficacy RCTs, and 11 nonRCTs suggests that effectiveness RCTs reported a greater number of applicability characteristics (mean 23, median 21, SD 4.4) than efficacy RCTs (mean 15.8, median 16, SD 3.4) or nonRCTs (mean 14.4, median 17, SD 5.8). Further statistical analyses will be conducted.
Conclusions: Based on preliminary analyses, efficacy RCTs and nonRCTs did not differ in their applicability. Effectiveness RCTs were superior to both efficacy RCTs and nonRCTs. Final analysis will be completed by October 2008.
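The descriptive comparison described in the Methods (counting how many of the 31 checklist items each study reports, then summarizing by design type) could be sketched roughly as follows. This is only an illustration under stated assumptions: the counts below are hypothetical placeholders, not the study's data; the names `items_reported` and `summarize` are invented for this sketch; and the use of the sample standard deviation is an assumption, since the abstract does not say which SD was reported.

```python
from statistics import mean, median, stdev

# Hypothetical data: number of checklist items (out of 31) reported per study.
# These values are illustrative placeholders, NOT the data from this abstract.
items_reported = {
    "effectiveness RCT": [18, 21, 24],
    "efficacy RCT": [12, 15, 16, 18, 19],
    "nonRCT": [8, 11, 17, 18, 20],
}


def summarize(counts):
    """Return n, mean, median, and sample SD for one design group."""
    return {
        "n": len(counts),
        "mean": round(mean(counts), 1),
        "median": median(counts),
        # Assumption: sample SD (n - 1 denominator); the abstract does not specify.
        "sd": round(stdev(counts), 1),
    }


# One summary row per study design, mirroring the mean/median/SD triplets
# reported in the Results.
summary = {design: summarize(counts) for design, counts in items_reported.items()}

for design, stats in summary.items():
    print(
        f"{design}: n={stats['n']}, mean={stats['mean']}, "
        f"median={stats['median']}, SD={stats['sd']}"
    )
```

With groups this small, such summaries are descriptive only; any formal comparison across designs would need the further statistical analyses the authors mention.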