Can we trust the conclusions in peer-reviewed publications? A review of quality improvement intervention studies

Authors
Li L, Moja L, Romero A, Grimshaw J
Abstract
Objective: To assess the appropriateness of conclusions reported in recently published quality improvement (QI) intervention studies in relation to the study design.

Design: We hand-searched 11 major medical and health services research journals for randomized and non-randomized evaluations of QI interventions (RCTs and non-RCTs) published between January 2002 and December 2003. Eligible studies evaluated interventions that aimed to change health professional behaviours on the basis of research evidence. Two reviewers extracted data for each study, including study characteristics and all statements addressing the causal effect of the intervention on outcomes. A 38-member international clinical epidemiology panel rated each statement on a Likert scale (range 1-7, with higher scores indicating a stronger causal relationship), assuming that all quotes came from well-designed RCTs. For studies with more than one extracted quote, only the highest score was used, since the strongest causal statement conveyed the most definitive message. Student's t-tests were used to compare ratings between RCTs and non-RCTs.

Results: Of the 4543 titles hand-searched, 73 articles were included (RCTs = 38; non-RCTs = 35) and 207 causality quotes were extracted (abstracts = 68; main text = 139). Seven studies had no abstract quote and five had no main text quote; hence, ratings of 66 abstract quotes (RCTs = 34; non-RCTs = 32) and 68 main text quotes (RCTs = 34; non-RCTs = 34) were analysed. Ratings were received from 34 panellists (response rate = 89.5%). Among the abstract quotes, the mean causality rating was significantly higher in non-RCTs (5.10 ± 1.10) than in RCTs (4.08 ± 1.58; p = 0.04). A similar trend was found in the main text quotes (RCTs = 4.63 ± 1.60; non-RCTs = 5.23 ± 1.27; p = 0.09). In the subgroup analysis, a statistically significant difference was found only in studies reporting no effects or mixed results in their abstracts (RCTs = 3.47 ± 1.31; non-RCTs = 4.54 ± 1.22; p = 0.016).

Conclusions: Our results suggest that non-RCTs evaluating QI interventions may have overstated the strength of causality in both their abstracts and their main text.