Background: The 'Risk of bias' (RoB) tool was developed by Cochrane to evaluate RoB in randomized controlled trials (RCTs). Despite detailed guidance and consensus agreement, concerns have been raised about its reproducibility, and the main causes of disagreement remain uncertain.
Objectives: To assess reproducibility of RoB assessment in RCTs included in more than one Cochrane systematic review (SR); to evaluate whether disagreements were related to different information or different interpretation; to explore the main reasons for disagreement.
Methods: We obtained data from 2796 Cochrane SRs published between March 2011 and September 2014. We identified individual RCTs included in more than one SR and compared their RoB assessments for the domains of random sequence generation (RSG), allocation concealment (AC), blinding of participants and personnel (BPP), blinding of outcome assessment (BOA), and incomplete outcome data. For each domain, we calculated the proportion of agreement. Two reviewers independently analyzed disagreements and classified them as related to different information (e.g. some review authors had access to additional information) or different interpretation (same support for judgement but different interpretation). Finally, we identified the main causes of disagreement.
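The agreement calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the records, the review and trial identifiers, and the function name `proportion_of_agreement` are hypothetical, and agreement is assumed to mean that every SR including a given RCT assigned the same judgement for the domain. The abstract does not state how RCTs included in three or more SRs were handled.

```python
from collections import defaultdict

# Hypothetical records, not study data: (review_id, trial_id, domain, judgement)
assessments = [
    ("SR1", "trialA", "RSG", "low"),
    ("SR2", "trialA", "RSG", "low"),
    ("SR1", "trialA", "AC", "low"),
    ("SR2", "trialA", "AC", "unclear"),
    ("SR3", "trialB", "AC", "high"),
    ("SR4", "trialB", "AC", "high"),
    ("SR5", "trialC", "AC", "low"),  # assessed in one review only, so excluded
]


def proportion_of_agreement(records):
    """Return {domain: (n_agree, n_total, proportion)} over trials assessed
    for that domain in at least two reviews.

    Assumption: a trial counts as 'in agreement' for a domain only when
    all reviews that assessed it gave the same judgement.
    """
    by_trial_domain = defaultdict(dict)
    for review, trial, domain, judgement in records:
        by_trial_domain[(trial, domain)][review] = judgement

    agree = defaultdict(int)
    total = defaultdict(int)
    for (_trial, domain), per_review in by_trial_domain.items():
        if len(per_review) < 2:
            continue  # the trial must appear in more than one review
        total[domain] += 1
        if len(set(per_review.values())) == 1:
            agree[domain] += 1

    return {d: (agree[d], total[d], agree[d] / total[d]) for d in total}


result = proportion_of_agreement(assessments)
```

Grouping by trial and domain before comparing judgements keeps trials that were assessed in only one SR out of the denominator.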
Results: We identified 782 RCTs included in more than one SR with RoB assessment for the same item. The proportion of agreement was 82% for RSG (597/730), 74% for AC (581/781), 70% for BPP (143/205) and 70% for BOA (204/292). For AC, 68% of disagreements (136/200) were between low and unclear RoB. These AC disagreements were related to different interpretations in 72% of cases (145/200) and to different information in 18% (37/200). The main reasons for different interpretation were more or less strict assessment of the same support for judgement (e.g. 'sealed envelopes' considered at low, unclear, or high RoB) in 36% of cases (52/145), and confusion with another RoB domain (e.g. confusion between AC and RSG) in 15% (22/145).
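As a quick arithmetic check, the reported proportions can be recomputed from the counts given above. The variable names are illustrative; the counts are taken directly from the Results.

```python
# Agreement counts reported in the Results: domain -> (agreements, RCTs compared)
counts = {
    "RSG": (597, 730),
    "AC": (581, 781),
    "BPP": (143, 205),
    "BOA": (204, 292),
}

# Proportion of agreement per domain, as a percentage
agreement_pct = {d: 100 * a / n for d, (a, n) in counts.items()}

# Number of AC disagreements, the denominator used for the AC breakdown
ac_total = counts["AC"][1]
ac_disagreements = ac_total - counts["AC"][0]

# Breakdown of AC disagreements, as percentages
low_vs_unclear_pct = 100 * 136 / ac_disagreements
interpretation_pct = 100 * 145 / ac_disagreements
information_pct = 100 * 37 / ac_disagreements

# Reasons within the 145 interpretation-related disagreements
stricter_or_looser_pct = 100 * 52 / 145
domain_confusion_pct = 100 * 22 / 145

for domain, pct in agreement_pct.items():
    print(f"{domain}: {pct:.1f}%")
```

The 200 AC disagreements follow from 781 minus 581. The interpretation and information shares are exactly 72.5% and 18.5%, which the abstract reports as 72% and 18%.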
Conclusions: Our study shows that the reproducibility of RoB assessment remains suboptimal. Some disagreements are related to additional information obtained by some review authors. Other main causes of disagreement include more or less strict assessment of the same support for judgement, and confusion between domains.
Healthcare consumer involvement: Consumers have not been directly involved in the study.