The reliability and validity of estimating unclearly reported blinding status in randomized clinical trials

Authors
Akl E [1], Sun X [2], Busse J [2], Johnston B [2], Briel M [3], Mulla S [2], You J [2], Bassler D, Lamontagne F, Vera C, Alshurafa M, Katsios C, Mills E [3], Guyatt G
[1] State University of New York at Buffalo, Buffalo, United States
[2] Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada
[3] Basel Institute for Clinical Epidemiology, Basel, Switzerland
Abstract
Background: The Cochrane handbook classifies the risk of bias associated with blinding as yes, no, or unclear. The rating would be more useful if trials currently rated unclear could be accurately classified as probably yes or probably no.
Objectives: To test the reliability and validity of classifying blinding, when unclearly reported in randomized trials, as likely or unlikely done.
Methods: Following calibration exercises, two reviewers, working independently and in duplicate, assessed the blinding of patients, providers, data collectors, outcome adjudicators, and data analysts using a detailed instruction manual. The response options were: definitely yes, probably yes, probably no, and definitely no. After resolving disagreements, we attempted to contact authors to verify the data. For each of the 5 questions, we assessed reliability by calculating kappa and quadratically weighted kappa between the two reviewers, both for the 4 response categories and for 2 collapsed categories (probably yes merged with definitely yes, and probably no merged with definitely no). We assessed validity by calculating agreement between the reviewers' consensus and the verified data.
Results: Of 233 included reports, the percentage with unclear blinding status ranged from 48.5% (patients) to 84.1% (data analysts). We obtained author verification for 46% of reports. Reliability was moderate for blinding of outcome adjudicators (κ=0.52) and data analysts (κ=0.42), and substantial for blinding of patients (κ=0.71), providers (κ=0.68), and data collectors (κ=0.65). Reliability improved when analyzing weighted agreement and collapsed categories (Table). Validity was moderate for blinding of data analysts (κ=0.42), almost perfect for blinding of patients (κ=0.96), data collectors (κ=0.93), and outcome adjudicators (κ=0.85), and perfect for blinding of providers (κ=1).
Conclusions: With the possible exception of blinding of data analysts, use of probably yes and probably no can enhance the assessment of blinding in randomized trials.
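The agreement statistics named in the Methods (kappa, quadratically weighted kappa, and kappa on the two collapsed categories) can be illustrated with a minimal sketch. The abstract does not state which software was used, so the function names `weighted_kappa` and `collapse` below are hypothetical, and the ratings are made-up illustrative values, not data from the study.

```python
# Illustrative sketch only: hypothetical helper names and made-up ratings.
# Category order matters for the quadratic weights (ordinal scale).
CATEGORIES = ["definitely yes", "probably yes", "probably no", "definitely no"]


def weighted_kappa(ratings_a, ratings_b, categories, weights=None):
    """Cohen's kappa for two raters; weights="quadratic" gives weighted kappa."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weights == "quadratic":
            # Penalty grows with the squared distance between categories.
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0  # unweighted: any mismatch counts fully

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        return float("nan")  # kappa is undefined when no disagreement is expected
    return 1.0 - obs / exp


def collapse(rating):
    """Merge 'probably' and 'definitely' answers into yes / no."""
    return "yes" if rating.endswith("yes") else "no"


# Made-up ratings from two hypothetical reviewers for four trial reports.
reviewer_1 = ["definitely yes", "probably yes", "probably no", "definitely no"]
reviewer_2 = ["probably yes", "probably yes", "probably no", "definitely no"]

kappa_4 = weighted_kappa(reviewer_1, reviewer_2, CATEGORIES)
kappa_4_weighted = weighted_kappa(reviewer_1, reviewer_2, CATEGORIES, "quadratic")
kappa_2 = weighted_kappa([collapse(r) for r in reviewer_1],
                         [collapse(r) for r in reviewer_2],
                         ["yes", "no"])
print(kappa_4, kappa_4_weighted, kappa_2)
```

In this toy example the only disagreement is between two adjacent "yes" categories, so quadratic weighting penalizes it lightly and collapsing removes it entirely, which mirrors why agreement tends to be higher under weighted and collapsed analyses.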