The reliability and validity of estimating unclearly reported blinding status in randomized clinical trials

2010 Keystone

Akl E¹, Sun X², Busse J², Johnston B², Briel M³, Mulla S², You J², Bassler D, Lamontagne F, Vera C, Alshurafa M, Katsios C, Mills E³, Guyatt G

¹State University of New York at Buffalo, Buffalo, United States

²Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada

³Basel Institute for Clinical Epidemiology, Basel, Switzerland

Background: The Cochrane Handbook classifies the risk of bias associated with blinding as yes, no, or unclear. The rating would be more useful if the currently unclear ratings could be accurately classified as probably yes or probably no.
Objectives: To test the reliability and validity of classifying blinding, when unclearly reported in randomized trials, as likely or unlikely done.
Methods: Following calibration exercises, two reviewers assessed, independently and in duplicate, the blinding of patients, providers, data collectors, outcome adjudicators, and data analysts using a detailed instruction manual. The response options were: definitely yes, probably yes, probably no, and definitely no. After disagreement resolution, we attempted to contact authors for data verification. For each of the 5 questions, we assessed reliability between the two reviewers by calculating kappa and quadratic-weighted kappa for the 4 response categories and for 2 collapsed categories (probably yes merged with definitely yes, and probably no merged with definitely no). We assessed validity by calculating agreement between the reviewers' consensus and the author-verified data.
Results: Of 233 included reports, the percentage with unclear blinding status varied between 48.5% (patients) and 84.1% (data analysts). We obtained author verification for 46% of reports. Reliability was moderate for blinding of outcome adjudicators (κ=0.52) and data analysts (κ=0.42), and substantial for blinding of patients (κ=0.71), providers (κ=0.68) and data collectors (κ=0.65). Reliability improved when analyzing weighted agreement and collapsed categories (Table). Validity was moderate for blinding of data analysts (κ=0.42), almost perfect for blinding of patients (κ=0.96), data collectors (κ=0.93), and outcome adjudicators (κ=0.85), and perfect for blinding of providers (κ=1).
Conclusions: With the possible exception of blinding of data analysts, use of probably yes and probably no can enhance the assessment of blinding in randomized trials.
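
The reliability statistics named in the Methods can be illustrated with a minimal sketch. It assumes the standard definitions of Cohen's kappa and quadratic-weighted kappa for two raters; the abstract does not state which software or exact formulas the authors used. The function names (`kappa`, `collapse`) and the example ratings are hypothetical and are not study data.

```python
# Illustrative sketch only: standard Cohen's kappa and quadratic-weighted kappa
# for two raters. Names and example ratings are hypothetical, not study data.

# Ordered response options, as listed in the Methods.
CATEGORIES = ["definitely yes", "probably yes", "probably no", "definitely no"]


def kappa(ratings_a, ratings_b, categories, weights="none"):
    """Return Cohen's kappa; use weights="quadratic" for weighted kappa."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions: rows are rater A, columns are rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Quadratic weights penalize distant categories more heavily;
        # unweighted kappa treats every disagreement as equally severe.
        if weights == "quadratic":
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0

    d_obs = sum(disagreement(i, j) * observed[i][j]
                for i in range(k) for j in range(k))
    d_exp = sum(disagreement(i, j) * row[i] * col[j]
                for i in range(k) for j in range(k))
    if d_exp == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - d_obs / d_exp


def collapse(ratings):
    """Merge 'probably' with 'definitely' into two categories: yes / no."""
    return ["yes" if r.endswith("yes") else "no" for r in ratings]


# Hypothetical ratings from two reviewers for four trial reports.
reviewer_1 = ["definitely yes", "probably yes", "probably no", "definitely no"]
reviewer_2 = ["probably yes", "definitely yes", "definitely no", "probably no"]

unweighted = kappa(reviewer_1, reviewer_2, CATEGORIES)
quadratic = kappa(reviewer_1, reviewer_2, CATEGORIES, weights="quadratic")
collapsed = kappa(collapse(reviewer_1), collapse(reviewer_2), ["yes", "no"])
```

In this made-up example the two reviewers never choose the identical option but always agree on direction, so the unweighted four-category kappa is low while the weighted and collapsed values are higher. This mirrors the pattern reported in the Results, where reliability improved with weighted agreement and collapsed categories.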
