Selective cutoff reporting in depression screening accuracy studies: a comparison of meta-analysis of published cutoffs only versus all cutoffs

Authors
Neupane D1, Levis B1, Bhandari PM1, Thombs BD1, Benedetti A1, DEPRESSD Collaboration
1McGill University
Abstract
Background: Selectively reporting accuracy results for only well-performing cutoffs would be expected to bias accuracy estimates in meta-analyses. It is not known whether the extent of this bias depends on the availability of a well-defined standard cutoff.

Objectives: We compared (1) bias in accuracy estimates and (2) cutoff reporting patterns in studies on the diagnostic accuracy of the Patient Health Questionnaire-9 (PHQ-9; well-defined standard cutoff of ≥ 10) and the Edinburgh Postnatal Depression Scale (EPDS; no standard cutoff, commonly used cutoffs ranging from ≥ 10 to ≥ 13).

Methods: We analyzed a subset of datasets from two separate individual participant data meta-analyses (IPDMAs) of PHQ-9 and EPDS accuracy for detecting major depression. Separately for the PHQ-9 and the EPDS, we used bivariate random-effects meta-analysis to compare accuracy estimates based on published cutoffs only versus all cutoffs from all studies. To assess cutoff reporting patterns, we compared the number of published cutoffs below and above the standard cutoff (or common range) when the study-specific optimal cutoff was lower or higher than the standard cutoff (or common range).
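
As an illustration of the per-study quantities involved, the sketch below computes sensitivity and specificity at each cutoff from participant-level data and then picks a study-specific optimal cutoff. It is a minimal sketch under stated assumptions, not the authors' analysis code: the function names are hypothetical, and maximizing Youden's J (sensitivity + specificity - 1) is assumed here as the definition of "optimal", which the abstract does not specify. The bivariate random-effects pooling step is not shown.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy_by_cutoff(
    scores: Sequence[int],
    has_depression: Sequence[bool],
    cutoffs: Sequence[int],
) -> Dict[int, Tuple[float, float]]:
    """Return {cutoff: (sensitivity, specificity)} for one study.

    A participant screens positive at a cutoff when score >= cutoff.
    """
    n_cases = sum(1 for d in has_depression if d)
    n_noncases = len(has_depression) - n_cases
    if n_cases == 0 or n_noncases == 0:
        raise ValueError("need at least one case and one non-case")

    results: Dict[int, Tuple[float, float]] = {}
    for cutoff in cutoffs:
        # True positives: cases scoring at or above the cutoff.
        tp = sum(1 for s, d in zip(scores, has_depression) if d and s >= cutoff)
        # True negatives: non-cases scoring below the cutoff.
        tn = sum(1 for s, d in zip(scores, has_depression) if not d and s < cutoff)
        results[cutoff] = (tp / n_cases, tn / n_noncases)
    return results


def optimal_cutoff(results: Dict[int, Tuple[float, float]]) -> int:
    """Pick the cutoff maximizing Youden's J (assumed criterion).

    Ties resolve to the first cutoff in the dictionary's insertion order.
    """
    return max(results, key=lambda c: results[c][0] + results[c][1] - 1)
```

Computing results for every cutoff in each study, rather than only those the study published, is what distinguishes the all-cutoffs analysis from the published-cutoffs-only analysis.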

Results: Compared with IPDMA using all cutoffs, PHQ-9 sensitivity estimates based on published cutoffs only were underestimated for cutoffs below 10 and overestimated for cutoffs above 10 (median differences: -0.06 and 0.07). EPDS sensitivity estimates were similar for cutoffs below 10 but higher for cutoffs above 13 (median differences: 0.01 and 0.14). PHQ-9 studies with optimal cutoffs below 10 reported more cutoffs below 10, and those with optimal cutoffs above 10 reported more cutoffs above 10 (mean cutoffs: 8.8 and 11.8). EPDS studies with optimal cutoffs below 10 did not report more cutoffs below 10, but those with optimal cutoffs above 10 reported more cutoffs above 10 (mean cutoffs: 9.9 and 11.8).

Conclusions: Selective cutoff reporting and resulting bias in accuracy estimates were more pronounced for the PHQ-9 than for the EPDS. Researchers evaluating diagnostic accuracy of screening tools should report results for all relevant cutoffs.

Patient or healthcare consumer involvement: There was no patient or healthcare consumer involvement in the present study.