Diagnostic test risk of bias and applicability: how detailed and explicit should criteria be?

Authors
Harris J1, Robbins CW1
1The Permanente Federation, United States
Abstract
Background: Studies of diagnostic tests have been prone to errors of design and execution that result in bias and reduced applicability and feasibility. Problem areas include the selection and recruitment of study populations, selection of clinically relevant normal ranges, identification and use of gold standards, sequencing of tests and treatments, and many issues of applicability. These possible pitfalls argue for explicit and comprehensive criteria for test evaluation.

Objectives: 1. To compare the logic and comprehensiveness of criteria sets used to assess the accuracy and applicability of diagnostic tests. 2. To suggest a complete and transparent set of criteria for risk of bias and clinical applicability.

Methods: We compared the criteria in the Cochrane Manual, GRADE publications, QUADAS 2, STARD, reference texts and key articles to identify parallels and gaps in the criteria suggested for assessing the accuracy and applicability of diagnostic tests.

Results: Reference texts, in agreement with clinical logic and quality criteria, suggested about 30 areas that were not explicitly addressed in one or more of QUADAS 2, STARD or Cochrane. GRADE covered many but not all of these areas. Examples include disease diagnostic criteria, recruitment, sample characteristics, setting, test conduct, intercurrent treatment, clinical sequencing, reference standard interpretation, inter-observer variation, precision and reliability, test statistics, adverse effects, spectrum bias, verification bias, work-up bias, imprecision, acceptability and outcome improvement.

Conclusion: Assessment of the comparative accuracy and applicability of diagnostic tests presents a challenge to reviewers. The more complete and explicit the evaluation framework, the greater the likelihood that bias will be identified in reviews, and that applicability will be clearly described. Such assessments increase trust and utilization by clinicians and policy makers. We would like to encourage audience discussion of the suggested criteria set.