Design-related bias and sources of variation in diagnostic accuracy studies

Authors
Rutjes A, Reitsma J, Di Nisio M, Smidt N, van Rijn J, Bossuyt P
Abstract
Background: Previous research has provided empirical evidence that diagnostic studies with methodological shortcomings overestimate the accuracy of a diagnostic test [1]. Solid evidence on the impact of design choices on accuracy is needed to guide researchers, reviewers and decision-makers in the design of studies and the appraisal of study findings. We extended previous research by examining more sources of bias and variation within a larger, more recent set of systematic reviews.

Purpose: To determine the relative importance of various sources of bias and variation in diagnostic accuracy studies.

Study selection: We searched Medline, Embase, DARE and Medion for recently published systematic reviews (1999 to 2002) summarizing the accuracy of diagnostic tests. Reviews were eligible if they: a) included at least 10 original diagnostic accuracy studies; b) did not use design features as inclusion criteria; and c) presented either pooled or individual estimates of sensitivity and specificity for the included studies. Two assessors independently assessed eligibility.

Data extraction: Two reviewers independently assessed the individual studies within these meta-analyses on several design and reporting characteristics, and extracted the raw data for the 2×2 tables.
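For reference, the accuracy measures used below follow directly from the four cells of each extracted 2×2 table. Writing the cells as true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN) (cell labels chosen here for illustration), the standard definitions are:

\[
\text{sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{specificity} = \frac{TN}{TN + FP}, \qquad
\text{DOR} = \frac{TP \cdot TN}{FP \cdot FN}
\]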

Data synthesis: We used a multivariable meta-regression model to investigate the association between estimates of diagnostic accuracy and 13 design characteristics. The main outcome measure was the relative diagnostic odds ratio (RDOR): the ratio of the average diagnostic odds ratio in studies with a certain methodological shortcoming to the average diagnostic odds ratio in studies without that "flaw".
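As a sketch of the log-linear formulation commonly used for such meta-regression (the notation is illustrative and not taken from the study report), the log diagnostic odds ratio of study i is regressed on indicator variables x_ij marking the presence of design characteristic j, and each RDOR is the exponentiated regression coefficient:

\[
\ln(\text{DOR}_i) = \alpha + \sum_{j=1}^{13} \beta_j x_{ij} + \varepsilon_i, \qquad
\text{RDOR}_j = \exp(\beta_j)
\]

An RDOR greater than 1 for characteristic j indicates that studies with that shortcoming yield, on average, higher diagnostic odds ratios (i.e., overestimate accuracy) than studies without it; an RDOR below 1 indicates the opposite.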

Results: The final dataset consisted of 487 individual studies from 31 different meta-analyses. Incomplete reporting was a major problem; the proportion of studies with missing information ranged from 13% for the item "definition of the index test" to 88% for "training of assessors". The model identified four sources of bias significantly associated with overestimation of accuracy: differential verification (RDOR 1.6); non-consecutive inclusion of patients (RDOR 1.4); cut-offs for positive and negative index test results that were not pre-specified (RDOR 1.4); and retrospective data collection (RDOR 1.3). The strongest association with overestimation of accuracy was found for two-sample (case-control) designs that included severe cases and healthy controls (RDOR 1.8), but this association was not statistically significant. One design characteristic was associated with underestimation: recruitment based on the index test was associated with a substantially lower estimate of diagnostic accuracy (RDOR 0.4).

Conclusions: Several design-related characteristics could not be properly evaluated because of incomplete reporting. Our study confirms that shortcomings in design can lead to overoptimistic results in diagnostic accuracy studies, in particular for the severe-case/healthy-control design and differential verification. The current study adds empirical evidence that pre-selection of patients is a major source of variation, threatening external validity. It further shows that non-consecutive sampling, data-driven choices of cut-off and retrospective data collection are sources of bias.

Recommendations: There is a strong need to improve both the methodological quality of diagnostic studies and the quality of their reporting. Researchers, physicians and other health care professionals should be aware that design choices can influence estimates of diagnostic accuracy.

References: 1. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282(11):1061-6.