Should diagnostic search filters be used in systematic reviews?

2004 Ottawa

Leeflang M, Scholten R, Reitsma H, Rutjes A, Bossuyt P

Background: Several systematic search strategies can be used to identify diagnostic accuracy studies in an electronic database such as Medline. These strategies often consist of free-text words and MeSH terms aimed at disease indicators, combined with a so-called diagnostic filter, which consists of free-text words and MeSH terms aimed at diagnostic indicators. Indexing of original studies is known to be imperfect, and filters differ in sensitivity (the percentage of relevant studies correctly retrieved) and specificity (the percentage of irrelevant studies correctly not retrieved). Some of these diagnostic search filters have been validated for use in a wide range of diagnostic fields. However, comparative evaluations of these filters for diagnostic accuracy studies have been limited so far.

Methods: We applied eight published and validated diagnostic search filters to 27 systematic reviews of diagnostic test accuracy. These reviews were part of a large overview of 191 systematic reviews published from 1999 to 2002. For each review, we identified the studies that had been included (the reference set) and determined how many of them passed each filter and how many did not.
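The counting step described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual procedure: the function name `missed_fraction` and the study identifiers are hypothetical, each study is assumed to have a unique identifier, and the filter's output is assumed to be already available as a collection of identifiers.

```python
def missed_fraction(reference_set, filter_hits):
    """Return the fraction of included studies that the filter did not retrieve.

    reference_set: identifiers of the studies included in one review.
    filter_hits:   identifiers retrieved when the filter is applied.
    """
    reference = set(reference_set)
    if not reference:
        raise ValueError("reference set is empty")
    # Studies in the reference set that did not pass the filter.
    missed = reference - set(filter_hits)
    return len(missed) / len(reference)


# Hypothetical identifiers, for illustration only (not data from the study).
included_in_review = {"101", "102", "103", "104"}
retrieved_with_filter = {"101", "103", "104", "999"}

print(missed_fraction(included_in_review, retrieved_with_filter))  # 0.25
```

The sensitivity of the filter for that review is one minus this fraction. Hits outside the reference set (such as "999" here) do not affect the result.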

Results: On average, the most sensitive search filters failed to identify 7% of the included studies (median 2.5%, range 0-49%), whereas the specific filters missed 41% (median 40.5%, range 5-90%). When applied to studies published before 1985, the sensitivity of the various filters decreased even further.
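The summary measures reported here (mean, median and range of the percentage of missed studies across reviews) can be computed as in the sketch below. The helper `summarize` and the input values are hypothetical and are not data from the study; they only show how a few reviews with many missed studies can pull the mean well above the median.

```python
from statistics import mean, median


def summarize(missed_percentages):
    """Summarize per-review percentages of missed studies."""
    values = list(missed_percentages)
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical per-review percentages, for illustration only.
example = [0.0, 1.0, 2.0, 4.0, 8.0, 45.0]
summary = summarize(example)
print(summary)
```

With these made-up values the mean is 10.0 and the median is 3.0, because one review with a high percentage of missed studies dominates the average.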

Conclusions: Using methodological filters to identify diagnostic accuracy studies can miss a considerable number of relevant studies. When preparing a systematic review, it may therefore be preferable not to use these filters and instead to apply search strategies that use only search terms for the target condition of the patients involved (disease characteristics) and the index test. We will further study the consequences of this approach for the number of irrelevant hits (false positives).