Identifying randomized controlled trials in conference proceedings abstracts

Article type
Authors
Hersh W, Price S
Abstract
Introduction/Objective: While randomized controlled trials (RCTs) in the MEDLINE database are readily identifiable via the publication type field, trials not in MEDLINE, particularly those reported in conference proceedings, are more difficult to identify. Such trials are nonetheless important to find, since they are candidates for inclusion in systematic reviews. The objective of this study was to develop and assess search strategies for identifying RCTs in conference proceedings abstracts.

Methods: We used a database of conference proceedings abstracts that had been coded for RCT status and thus could serve as a gold standard: the subset of the AIDSLINE database containing citations from the International Conference on AIDS. All citations from this subset for the years 1991-1996 were obtained from the US National Library of Medicine. All fields other than the title and abstract were stripped from each record, and the resulting database was loaded into a search engine. Recall (sensitivity) and precision (positive predictive value) were measured for each strategy. A variety of strategies to identify RCTs were used, based on those known to be effective in past investigations (e.g., Haynes et al., J Am Med Inform Assoc, 1: 447-458, 1994). We also aimed to find strategies that achieved 100% recall, since this is often the goal of searches performed for systematic reviews.
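For clarity, the minimal Python sketch below (not the authors' code; the function name and citation identifiers are hypothetical) illustrates how recall and precision would be computed for a single strategy against the gold-standard RCT codes.

# Illustrative sketch (not from the study): recall (sensitivity) and
# precision (positive predictive value) for one search strategy, given the
# set of citations it retrieves and the gold-standard set coded as RCTs.
def recall_precision(retrieved_ids, gold_rct_ids):
    retrieved, gold = set(retrieved_ids), set(gold_rct_ids)
    true_positives = len(retrieved & gold)
    recall = true_positives / len(gold) if gold else 0.0
    precision = true_positives / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical citation identifiers, for illustration only:
r, p = recall_precision(["c1", "c2", "c3", "c4"], ["c1", "c2", "c5"])
print(f"recall={r:.3f} precision={p:.3f}")  # recall=0.667 precision=0.500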

Results: The subset contained a total of 21,575 citations, of which 345 were coded as RCTs. On review, only 274 (79.4%) of these appeared to be genuine RCTs. As with most search strategies, there was a trade-off between recall and precision:

random OR randomise OR placebo OR (double AND blind) OR (comparative AND trial) OR efficacy had recall of 1.000 and precision of 0.100

random OR randomise OR placebo OR (double AND blind) OR (comparative AND trial) had recall of 0.996 and precision of 0.152

random OR randomise OR placebo OR (comparative AND trial) OR efficacy had recall of 0.996 and precision of 0.152

(random OR randomise OR placebo) AND ((double AND blind) OR (control AND (study OR trial)) OR efficacy OR crossover OR compare) had recall of 0.898 and precision of 0.266

(random OR randomise) AND (placebo OR (double AND blind) OR (controlled AND trial) OR (controlled AND study) OR efficacy) had recall of 0.785 and precision of 0.342. (NOTE: all words were stemmed with an American-English stemmer, e.g., randomized to random.)

A review of 240 retrieved citations that had not been coded as RCTs showed only one to be an RCT, indicating that indexers were far more likely to assign the RCT code to a non-RCT than to withhold it from a true RCT. We also noted qualitatively that a number of authors described their studies as "controlled" when they were not, and that many abstracts were written so poorly that judging whether a study was an RCT was difficult.
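As an illustration of how such a boolean strategy could be applied to title/abstract text, the sketch below (not the authors' system) approximates the stemming step by treating each query term as a prefix; the example strings are hypothetical.

import re

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

def has_term(toks, term):
    # Prefix match as a crude stand-in for stemming both query and text
    # (so "random" matches "randomized" and "randomised").
    return any(tok.startswith(term) for tok in toks)

def matches_highest_recall_strategy(text):
    # random OR randomise OR placebo OR (double AND blind)
    #   OR (comparative AND trial) OR efficacy
    toks = tokens(text)
    return (
        has_term(toks, "random")          # also covers "randomise"
        or has_term(toks, "placebo")
        or (has_term(toks, "double") and has_term(toks, "blind"))
        or (has_term(toks, "comparative") and has_term(toks, "trial"))
        or has_term(toks, "efficacy")
    )

# Hypothetical title/abstract strings:
print(matches_highest_recall_strategy(
    "A randomized, double-blind, placebo-controlled trial of zidovudine"))  # True
print(matches_highest_recall_strategy(
    "Epidemiology of HIV transmission in a prospective cohort"))            # False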

Discussion: Effective strategies for retrieving RCTs from one conference proceedings database can be developed to yield very high recall, though at the price of low precision. Additional work is required by the searcher to discern whether the retrieved studies are truly RCTs. Nonetheless, as more conference proceedings are published in electronic form, they may become a useful source of data for systematic reviews. Additional research should be performed with additional strategies and databases.