Designing an Efficient and Precise Search Strategy for Observational Studies

Authors
Wieland S, Brodney S, Dickersin K
Abstract
Objective: To describe the construction, sensitivity, and precision of a maximally sensitive MEDLINE search for observational studies of the relationship between an exposure and a disease. Effective search strategies have been developed for identifying randomized controlled trials for systematic reviews, but precise searches for identifying relevant studies for reviews of observational studies have not been as widely documented.

Methods: Beginning with a gold standard of all relevant observational studies on breast cancer and oral contraceptives, we used an iterative process to construct a maximally sensitive MEDLINE search strategy. The gold standard comprised 58 observational studies published between 1966 and 1995 in MEDLINE-indexed journals that were cited in a 1996 systematic review of oral contraceptives and breast cancer. We examined the MEDLINE record of each study for title and abstract words and Medical Subject Headings (MeSH) terms relating to exposure, outcome, and methodology. We designed a search using the minimum number of text words needed to retrieve all records, and ran this text word search of MEDLINE in PubMed with and without "automatic term mapping" to MeSH. We then designed a search using the minimum number of MeSH terms needed to retrieve all records, and ran this MeSH search of MEDLINE in PubMed with and without the automatic inclusion of more specific terms. The main outcomes were sensitivity (the proportion of the observational studies in the gold standard identified by the search) and precision (the proportion of publications retrieved by the MEDLINE search that were also included in the gold standard).
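The two outcome measures defined above can be expressed as set operations on record identifiers. The following is a minimal sketch, not the authors' analysis code; the function names and the toy identifiers are assumptions made for illustration only.

```python
def sensitivity(retrieved, gold_standard):
    """Proportion of gold-standard records that the search retrieved."""
    gold = set(gold_standard)
    if not gold:
        raise ValueError("gold standard must not be empty")
    return len(gold & set(retrieved)) / len(gold)


def precision(retrieved, gold_standard):
    """Proportion of retrieved records that belong to the gold standard."""
    hits = set(retrieved)
    if not hits:
        raise ValueError("search retrieved no records")
    return len(hits & set(gold_standard)) / len(hits)


# Toy identifiers (hypothetical, not real MEDLINE records).
gold = {"r1", "r2", "r3", "r4"}
found = {"r1", "r2", "r3", "x1", "x2", "x3", "x4", "x5", "x6", "x7"}

print(f"sensitivity = {sensitivity(found, gold):.1%}")  # 3 of 4 gold records found
print(f"precision   = {precision(found, gold):.1%}")    # 3 of 10 retrieved are gold
```

Both measures share the same numerator (gold-standard records retrieved) and differ only in the denominator, which is why a broader search can hold sensitivity constant while precision falls.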

Results: All 58 records were indexed in MEDLINE under the MeSH term human [mh] and the publication type journal article [pt]. One of the 58 records was not indexed under the MeSH term female [mh]. We therefore limited all subsequent MEDLINE searches to human [mh] and journal article [pt]. Ten of the 58 records (17.2%) did not contain text words or MeSH terms related to oral contraceptives, estrogens, or hormone use. A text word search without "automatic term mapping" retrieved 2,197 records (precision=2.2%, [48/2197]), which included 48 of the 58 gold standard records (sensitivity=82.8%, [48/58]). A text word search with "automatic term mapping" retrieved 2,920 records (precision=1.6%, [48/2920]), which included the same 48 records (sensitivity=82.8%, [48/58]). A MeSH search that allowed explosion of MeSH terms retrieved 795 records (precision=6.0%, [48/795]), again identifying the same 48 records (sensitivity=82.8%, [48/58]). Finally, a search using major MeSH terms without the automatic inclusion of more specific terms retrieved 364 records (precision=7.7%, [28/364]), but included only 28 of the 58 records (sensitivity=48.3%, [28/58]).
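The four search variants differ mainly in how terms are tagged. The sketch below shows one way such variants could be written in PubMed syntax, assuming that an untagged term is subject to automatic term mapping, that a [tw] tag bypasses it, that [mh] includes more specific terms by default, and that [majr:noexp] restricts to major topics without them. The exposure and outcome terms are illustrative placeholders, not the authors' actual strategy, and `build_query` is a hypothetical helper.

```python
def build_query(exposure, outcome, tag=""):
    """Combine one exposure and one outcome term with the limits used in the Results."""
    limits = "human[mh] AND journal article[pt]"
    if tag:
        # A field tag restricts the term to that field.
        exposure = f"{exposure}[{tag}]"
        outcome = f"{outcome}[{tag}]"
    return f"({exposure}) AND ({outcome}) AND {limits}"


# Placeholder terms chosen for illustration only.
variants = {
    "text words, term mapping on": build_query("oral contraceptives", "breast cancer"),
    "text words, term mapping off": build_query("oral contraceptives", "breast cancer", "tw"),
    "MeSH, more specific terms included": build_query("contraceptives, oral", "breast neoplasms", "mh"),
    "major MeSH, no specific terms": build_query("contraceptives, oral", "breast neoplasms", "majr:noexp"),
}

for label, query in variants.items():
    print(f"{label}:\n  {query}")
```

Keeping the limits in one place ensures that every variant is restricted identically, so differences in retrieval reflect only the tagging of the exposure and outcome terms.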

Conclusions: Among the searches that reached the highest sensitivity observed, the MeSH search with automatic inclusion of more specific MeSH terms was the most precise. Further refinement of the MeSH search is warranted, followed by testing in other exposure and outcome areas, which may allow development of generalizable search strategies for observational studies.