Abstract
Background: Given the difficulty of navigating large volumes of information and the limited time available for searching the literature, clinical practice guidelines (CPGs) are important sources of evidence for healthcare providers. If large bibliographic databases are to be helpful, clinicians must be able to retrieve relevant references without missing key citations or retrieving excessive numbers of irrelevant reports. Search filters designed for MEDLINE may provide a more efficient way to retrieve CPGs, because they can maximise the number of relevant results while minimising the number of irrelevant ones.
Objectives: To develop validated search filters for Ovid MEDLINE using text mining techniques and to measure their performance in terms of sensitivity and precision. We aim to develop two filters for the retrieval of CPGs: a sensitivity-maximising filter and a sensitivity-and-precision-maximising filter.
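As a minimal sketch of the two performance measures named above, the following uses the standard definitions: sensitivity is the share of known relevant records that a filter retrieves, and precision is the share of retrieved records that are relevant. The function names and record identifiers are hypothetical and are not taken from the study.

```python
def sensitivity(retrieved, relevant):
    """Share of known relevant records that the filter retrieved."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("the set of relevant records must not be empty")
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Share of retrieved records that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0  # nothing retrieved, so nothing relevant was retrieved
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical record identifiers, for illustration only.
reference_set = {"r1", "r2", "r3", "r4"}                # known CPG records
filter_hits = {"r1", "r2", "r3", "x1", "x2", "x3"}      # records a filter returned

print(sensitivity(filter_hits, reference_set))  # 0.75 (3 of 4 relevant records found)
print(precision(filter_hits, reference_set))    # 0.5  (3 of 6 retrieved records relevant)
```

A sensitivity-maximising filter pushes the first figure as high as possible, while a sensitivity-and-precision-maximising filter trades a little of it for a higher second figure.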
Methods: We derived two samples of CPGs: a “test set” (n = 100) and a “validation set” (n = 100). Using the test set, we conducted text mining to determine the frequency of terms (single words, bigrams and trigrams) in the titles, abstracts, combined titles and abstracts (ti/ab), and full text. Candidate terms were combined iteratively and tested in Ovid MEDLINE. Development of the search filters focused on precision (without compromising sensitivity), as this helps users reduce screening time and resources. Using the most frequent terms and MeSH, two researchers developed the strategies independently, then compared, refined, and finalised the optimal strategy for each filter type. If the strategies changed as a result of the comparisons, precision and sensitivity were recalculated. Transparent instructions were used to create the strategies in order to standardise the procedures. Finally, we validated the final filters on an external validation set of guideline citations and calculated their sensitivity and precision.
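The term-frequency step described above can be sketched as follows. This is an illustrative outline only: the abstract does not name the text mining software used, and the tokenisation rule, the choice to count the number of records containing each n-gram, and the sample titles are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_frequency(texts, n):
    """Count how many texts contain each n-gram at least once."""
    counts = Counter()
    for text in texts:
        counts.update(set(ngrams(tokenize(text), n)))
    return counts


# Hypothetical titles, for illustration only.
titles = [
    "Clinical practice guideline for the management of asthma",
    "Management of hypertension: a clinical practice guideline",
    "Consensus recommendations for stroke rehabilitation",
]

# Single words, bigrams and trigrams, most frequent first.
for n in (1, 2, 3):
    print(n, document_frequency(titles, n).most_common(3))
```

Terms that appear in many records of the test set are candidates for the filter; each candidate combination would then still need to be run in Ovid MEDLINE to measure its sensitivity and precision.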
Results: To our knowledge, this is the first study to develop validated search strategies using text mining for identifying CPGs in Ovid MEDLINE. Semi-objective search filters were developed to retrieve CPGs: a sensitivity-maximising strategy and a sensitivity-and-precision-maximising strategy. The text mining software enables large numbers of n-grams to be sorted by frequency into matrices, allowing a more objective choice of the single and multiple terms used in testing search algorithms. Different text mining applications and software were used to identify key terms and synonyms for guidelines on different topics. The sensitivity-maximising filter should be used when comprehensiveness is needed and when the filter is appended to search terms for specific conditions or interventions; in that case, it is unlikely to yield an unacceptably large number of citations to screen.
Conclusions: The search filters enable more efficient identification of CPGs in Ovid MEDLINE.
Patient or healthcare consumer involvement: None