Abstract
Background: The rise of extensive bibliographic databases like OpenAlex and Semantic Scholar offers the potential for automated citation searching as a search strategy for systematic reviews, promising increased efficiency, cost savings, and more robust and replicable evidence synthesis. However, there has been limited investigation into how best to integrate such techniques into current systematic review production workflows.
Objectives: To simulate automated citation searching on systematic review topics across different study areas, assess the effectiveness of automated citation searching compared to traditional search strategies, and examine factors that influence performance.
Methods: Automated citation searching was simulated on 27 systematic reviews sampled from the public health, social policy, and environmental management literatures, across both the OpenAlex and Semantic Scholar databases. Performance, as measured by recall (proportion of relevant articles identified), precision (proportion of identified articles that were relevant), and F1–F3 scores (weighted harmonic means of recall and precision), was compared to the performance of the original search strategy employed by each systematic review. The associations between performance and systematic review topic area, number of included articles, number of seed articles, seed article type, study type inclusion criteria, and database choice were analyzed.
Results: Automated citation searching outperformed the reference standard in terms of precision (P<0.05) and F1 score (P<0.05) but was significantly inferior to the reference standard in terms of recall (P<0.05) and F3 score (P<0.05). Topic area influenced the performance of automated citation searching, with performance significantly higher for systematic reviews in environmental management than for those in social policy.
Conclusion: Given its inferior recall and F3 score, automated citation searching is best used as a supplementary search strategy in standard systematic reviews, where recall is more important than precision. However, its superior precision and F1 score suggest that automated citation searching could be helpful in contexts where precision is as important as recall, for example in rapid reviews. Further iteration of this approach is underway to integrate it with other automated evidence-retrieval methods.
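The F1 and F3 scores referred to above are instances of the standard F-beta measure, where beta > 1 weights recall more heavily than precision. As a minimal sketch of how these metrics relate, with illustrative counts that are hypothetical and not drawn from the study:

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Weighted harmonic mean of precision and recall; beta > 1 favours recall."""
    if precision == 0 and recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# Hypothetical counts for one simulated search (not from the study):
relevant_retrieved = 40   # relevant articles the search identified
total_retrieved = 200     # all articles the search identified
total_relevant = 100      # all relevant articles per the reference standard

precision = relevant_retrieved / total_retrieved   # 0.20
recall = relevant_retrieved / total_relevant       # 0.40

f1 = f_beta(precision, recall, beta=1)  # balances precision and recall
f3 = f_beta(precision, recall, beta=3)  # weights recall more heavily
```

With these counts, F3 exceeds F1 because F3 rewards the higher recall, which is why a search strategy can outperform on F1 yet underperform on F3, as reported in the Results.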