Abstract
Background: Current methods for evaluating search strategies and automated citation screening in systematic literature reviews focus mainly on the binary relevance of publications, without considering the varying influence of individual studies on the review's outcomes. This practice overlooks the fact that not all included studies contribute equally to a systematic review's conclusions, potentially distorting assessments of search strategies' effectiveness (Figure 1).
Objectives: This study aims to introduce a new evaluation framework for search strategies in systematic reviews that accounts for the differential impact of studies on the review's overall outcome (Figure 2). By recognising the varying influence of included studies, the framework seeks to provide a more nuanced assessment of search strategies, enhancing the precision and relevance of retrieved studies.
Methods: We applied the proposed evaluation framework to a dataset from the CLEF 2019 Technology Assisted Reviews shared task [1], covering systematic reviews of interventions. Our framework uses review meta-analysis data and outcome effect estimates, derived from rankings of systematic review citations, to evaluate the impact of including or excluding studies of differing influence. We assessed 74 automation models using this framework and compared the results to those obtained through traditional evaluation metrics.
Results: The analysis revealed that incorporating the differential impact of studies into the evaluation framework led to a distinct assessment of search strategies, diverging from conclusions drawn through conventional metrics (Table 1). Specifically, the framework highlighted the importance of retrieving high-impact studies over merely maximising the recall of relevant publications, thereby offering insights into the effectiveness of search strategies from a more outcome-oriented perspective.
Conclusions: Our proposed evaluation framework represents a significant advancement in the methodology for assessing search strategies and automation methods in systematic reviews. By accounting for the varying influence of studies on the review's outcome, it enables a more accurate and meaningful evaluation of search strategies' effectiveness. This approach not only benefits researchers in identifying the most influential studies but also contributes to more robust evidence production, ultimately enhancing patient care and informing clinical practice through more precise and relevant systematic reviews.
[1] https://github.com/CLEF-TAR/tar