Abstract
Background
Machine learning (ML)-related publications have surged in response to unmet clinical needs, and regulatory approval of digital therapeutics has validated the promising role of ML in the clinical care pathway. However, concerns have been raised about the scientific rigor, suitability of sampling, racial/ethnic bias, transparency, interpretability, and reproducibility of these ML models before they can be formally incorporated into clinical practice guidelines. There is an unmet need to improve clinicians' literacy in understanding ML models to ensure the appropriate incorporation of these models into evidence-based practice.
Objectives
To 1) conduct a literature review to identify current reporting guidelines and existing guidance for the critical appraisal of ML papers, and 2) develop an easy-to-understand, robust critical appraisal tool for ML papers.
Methods
We surveyed the literature for current reporting guidelines for ML papers and identified common ML techniques and practices for clinically appropriate and effective ML development, as well as risks of bias and design flaws, to determine the features critical to appraising ML papers. A pilot critical appraisal checklist was developed to assess the internal validity, risk of bias, clinical utility, and external validity of proposed ML models. We reviewed the checklist with ML experts and incorporated their comments and feedback.
Results
Through a scoping literature review and Delphi discussions with experts, we identified several initial components, each with a set of quality-appraisal questions. A primer with a detailed explanation of each quality-assessment question, along with its interpretation and related examples, was also developed to provide detailed guidance for the critical appraisal of ML papers. Beta testing with users in a real-life scenario provided supportive feedback on the usefulness and ease of use of the newly developed critical appraisal tool for ML papers.
Conclusion
We developed a critical appraisal (CA) tool for ML papers that received favorable feedback from end users. The tool will provide clinicians with the much-needed knowledge and guidance to assess the quality of ML papers with confidence, bridging the gap between clinicians, healthcare scientists, and ML engineers and helping to address potential flaws in published ML papers.