Abstract
Background:
Genetic variants affecting splicing play a fundamental role in disease pathogenicity. Predicting whether a genetic variant will affect splicing is difficult; many in silico tools exist, but they require adjustment for accurate splice prediction. Best practice guidelines often do not exist, and different tools can provide conflicting results. New high-throughput next-generation sequencing has increased biological target capture of potential splice sites. Experimental validation is required to characterise any variants in the splice region. The volume of these data, however, is vast; validation is slow, costly and non-viable at scale. Computational tools offer a method to filter results to an actionable number suitable for experimental follow-up. Successful tools accelerate diagnosis and aid prioritisation of variants of unknown significance with high accuracy and reliability.
Objectives:
Determine the effectiveness of eligible Splicing Analysis and Prediction Tools (SAPT) and, where possible, rank them; provide best practice for their use; and account for quality during the appraisal of eligible tools.
Methods:
This study systematically reviewed the literature on SAPT published between 1st January 1980 and 21st October 2019. Statistical measures of specificity, sensitivity and/or accuracy were extracted to provide a hierarchical ranking of tool efficacy and recommendations for best use, helping researchers and clinicians to prioritise experimental follow-up. ‘Synthesis Without Meta-analysis’ (SWiM) and PRISMA-DTA guidance shaped the review framework. Manual Pearl Gathering and PRISMA methods were followed for database searching. The CHARMS checklist provided rigour for the qualitative assessment. Quantitative analysis of eligible papers weighted SAPT in order of preference. Idea Webbing and Triangulation were applied to complete the analysis.
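The three extracted measures follow the standard confusion-matrix definitions, and a ranking by accuracy can be sketched as below. This is a minimal illustration only, not the review's own procedure: the `ConfusionCounts` type, the `rank_by_accuracy` helper, the tool names and the counts are all hypothetical, and the review's actual weighting scheme is not described at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from benchmarking one tool against validated variants (hypothetical type)."""
    tp: int  # splice-altering variants correctly predicted
    fp: int  # neutral variants wrongly predicted as splice-altering
    tn: int  # neutral variants correctly predicted
    fn: int  # splice-altering variants missed


def sensitivity(c: ConfusionCounts) -> float:
    # True positive rate; assumes at least one positive variant in the dataset.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # True negative rate; assumes at least one negative variant in the dataset.
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all variants classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def rank_by_accuracy(results: dict[str, ConfusionCounts]) -> list[tuple[str, float]]:
    # Highest accuracy first. Illustrative ordering only; not the review's weighting.
    scored = [(name, accuracy(c)) for name, c in results.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up counts for two placeholder tools, purely to show the call pattern.
example = {
    "tool_a": ConfusionCounts(tp=90, fp=5, tn=95, fn=10),
    "tool_b": ConfusionCounts(tp=80, fp=20, tn=80, fn=20),
}
ranking = rank_by_accuracy(example)
```

Both a positive and a negative dataset are needed for all three measures to be defined, since sensitivity divides by the number of true splice-altering variants and specificity by the number of neutral ones.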
Results:
Across the subgroups, the core SAPT (MES, HSF, NNS and SSF-like) had high performance (>85% accuracy). Combination tools emerged with superior performance, with four exceeding 95% accuracy: SPiCE, HSF+SSF-like, HSF+SSF-like+MES and SPIDEX. Established SAPT (dbscSNV, PSSM and CADD) alongside SpliceAI reported high performance. Innovative study designs within MMS and IntSplice reported adequate standalone performance (70-85% accuracy).
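The accuracy bands used above (>95%, >85%, 70-85%) can be expressed as a simple lookup. The sketch below is an assumption-laden illustration: the function name `performance_band`, the label for tools below 70%, and the treatment of exact boundary values (85% counted as adequate, 95% as high) are choices made here and are not stated in the review.

```python
def performance_band(accuracy_pct: float) -> str:
    """Map a reported accuracy (in percent) to the bands named in the results.

    Boundary handling is an assumption: exactly 95 falls in "high",
    exactly 85 and exactly 70 fall in "adequate".
    """
    if not 0 <= accuracy_pct <= 100:
        raise ValueError("accuracy must be a percentage between 0 and 100")
    if accuracy_pct > 95:
        return "superior"
    if accuracy_pct > 85:
        return "high"
    if accuracy_pct >= 70:
        return "adequate"
    # The source does not name a band below 70%; this label is hypothetical.
    return "below adequate"


# Placeholder values only; these are not accuracies reported for any tool.
bands = [performance_band(value) for value in (96.0, 90.0, 85.0, 70.0, 60.0)]
```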
Conclusions:
Evidence was robust with minimal bias across the studies. The literature needs to improve its reporting of how thresholds are delineated. Common themes extracted: effective tools performed best on large curated datasets with separation of candidate predictors, determined in a statistical manner without human selection, utilising both positive and negative datasets. Highly targeted, small window