Explainable convolutional neural network model provides an alternative genome-wide association perspective on mutations in SARS-CoV-2
Hatami, Parisa, Annan, Richard, Miranda, Luis Urias, Gorman, Jane, Xie, Mengjun, Qingge, Letu, Qin, Hong
arXiv.org Artificial Intelligence
Identifying mutations of SARS-CoV-2 strains that are associated with phenotypic changes is critical for pandemic prediction and prevention. We compared an explainable convolutional neural network (CNN) approach with the traditional genome-wide association study (GWAS) for identifying mutations associated with the WHO labels of SARS-CoV-2, which serve as a proxy for virulence phenotypes. We trained a CNN model that classifies genomic sequences into Variants of Concern (VOCs) and then applied SHapley Additive exPlanations (SHAP) to identify the mutations that are important for correct predictions. For comparison, we performed a traditional GWAS to identify mutations associated with VOCs. Comparing the two approaches shows that the explainable neural network approach more effectively reveals known nucleotide substitutions associated with VOCs, such as those in the spike gene regions. Our results suggest that explainable neural networks for genomic sequences offer a promising alternative to traditional genome-wide analysis approaches.
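To make the two ingredients named in the abstract more concrete, here is a minimal sketch, not the authors' pipeline. It assumes aligned, equal-length nucleotide strings and a binary VOC label, shows the kind of one-hot encoding a sequence CNN typically takes as input, and computes a simple per-site 2x2 chi-square statistic as a stand-in for a GWAS-style association scan. The function names and the toy sequences are hypothetical, and the CNN training and SHAP attribution steps are not shown.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a nucleotide string as rows of 4 indicators (A, C, G, T).

    Ambiguous bases such as N become all-zero rows (an assumption of this sketch).
    """
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


def site_chi_square(sequences, labels, position):
    """Chi-square statistic for a 2x2 table: label (0/1) vs. non-major allele at one site."""
    # The major allele is the most common base at this aligned position.
    major = Counter(s[position] for s in sequences).most_common(1)[0][0]

    # table[label][is_variant] counts sequences in each cell.
    table = [[0, 0], [0, 0]]
    for s, y in zip(sequences, labels):
        table[y][int(s[position] != major)] += 1

    n = len(sequences)
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip empty margins (invariant sites)
                chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def scan_sites(sequences, labels):
    """Run the per-site test across every aligned position."""
    return [site_chi_square(sequences, labels, p) for p in range(len(sequences[0]))]


if __name__ == "__main__":
    # Toy data invented for illustration: label 1 marks a hypothetical VOC.
    toy_sequences = ["ACGT", "ACGT", "ATGT", "ATGT"]
    toy_labels = [0, 0, 1, 1]
    print(one_hot(toy_sequences[0]))
    print(scan_sites(toy_sequences, toy_labels))
```

In this toy alignment only the second position differs between the two label groups, so it is the only site with a non-zero statistic. A real analysis would also need multiple-testing correction and control for population structure, which this sketch omits.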
Dec-31-2024
- Country:
  - South America > Brazil
    - Amazonas (0.04)
  - North America > United States
    - Tennessee > Hamilton County
      - Chattanooga (0.04)
    - North Carolina > Guilford County
      - Greensboro (0.04)
    - New York > New York County
      - New York City (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Asia
    - Middle East > Jordan (0.04)
    - China > Hubei Province
      - Wuhan (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
- Technology: