scHyena: Foundation Model for Full-Length Single-Cell RNA-Seq Analysis in Brain

Oh, Gyutaek; Choi, Baekgyu; Jung, Inkyung; Ye, Jong Chul

arXiv.org Artificial Intelligence 

Single-cell RNA sequencing (scRNA-seq) has made significant strides in unraveling the intricate cellular diversity within complex tissues. This is particularly critical in the brain, which presents a greater diversity of cell types than other tissues, for gaining a deeper understanding of brain function within various cellular contexts. However, analyzing scRNA-seq data remains a challenge due to inherent measurement noise stemming from dropout events and the limited utilization of extensive gene expression information. In this work, we introduce scHyena, a foundation model designed to address these challenges and enhance the accuracy of scRNA-seq analysis in the brain. Specifically, inspired by the recent Hyena operator, we design a novel Transformer architecture called single-cell Hyena (scHyena) that is equipped with a linear adaptor layer, positional encoding via gene embedding, and a bidirectional Hyena operator. This enables us to process full-length scRNA-seq data without losing any information from the raw data. In particular, our model learns generalizable features of cells and genes through pre-training scHyena on full-length scRNA-seq data. We demonstrate the superior performance of scHyena compared to other benchmark methods in downstream tasks, including cell type classification and scRNA-seq imputation.

Single-cell RNA sequencing (scRNA-seq) is a powerful technique for profiling gene expression levels at single-cell resolution, enabling the molecular characterization of complex biological systems in both normal and disease states (Saliba et al., 2014; Rood et al., 2022). Through scRNA-seq, several key objectives can be achieved, including cell type annotation (Li et al., 2020; Hao et al., 2021), the discovery of novel cell types (Villani et al., 2017), the identification of marker genes (Jaitin et al., 2014), and the analysis of cellular heterogeneity (Papalexi & Satija, 2018; Kinker et al., 2020).
It is worth noting that the brain exhibits a particularly diverse range of cell types compared to other tissues (Saunders et al., 2018; Hodge et al., 2019). Therefore, conducting scRNA-seq analysis in the brain is especially important to gain a deeper understanding of brain function within various cellular contexts.
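
To make the dropout problem mentioned in the abstract concrete, the following is a minimal sketch of how dropout events inflate the number of zeros in an expression profile. It assumes a deliberately simplified noise model in which each nonzero count is independently zeroed with a fixed probability; this is an illustration only, not the noise model or the imputation procedure used by scHyena. The function names, the example counts, and the dropout rate are all hypothetical.

```python
import random


def simulate_dropout(expression, dropout_rate, rng):
    """Return a copy of `expression` in which each nonzero count is
    replaced by 0 with probability `dropout_rate` (assumed i.i.d. model)."""
    return [
        0 if value != 0 and rng.random() < dropout_rate else value
        for value in expression
    ]


def zero_fraction(expression):
    """Fraction of genes whose recorded count is zero."""
    return sum(1 for value in expression if value == 0) / len(expression)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "true" counts for 10 genes in a single cell.
true_counts = [5, 0, 12, 3, 0, 8, 1, 0, 7, 2]

# Observed profile after simulated dropout.
observed = simulate_dropout(true_counts, dropout_rate=0.3, rng=rng)

print("zero fraction before dropout:", zero_fraction(true_counts))
print("zero fraction after dropout: ", zero_fraction(observed))
```

In this toy setting, an imputation method would be judged by how well it recovers the entries of `true_counts` that were zeroed in `observed`, while leaving genuinely unexpressed genes at zero.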
