scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling
Dong, Shengze; Cui, Zhuorui; Liu, Ding; Lei, Jinzhi
arXiv.org Artificial Intelligence
Motivation: Single-cell RNA sequencing (scRNA-seq) is a widely used technology in biological research that measures gene expression at the level of individual cells within a tissue sample. Although numerous tools have been developed for scRNA-seq data analysis, it remains challenging to capture the distinctive features of such data and to generate virtual datasets with analogous statistical properties.
Results: We introduce a generative approach called the scRNA-seq Diffusion Transformer (scRDiT), which generates virtual scRNA-seq data from a real dataset. The method is a neural network built on Denoising Diffusion Probabilistic Models (DDPMs) and Diffusion Transformers (DiTs). Gaussian noise is added to the real data through iterative noising steps, and the network learns to reverse this process, denoising pure noise back into scRNA-seq samples. This scheme lets the model learn data features from actual scRNA-seq samples during training. Experiments on two distinct scRNA-seq datasets demonstrate superior performance. In addition, sampling is accelerated by incorporating Denoising Diffusion Implicit Models (DDIM). scRDiT offers a unified methodology that lets users train neural network models on their own scRNA-seq datasets and generate large numbers of high-quality scRNA-seq samples.
Availability and implementation: https://github.com/DongShengze/scRDiT
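The noising and accelerated-sampling ideas described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear beta schedule, the 1000-step horizon, the function names, and the toy expression matrix are all assumptions made for illustration, and the true noise stands in for the trained DiT noise predictor so the example stays self-contained.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Assumed schedule (common DDPM default); the paper's schedule may differ.
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, alpha_bar, rng):
    # DDPM forward process in closed form:
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def ddim_step(xt, eps_pred, t, t_prev, alpha_bar):
    # Deterministic DDIM update (eta = 0) from step t to an earlier step t_prev.
    # t_prev = -1 denotes the clean sample, where alpha_bar is taken to be 1.
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t_prev] if t_prev >= 0 else 1.0
    x0_pred = (xt - np.sqrt(1.0 - a_t) * eps_pred) / np.sqrt(a_t)
    return np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps_pred

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.random((4, 2000))  # hypothetical normalized expression: 4 cells x 2000 genes
xt, eps = forward_noise(x0, 999, alpha_bar, rng)

# DDIM visits only a short subsequence of timesteps instead of all 1000.
timesteps = [999, 799, 599, 399, 199, -1]
x = xt
for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
    # A trained network would predict eps from (x, t); the true eps is used here.
    x = ddim_step(x, eps, t, t_prev, alpha_bar)
```

In practice the noise prediction would come from the trained network at every step, so the reconstruction would be approximate; the sketch only shows why skipping timesteps reduces the number of network evaluations during sampling.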
Apr-9-2024