ReCellTy: Domain-specific knowledge graph retrieval-augmented LLMs workflow for single-cell annotation
Han, Dezheng, Jia, Yibin, Chen, Ruxiao, Han, Wenjie, Guo, Shuaishuai, Wang, Jianbo
arXiv.org Artificial Intelligence
These authors contributed equally to this work.

Abstract

To enable precise and fully automated cell type annotation with large language models (LLMs), we developed a graph-structured feature-marker database to retrieve entities linked to differential genes for cell reconstruction. We further designed a multi-task workflow to optimize the annotation process. Compared to general-purpose LLMs, our method improves human evaluation scores by up to 0.21 and semantic similarity by 6.1% across 11 tissue types, while more closely aligning with the cognitive logic of manual annotation.

Keywords: Cell type annotation, Graph RAG, Large language models, Graph data curation, Multi-task workflow, scRNA-seq

In single-cell RNA sequencing analysis, achieving precise cell type annotation through manual labeling typically requires two key steps: annotators retrieve relevant marker genes and integrate this information with their domain expertise to make informed decisions.
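The retrieval step described above (looking up entities linked to differential genes in a graph-structured feature-marker database) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the dictionary `MARKER_TO_FEATURES`, the feature labels, and the functions `retrieve_features` and `build_context` are hypothetical and do not reflect ReCellTy's actual schema, data, or code. The gene-to-feature links are a toy example, not curated data.

```python
from collections import Counter

# Hypothetical toy graph: marker gene -> feature entities it is linked to.
# Illustrative only; not the paper's database.
MARKER_TO_FEATURES = {
    "CD3E": ["T cell lineage"],
    "CD8A": ["cytotoxic"],
    "CD4": ["helper"],
    "MS4A1": ["B cell lineage"],
    "CD79A": ["B cell lineage"],
}


def retrieve_features(differential_genes):
    """Count how many input genes link to each feature entity."""
    counts = Counter()
    for gene in differential_genes:
        # Genes absent from the toy graph contribute nothing.
        for feature in MARKER_TO_FEATURES.get(gene.upper(), []):
            counts[feature] += 1
    return counts


def build_context(differential_genes):
    """Rank retrieved features by support, ready to pass to an LLM prompt."""
    counts = retrieve_features(differential_genes)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{feature} (supported by {n} gene(s))" for feature, n in ranked]
```

In this sketch the ranked feature list would be handed to a language model as retrieved context, mirroring the manual pattern of looking up markers first and then reasoning over them.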
May-2-2025
- Country:
  - Asia > China (0.05)
  - North America > United States (0.04)
- Genre:
  - Research Report > New Finding (0.47)
- Technology: