Goto

Collaborating Authors

 Li, Xinyuan


Graph Structure Learning for Tumor Microenvironment with Cell Type Annotation from non-spatial scRNA-seq data

arXiv.org Artificial Intelligence

The exploration of cellular heterogeneity within the tumor microenvironment (TME) via single-cell RNA sequencing (scRNA-seq) is essential for understanding cancer progression and response to therapy. Current scRNA-seq approaches, however, lack spatial context and rely on incomplete datasets of ligand-receptor interactions (LRIs), limiting accurate cell type annotation and cell-cell communication (CCC) inference. This study addresses these challenges using a novel graph neural network (GNN) model that enhances cell type prediction and cell interaction analysis. Our study utilized a dataset consisting of 49,020 cells from 19 patients across three cancer types: Leukemia, Breast Invasive Carcinoma, and Colorectal Cancer. The proposed scGSL model demonstrated robust performance, achieving an average accuracy of 84.83%, precision of 86.23%, recall of 81.51%, and an F1 score of 80.92% across all datasets. These metrics represent a significant enhancement over existing methods, which typically exhibit lower performance metrics. Additionally, by reviewing existing literature on gene interactions within the TME, the scGSL model proves to robustly identify biologically meaningful gene interactions in an unsupervised manner, validated by significant expression differences in key gene pairs across various cancers. The source code and data used in this paper can be found in https://github.com/LiYuechao1998/scGSL.


scBIT: Integrating Single-cell Transcriptomic Data into fMRI-based Prediction for Alzheimer's Disease Diagnosis

arXiv.org Artificial Intelligence

Functional MRI (fMRI) and single-cell transcriptomics are pivotal in Alzheimer's disease (AD) research, each providing unique insights into neural function and molecular mechanisms. However, integrating these complementary modalities remains largely unexplored. Here, we introduce scBIT, a novel method for enhancing AD prediction by combining fMRI with single-nucleus RNA (snRNA). scBIT leverages snRNA as an auxiliary modality, significantly improving fMRI-based prediction models and providing comprehensive interpretability. It employs a sampling strategy to segment snRNA data into cell-type-specific gene networks and utilizes a self-explainable graph neural network to extract critical subgraphs. Additionally, we use demographic and genetic similarities to pair snRNA and fMRI data across individuals, enabling robust cross-modal learning. Extensive experiments validate scBIT's effectiveness in revealing intricate brain region-gene associations and enhancing diagnostic prediction accuracy. By advancing brain imaging transcriptomics to the single-cell level, scBIT sheds new light on biomarker discovery in AD research. Experimental results show that incorporating snRNA data into the scBIT model significantly boosts accuracy, improving binary classification by 3.39% and five-class classification by 26.59%. The codes were implemented in Python and have been released on GitHub (https://github.com/77YQ77/scBIT) and Zenodo (https://zenodo.org/records/11599030) with detailed instructions.


DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

arXiv.org Artificial Intelligence

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama.


DeepSeek-V3 Technical Report

arXiv.org Artificial Intelligence

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks.


Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends

arXiv.org Artificial Intelligence

Natural Language Processing (NLP) aims to analyze the text via techniques in the computer science field. It serves the applications in healthcare, commerce, and education domains. Particularly, NLP has been applied to the education domain to help teaching and learning. In this survey, we review recent advances in NLP with a focus on solving problems related to the education domain. In detail, we begin with introducing the relevant background. Then, we present the taxonomy of NLP in the education domain. Next, we illustrate the task definition, challenges, and corresponding techniques based on the above taxonomy. After that, we showcase some off-the-shelf demonstrations in this domain and conclude with future directions.