Ji, Yuelyu
Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering
Ji, Yuelyu, Meng, Rui, Li, Zhuochun, He, Daqing
Multi-hop question answering (QA) requires models to retrieve and reason over multiple pieces of evidence. While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation.
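A minimal sketch of the kind of entropy-based retrieval trigger the abstract describes, assuming access to raw decoder logits; the threshold value and function name are illustrative, not the paper's exact formulation:

```python
import torch

def should_retrieve(step_logits: torch.Tensor, threshold: float = 2.5) -> bool:
    """Trigger retrieval when recent token-level entropy signals uncertainty.

    step_logits: (num_recent_steps, vocab_size) raw logits from decoding.
    The 2.5-nat threshold is an illustrative hyperparameter.
    """
    probs = torch.softmax(step_logits, dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-9))).sum(dim=-1)
    return bool(entropy.max() > threshold)
```

In this reading, generation proceeds normally while the model is confident and a new retrieval step fires only when a spike in token entropy suggests the current evidence is insufficient.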
Evaluating Bias in Retrieval-Augmented Medical Question-Answering Systems
Ji, Yuelyu, Zhang, Hang, Wang, Yanshan
Medical QA systems powered by Retrieval-Augmented Generation (RAG) models support clinical decision-making but may introduce biases related to race, gender, and social determinants of health. We systematically evaluate biases in RAG-based LLMs by examining demographic-sensitive queries and measuring retrieval discrepancies. Using datasets such as MMLU and MedMCQA, we analyze retrieval overlap and correctness disparities. Our findings reveal substantial demographic disparities within RAG pipelines, emphasizing the critical need for retrieval methods that explicitly account for fairness to ensure equitable clinical decision-making.
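One simple way to operationalize the retrieval-overlap measurement mentioned above is a Jaccard score over the document IDs retrieved for demographically paired variants of the same question; the function below is a sketch under that assumption, not the paper's exact metric:

```python
def retrieval_overlap(docs_a: list[str], docs_b: list[str]) -> float:
    """Jaccard overlap between the document sets retrieved for two
    demographically paired variants of the same clinical question
    (1.0 = identical evidence, 0.0 = disjoint evidence)."""
    a, b = set(docs_a), set(docs_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

# e.g. compare retrieval for the same vignette phrased for two groups:
overlap = retrieval_overlap(["d1", "d2", "d3"], ["d2", "d4", "d5"])  # 0.2
```

Low overlap between paired queries indicates that the retriever, not only the generator, is a source of demographic disparity.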
ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation
Ji, Yuelyu, Li, Zhuochun, Meng, Rui, He, Daqing
Reranking documents based on their relevance to a given query is a critical task in information retrieval. Traditional reranking methods often lack transparency and rely on proprietary models, hindering reproducibility and interpretability. We propose Reason-to-Rank (R2R), a novel open-source reranking approach that enhances transparency by generating two types of reasoning: direct relevance reasoning, which explains how a document addresses the query, and comparison reasoning, which justifies the relevance of one document over another. We leverage large language models (LLMs) as teacher models to generate these explanations and distill this knowledge into smaller, openly available student models. Our student models are trained to generate meaningful reasoning and rerank documents, achieving competitive performance across multiple datasets, including MS MARCO and BRIGHT. Experiments demonstrate that R2R not only improves reranking accuracy but also provides valuable insights into the decision-making process. By offering a structured and interpretable solution with openly accessible resources, R2R aims to bridge the gap between effectiveness and transparency in information retrieval, fostering reproducibility and further research in the field.
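A sketch of how the comparison-reasoning distillation data might be packed into (prompt, target) pairs for the student model; all field names and the prompt template are assumptions for illustration, not the paper's exact schema:

```python
def build_distillation_example(query: str, doc_a: str, doc_b: str,
                               teacher_reasoning: str, preferred: str) -> dict:
    """Pack a teacher LLM's comparison explanation into a supervised
    fine-tuning example for the student reranker."""
    prompt = (
        f"Query: {query}\n"
        f"Document A: {doc_a}\n"
        f"Document B: {doc_b}\n"
        "Explain which document better answers the query, then output A or B."
    )
    target = f"Reasoning: {teacher_reasoning}\nPreferred: {preferred}"
    return {"prompt": prompt, "target": target}
```

Training the student to emit the reasoning before the ranking decision is what makes the final ordering inspectable rather than a bare score.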
Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review
Li, Zhuochun, Ji, Yuelyu, Meng, Rui, He, Daqing
While reasoning capabilities typically emerge in large language models (LLMs) with tens of billions of parameters, recent research focuses on improving smaller open-source models through knowledge distillation (KD) from commercial LLMs. However, many of these studies rely solely on responses from a single LLM as the gold rationale, unlike the natural human learning process, which involves understanding both the correct answers and the reasons behind mistakes. In this paper, we introduce a novel Fault-Aware Distillation via Peer-Review (FAIR) approach: 1) Instead of merely obtaining gold rationales from teachers, our method asks teachers to identify and explain the student's mistakes, providing customized instruction learning data. 2) We design a simulated peer-review process between teacher LLMs, which selects only the generated rationales above the acceptance threshold. This reduces the chance of teachers guessing correctly with flawed rationales, improving the quality of the instruction data. Comprehensive experiments and analysis on mathematical, commonsense, and logical reasoning tasks demonstrate the effectiveness of our method.
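The acceptance-threshold filter in step 2 might look roughly like the sketch below, assuming each reviewing teacher is wrapped as a callable that scores a rationale in [0, 1]; the 0.7 threshold and the mean aggregation are illustrative choices, not the paper's exact setup:

```python
from statistics import mean
from typing import Callable

def peer_review_filter(rationales: list[str],
                       reviewers: list[Callable[[str], float]],
                       threshold: float = 0.7) -> list[str]:
    """Keep only the rationales whose mean peer score from the other
    teacher LLMs clears the acceptance threshold."""
    return [r for r in rationales
            if mean(score(r) for score in reviewers) >= threshold]
```

Filtering on peer agreement is what screens out the "right answer, flawed rationale" cases that a single-teacher pipeline would accept.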
Mitigating the Risk of Health Inequity Exacerbated by Large Language Models
Ji, Yuelyu, Ma, Wenhe, Sivarajkumar, Sonish, Zhang, Hang, Sadhu, Eugene Mathew, Li, Zhuochun, Wu, Xizhi, Visweswaran, Shyam, Wang, Yanshan
Recent advancements in large language models have demonstrated their potential in numerous medical applications, particularly in automating clinical trial matching for translational research and enhancing medical question answering for clinical decision support. However, our study shows that incorporating non-decisive sociodemographic factors such as race, sex, income level, LGBT+ status, homelessness, illiteracy, disability, and unemployment into the input of LLMs can lead to incorrect and harmful outputs for these populations. These discrepancies risk exacerbating existing health disparities if LLMs are widely adopted in healthcare. To address this issue, we introduce EquityGuard, a novel framework designed to detect and mitigate the risk of health inequities in LLM-based medical applications. Our evaluation demonstrates its efficacy in promoting equitable outcomes across diverse populations.
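A minimal sketch of the kind of perturbation probe this line of work implies: insert a non-decisive sociodemographic attribute into an otherwise identical prompt and flag answer flips. The `ask` wrapper, the attribute phrasing, and the exact-match comparison are all assumptions for illustration:

```python
from typing import Callable

def equity_probe(ask: Callable[[str], str], question: str,
                 attributes: list[str]) -> dict[str, bool]:
    """Flag sociodemographic attributes whose mere mention changes the
    model's answer to a clinically identical question."""
    baseline = ask(question)
    return {attr: ask(f"The patient is {attr}. {question}") != baseline
            for attr in attributes}
```

Any attribute mapped to True marks a case where a clinically irrelevant detail altered the output, i.e., a candidate inequity to mitigate.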
Transfer Learning with Clinical Concept Embeddings from Large Language Models
Gao, Yuhe, Bao, Runxue, Ji, Yuelyu, Sun, Yiming, Song, Chenxi, Ferraro, Jeffrey P., Ye, Ye
Knowledge sharing is crucial in healthcare, especially when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, but a major challenge is heterogeneity in clinical concepts across different sites. Large Language Models (LLMs) show significant potential for capturing the semantic meaning of clinical concepts and reducing heterogeneity. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local, shared, and transfer learning models. Results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform generic models in local and direct transfer scenarios, while generic models like OpenAI embeddings require fine-tuning for optimal performance. However, excessive tuning of models with biomedical embeddings may reduce effectiveness, emphasizing the need for balance. This study highlights the importance of domain-specific embeddings and careful model tuning for effective knowledge transfer in healthcare.
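One way semantic embeddings can reduce cross-site concept heterogeneity is nearest-neighbor alignment in embedding space; the sketch below assumes each site's concepts have already been embedded (by Med-BERT, OpenAI embeddings, or similar) and is illustrative rather than the study's actual pipeline:

```python
import numpy as np

def align_concepts(src: dict[str, np.ndarray],
                   tgt: dict[str, np.ndarray]) -> dict[str, str]:
    """Map each source-site clinical concept to its nearest target-site
    concept by cosine similarity of their LLM embeddings."""
    def cos(u: np.ndarray, v: np.ndarray) -> float:
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    return {name: max(tgt, key=lambda t: cos(vec, tgt[t]))
            for name, vec in src.items()}
```

Concepts coded differently at two sites but carrying the same meaning land near each other in embedding space, which is what lets a model trained at one site transfer to another.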
RAG-RLRC-LaySum at BioLaySumm: Integrating Retrieval-Augmented Generation and Readability Control for Layman Summarization of Biomedical Texts
Ji, Yuelyu, Li, Zhuochun, Meng, Rui, Sivarajkumar, Sonish, Wang, Yanshan, Yu, Zeshui, Ji, Hui, Han, Yushui, Zeng, Hanyu, He, Daqing
This paper introduces the RAG-RLRC-LaySum framework, designed to make complex biomedical research understandable to laymen through advanced Natural Language Processing (NLP) techniques. Our Retrieval Augmented Generation (RAG) solution, enhanced by a reranking method, utilizes multiple knowledge sources to ensure the precision and pertinence of lay summaries. Additionally, our Reinforcement Learning for Readability Control (RLRC) strategy improves readability, making scientific content comprehensible to non-specialists. Evaluations using the publicly accessible PLOS and eLife datasets show that our methods surpass the plain Gemini model, demonstrating a 20% increase in readability scores, a 15% improvement in ROUGE-2 relevance scores, and a 10% enhancement in factual accuracy. The RAG-RLRC-LaySum framework effectively democratizes scientific knowledge, enhancing public engagement with biomedical discoveries.
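A readability-control reward of the kind RLRC implies could be shaped around a standard readability index; the sketch below uses the `textstat` package's Flesch Reading Ease score, with the target of 60 ("plain English") and the linear penalty as assumptions rather than the paper's exact reward:

```python
import textstat

def readability_reward(summary: str, target: float = 60.0) -> float:
    """Reward lay summaries whose Flesch Reading Ease approaches a
    lay-friendly target; 0 at the target, increasingly negative away from it."""
    score = textstat.flesch_reading_ease(summary)
    return -abs(score - target) / 100.0
```

Plugged into a policy-gradient loop alongside relevance and factuality terms, a reward like this pushes the generator toward simpler sentences without being tied to any single reference summary.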
Assertion Detection with Large Language Models: In-context Learning and LoRA Fine-tuning
Ji, Yuelyu, Yu, Zeshui, Wang, Yanshan
In this study, we aim to address the task of assertion detection when extracting medical concepts from clinical notes, a key process in clinical natural language processing (NLP). Assertion detection in clinical NLP usually involves identifying assertion types for medical concepts in the clinical text, namely certainty (whether the medical concept is positive, negated, possible, or hypothetical), temporality (whether the medical concept refers to the present or to past history), and experiencer (whether the medical concept is described for the patient or a family member). These assertion types are essential for healthcare professionals to quickly and clearly understand the context of medical conditions from unstructured clinical texts, directly influencing the quality and outcomes of patient care. Although widely used, traditional methods, particularly rule-based NLP systems and machine learning or deep learning models, demand intensive manual effort to create patterns and tend to overlook less common assertion types, leading to an incomplete understanding of the context. To address this challenge, our research introduces a novel methodology that utilizes Large Language Models (LLMs) pre-trained on a vast array of medical data for assertion detection. We enhanced the current method with advanced reasoning techniques, including Tree of Thought (ToT), Chain of Thought (CoT), and Self-Consistency (SC), and refined it further with Low-Rank Adaptation (LoRA) fine-tuning. We first evaluated the model on the i2b2 2010 assertion dataset. Our method achieved a micro-averaged F-1 of 0.89, a 0.11 improvement over previous work. To further assess the generalizability of our approach, we extended our evaluation to a local dataset focused on sleep concept extraction. Our approach achieved an F-1 of 0.74, which is 0.31 higher than the previous method.
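For reference, attaching LoRA adapters to a causal LLM with Hugging Face's `peft` library looks like the sketch below; the base model name and hyperparameters (rank, alpha, target modules) are illustrative placeholders, not the configuration used in the paper:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder checkpoint; substitute the medical LLM being adapted.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=16,                # low-rank dimension (illustrative)
    lora_alpha=32,       # scaling factor (illustrative)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attach adapters to attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)  # only the adapter weights are trainable
model.print_trainable_parameters()
```

Because only the low-rank adapter matrices receive gradients, the approach fine-tunes on clinical assertion data at a fraction of the memory cost of full fine-tuning.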
Improving Emotional Expression and Cohesion in Image-Based Playlist Description and Music Topics: A Continuous Parameterization Approach
Ji, Yuelyu, Song, Yuheng, Wang, Wei, Xu, Ruoyi, Xie, Zhongqian, Liu, Huiyun
Text generation in image-based platforms, particularly for music-related content, requires precise control over text styles and the incorporation of emotional expression. However, existing approaches often struggle to control the proportion of external factors in the generated text and rely on discrete inputs, lacking continuous control conditions for desired text generation. This study proposes Continuous Parameterization for Controlled Text Generation (CPCTG) to overcome these limitations. Our approach leverages a Language Model (LM) as a style learner, integrating Semantic Cohesion (SC) and Emotional Expression Proportion (EEP) considerations. By enhancing the reward method and manipulating the CPCTG level, our experiments on playlist description and music topic generation tasks demonstrate significant improvements in ROUGE scores, indicating enhanced relevance and coherence in the generated text.
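One plausible reading of the continuous control level is a reward that blends the SC and EEP terms under a scalar in [0, 1]; the linear blend below is a sketch of that interpretation, not the paper's exact reward:

```python
def cpctg_reward(sc_score: float, eep_score: float, level: float) -> float:
    """Blend semantic cohesion (SC) and emotional expression proportion (EEP)
    rewards under a continuous control level: 0.0 weights cohesion only,
    1.0 weights emotional expression only."""
    assert 0.0 <= level <= 1.0, "control level must lie in [0, 1]"
    return (1.0 - level) * sc_score + level * eep_score
```

Sweeping the level continuously, rather than switching between discrete style labels, is what allows fine-grained control over how emotional the generated playlist description reads.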
Prediction of COVID-19 Patients' Emergency Room Revisit using Multi-Source Transfer Learning
Ji, Yuelyu, Gao, Yuhe, Bao, Runxue, Li, Qi, Liu, Disheng, Sun, Yiming, Ye, Ye
The coronavirus disease 2019 (COVID-19) has led to a global pandemic of significant severity. In addition to its high level of contagiousness, COVID-19 can have a heterogeneous clinical course, ranging from asymptomatic carriers to severe and potentially life-threatening health complications. Many patients have to revisit the emergency room (ER) within a short time after discharge, which significantly increases the workload for medical staff. Early identification of such patients is crucial for helping physicians focus on treating life-threatening cases. In this study, we obtained Electronic Health Records (EHRs) of 3,210 encounters from 13 affiliated ERs within the University of Pittsburgh Medical Center between March 2020 and January 2021. We leveraged a Natural Language Processing technique, ScispaCy, to extract clinical concepts and used the 1001 most frequent concepts to develop 7-day revisit models for COVID-19 patients in ERs. The research data we collected from 13 ERs may have distributional differences that could affect the model development. To address this issue, we employed a classic deep transfer learning method called the Domain Adversarial Neural Network (DANN) and evaluated different modeling strategies, including the Multi-DANN algorithm, the Single-DANN algorithm, and three baseline methods. Results showed that the Multi-DANN models outperformed the Single-DANN models and baseline models in predicting revisits of COVID-19 patients to the ER within 7 days after discharge. Notably, the Multi-DANN strategy effectively addressed the heterogeneity among multiple source domains and improved the adaptation of source data to the target domain. Moreover, the high performance of Multi-DANN models indicates that EHRs are informative for developing a prediction model to identify COVID-19 patients who are very likely to revisit an ER within 7 days after discharge.
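At the core of DANN is a gradient reversal layer: an identity map on the forward pass whose gradients are negated on the backward pass, so the feature extractor learns representations the domain classifier cannot distinguish. A minimal PyTorch sketch of that standard component follows; it illustrates the general DANN mechanism rather than the study's specific Multi-DANN architecture:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity forward, negated (scaled) backward."""

    @staticmethod
    def forward(ctx, x: torch.Tensor, lambd: float) -> torch.Tensor:
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output: torch.Tensor):
        # Flip the gradient flowing back into the feature extractor.
        return -ctx.lambd * grad_output, None

def grad_reverse(x: torch.Tensor, lambd: float = 1.0) -> torch.Tensor:
    """Apply gradient reversal before the domain-classifier head."""
    return GradReverse.apply(x, lambd)
```

The revisit-prediction head trains on the features as usual, while the domain head, fed through `grad_reverse`, pushes the shared features to be ER-site-invariant, which is how the Multi-DANN strategy absorbs distributional differences across the 13 source ERs.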