He, Daqing
Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering
Ji, Yuelyu, Meng, Rui, Li, Zhuochun, He, Daqing
Multi-hop question answering (QA) requires models to retrieve and reason over multiple pieces of evidence. While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation.
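As a rough illustration of the dynamic triggering idea, the sketch below (not the authors' implementation) fires a retrieval call only when the generator's token-level entropy exceeds a threshold; the probability distributions, the threshold value, and the retrieve() stub are all assumptions made for the example.

```python
# Minimal sketch of entropy-triggered retrieval for multi-hop QA.
# Threshold, distributions, and retrieve() are illustrative assumptions.
import math

ENTROPY_THRESHOLD = 1.0  # hypothetical value; a real system would tune this


def token_entropy(prob_dist):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in prob_dist if p > 0.0)


def should_retrieve(prob_dist, threshold=ENTROPY_THRESHOLD):
    """Trigger retrieval only when the model is uncertain about the next token."""
    return token_entropy(prob_dist) > threshold


def retrieve(query):
    """Placeholder retriever; a real system would query a document index."""
    return [f"evidence for: {query}"]


if __name__ == "__main__":
    confident = [0.9, 0.05, 0.03, 0.02]   # low entropy -> keep generating
    uncertain = [0.25, 0.25, 0.25, 0.25]  # high entropy -> fetch more evidence
    for dist in (confident, uncertain):
        if should_retrieve(dist):
            print("retrieve:", retrieve("next reasoning hop"))
        else:
            print("continue generating (entropy = %.2f)" % token_entropy(dist))
```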
ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation
Ji, Yuelyu, Li, Zhuochun, Meng, Rui, He, Daqing
Reranking documents based on their relevance to a given query is a critical task in information retrieval. Traditional reranking methods often lack transparency and rely on proprietary models, hindering reproducibility and interpretability. We propose Reason-to-Rank (R2R), a novel open-source reranking approach that enhances transparency by generating two types of reasoning: direct relevance reasoning, which explains how a document addresses the query, and comparison reasoning, which justifies the relevance of one document over another. We leverage large language models (LLMs) as teacher models to generate these explanations and distill this knowledge into smaller, openly available student models. Our student models are trained to generate meaningful reasoning and rerank documents, achieving competitive performance across multiple datasets, including MSMARCO and BRIGHT. Experiments demonstrate that R2R not only improves reranking accuracy but also provides valuable insights into the decision-making process. By offering a structured and interpretable solution with openly accessible resources, R2R aims to bridge the gap between effectiveness and transparency in information retrieval, fostering reproducibility and further research in the field.
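The sketch below is a hedged illustration, not the released R2R code, of how a teacher model's two reasoning types might be packaged into a student training instance; teacher_reasoning() and the field names are invented for the example.

```python
# Illustrative packaging of teacher explanations into a distillation example.
def teacher_reasoning(query, doc, other_doc=None):
    """Stub standing in for a teacher LLM call."""
    if other_doc is None:
        return f"The passage directly answers '{query}'."  # direct relevance reasoning
    return f"This passage addresses '{query}' more specifically than the alternative."  # comparison reasoning


def build_distillation_example(query, pos_doc, neg_doc):
    """One student training instance: rerank label plus both reasoning types."""
    return {
        "query": query,
        "documents": [pos_doc, neg_doc],
        "direct_reasoning": teacher_reasoning(query, pos_doc),
        "comparison_reasoning": teacher_reasoning(query, pos_doc, other_doc=neg_doc),
        "label": 0,  # index of the more relevant document
    }


example = build_distillation_example(
    "what causes tides",
    "Tides are caused by the gravitational pull of the moon and sun.",
    "A tide table lists local high and low tide times.",
)
print(example["comparison_reasoning"])
```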
Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review
Li, Zhuochun, Ji, Yuelyu, Meng, Rui, He, Daqing
While reasoning capabilities typically emerge in large language models (LLMs) with tens of billions of parameters, recent research focuses on improving smaller open-source models through knowledge distillation (KD) from commercial LLMs. However, many of these studies rely solely on responses from a single LLM as the gold rationale, unlike the natural human learning process, which involves understanding both the correct answers and the reasons behind mistakes. In this paper, we introduce a novel Fault-Aware Distillation via Peer-Review (FAIR) approach: 1) Instead of merely obtaining gold rationales from teachers, our method asks teachers to identify and explain the student's mistakes, providing customized instruction learning data. 2) We design a simulated peer-review process between teacher LLMs, which selects only the generated rationales above the acceptance threshold. This reduces the chance of teachers guessing correctly with flawed rationales, improving instructional data quality. Comprehensive experiments and analysis on mathematical, commonsense, and logical reasoning tasks demonstrate the effectiveness of our method.
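A minimal sketch of the peer-review filtering idea, under the assumption that each rationale is scored by the non-authoring teachers and kept only if its mean score clears an acceptance threshold; the toy reviewers, scores, and threshold are illustrative, not the paper's actual scheme.

```python
# Toy peer-review filter: keep rationales whose mean peer score passes a cut-off.
ACCEPTANCE_THRESHOLD = 0.7  # hypothetical acceptance threshold


def peer_review(rationales, reviewers):
    """Keep rationales whose average score from non-authoring reviewers passes the threshold."""
    accepted = []
    for author, rationale in rationales.items():
        scores = [review(rationale) for name, review in reviewers.items() if name != author]
        if sum(scores) / len(scores) >= ACCEPTANCE_THRESHOLD:
            accepted.append(rationale)
    return accepted


# Toy scoring functions standing in for teacher LLMs reviewing each other.
reviewers = {
    "teacher_a": lambda r: 0.9 if "because" in r else 0.4,
    "teacher_b": lambda r: 0.8 if len(r.split()) > 5 else 0.3,
    "teacher_c": lambda r: 0.75,
}
rationales = {
    "teacher_a": "The answer is 12 because 3 groups of 4 apples give 3 * 4 = 12.",
    "teacher_b": "It is 12.",  # correct answer, flawed rationale -> filtered out
}
print(peer_review(rationales, reviewers))
```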
Traffic Light or Light Traffic? Investigating Phrasal Semantics in Large Language Models
Meng, Rui, Liu, Ye, Tu, Lifu, He, Daqing, Zhou, Yingbo, Yavuz, Semih
Phrases are fundamental linguistic units through which humans convey semantics. This study critically examines the capacity of API-based large language models (LLMs) to comprehend phrase semantics, utilizing three human-annotated datasets. We assess the performance of LLMs in executing phrase semantic reasoning tasks guided by natural language instructions and explore the impact of common prompting techniques, including few-shot demonstrations and Chain-of-Thought reasoning. Our findings reveal that LLMs greatly outperform traditional embedding methods across the datasets; however, they do not show a significant advantage over fine-tuned methods. The effectiveness of advanced prompting strategies shows variability. We conduct detailed error analyses to interpret the limitations faced by LLMs in comprehending phrase semantics. Code and data can be found at https://github.com/memray/llm_phrase_semantics.
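For concreteness, a small sketch of the kind of few-shot prompt such an evaluation might use for a phrase-similarity judgment; the instruction wording and demonstration pairs are invented here, not taken from the released data.

```python
# Hedged sketch of a few-shot prompt for a phrase semantic judgment task.
DEMONSTRATIONS = [
    ("traffic light", "stop light", "similar"),
    ("traffic light", "light traffic", "different"),
]


def build_fewshot_prompt(phrase_a, phrase_b):
    """Compose an instruction, labeled demonstrations, and the test pair."""
    lines = ["Decide whether the two phrases have similar meanings."]
    for a, b, label in DEMONSTRATIONS:
        lines.append(f"Phrases: '{a}' / '{b}' -> {label}")
    lines.append(f"Phrases: '{phrase_a}' / '{phrase_b}' ->")
    return "\n".join(lines)


print(build_fewshot_prompt("heavy rain", "rain heavily"))
```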
RAG-RLRC-LaySum at BioLaySumm: Integrating Retrieval-Augmented Generation and Readability Control for Layman Summarization of Biomedical Texts
Ji, Yuelyu, Li, Zhuochun, Meng, Rui, Sivarajkumar, Sonish, Wang, Yanshan, Yu, Zeshui, Ji, Hui, Han, Yushui, Zeng, Hanyu, He, Daqing
This paper introduces the RAG-RLRC-LaySum framework, designed to make complex biomedical research understandable to laymen through advanced Natural Language Processing (NLP) techniques. Our Retrieval Augmented Generation (RAG) solution, enhanced by a reranking method, utilizes multiple knowledge sources to ensure the precision and pertinence of lay summaries. Additionally, our Reinforcement Learning for Readability Control (RLRC) strategy improves readability, making scientific content comprehensible to non-specialists. Evaluations using the publicly accessible PLOS and eLife datasets show that our methods surpass the plain Gemini model, demonstrating a 20% increase in readability scores, a 15% improvement in ROUGE-2 relevance scores, and a 10% enhancement in factual accuracy. The RAG-RLRC-LaySum framework effectively democratizes scientific knowledge, enhancing public engagement with biomedical discoveries.
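As one plausible reading of the RLRC stage, the sketch below computes a Flesch Reading Ease score and turns it into a reward that penalizes summaries far from a lay-readable target; the syllable heuristic and the target value of 60 are assumptions, not the paper's exact reward.

```python
# Illustrative readability reward based on Flesch Reading Ease.
import re


def count_syllables(word):
    """Crude syllable estimate: count contiguous vowel groups."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Standard Flesch Reading Ease formula with rough sentence/word counts."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)


def readability_reward(summary, target=60.0):
    """Higher reward the closer the summary is to a lay-readable target score."""
    return -abs(flesch_reading_ease(summary) - target)


print(readability_reward("The drug lowers blood pressure. It helps the heart."))
```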
Effects of Different Prompts on the Quality of GPT-4 Responses to Dementia Care Questions
Li, Zhuochun, Xie, Bo, Hilsabeck, Robin, Aguirre, Alyssa, Zou, Ning, Luo, Zhimeng, He, Daqing
Evidence suggests that different prompts lead large language models (LLMs) to generate responses with varying quality. Yet, little is known about prompts' effects on response quality in healthcare domains. In this exploratory study, we address this gap, focusing on a specific healthcare domain: dementia caregiving. We first developed an innovative prompt template with three components: (1) system prompts (SPs) featuring 4 different roles; (2) an initialization prompt; and (3) task prompts (TPs) specifying different levels of detail, totaling 12 prompt combinations. Next, we selected 3 social media posts containing complicated, real-world questions about dementia caregivers' challenges in 3 areas: memory loss and confusion, aggression, and driving. We then entered these posts into GPT-4, with our 12 prompts, to generate 12 responses per post, totaling 36 responses. We compared the word count of the 36 responses to explore potential differences in response length. Two experienced dementia care clinicians on our team assessed the response quality using a rating scale with 5 quality indicators: factual, interpretation, application, synthesis, and comprehensiveness (scoring range: 0-5; higher scores indicate higher quality).
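A small sketch of how the 12 combinations arise from the template: 4 system-prompt roles crossed with 3 task-prompt detail levels, with the single initialization prompt shared by all; the role and level labels here are placeholders, not the study's actual prompt text.

```python
# Enumerate the prompt combinations: 4 roles x 3 detail levels = 12.
from itertools import product

SYSTEM_ROLES = ["clinician", "social worker", "caregiver peer", "generic assistant"]  # 4 roles (labels assumed)
INIT_PROMPT = "You will answer a caregiver's question about dementia."                # shared initialization prompt
TASK_DETAIL_LEVELS = ["brief", "moderate detail", "comprehensive"]                    # 3 levels (labels assumed)

combinations = [
    {"system": role, "init": INIT_PROMPT, "task": level}
    for role, level in product(SYSTEM_ROLES, TASK_DETAIL_LEVELS)
]
print(len(combinations))        # 12 prompt combinations
print(len(combinations) * 3)    # 36 responses across the 3 social media posts
```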
General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation
Meng, Rui, Wang, Tong, Yuan, Xingdi, Zhou, Yingbo, He, Daqing
Training keyphrase generation (KPG) models requires a large amount of annotated data, which can be prohibitively expensive and often limited to specific domains. In this study, we first demonstrate that large distribution shifts among different domains severely hinder the transferability of KPG models. We then propose a three-stage pipeline, which gradually guides KPG models' learning focus from general syntactical features to domain-related semantics, in a data-efficient manner. With Domain-general Phrase pre-training, we pre-train Sequence-to-Sequence models with generic phrase annotations that are widely available on the web, which enables the models to generate phrases in a wide range of domains. The resulting model is then applied in the Transfer Labeling stage to produce domain-specific pseudo keyphrases, which help adapt models to a new domain. Finally, we fine-tune the model with limited data with true labels to fully adapt it to the target domain. Our experimental results show that the proposed process can produce good-quality keyphrases in new domains and achieve consistent improvements after adaptation with limited in-domain annotated data. All code and datasets are available at https://github.com/memray/OpenNMT-kpg-release.
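The following sketch, with a placeholder generator standing in for the phrase-pretrained Sequence-to-Sequence model, illustrates the transfer-labeling stage: unlabeled in-domain documents are annotated with pseudo keyphrases for adaptation before the final fine-tuning on gold labels.

```python
# Hedged sketch of the Transfer Labeling stage with a stand-in generator.
def pretrained_kpg_model(document):
    """Placeholder for the domain-general, phrase-pretrained generator."""
    # Naive stand-in: take the first few non-overlapping bigrams as "phrases".
    tokens = document.split()
    return [" ".join(tokens[i:i + 2]) for i in range(0, len(tokens) - 1, 2)][:3]


def transfer_label(unlabeled_docs):
    """Stage two: create pseudo keyphrase annotations for a new domain."""
    return [{"document": doc, "pseudo_keyphrases": pretrained_kpg_model(doc)}
            for doc in unlabeled_docs]


pseudo_data = transfer_label([
    "graph neural networks for traffic forecasting on road networks",
])
print(pseudo_data[0]["pseudo_keyphrases"])
```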
Integrating Transformer and Paraphrase Rules for Sentence Simplification
Zhao, Sanqiang, Meng, Rui, He, Daqing, Saptono, Andi, Parmanto, Bambang
Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopted ideas from machine translation studies and implicitly learned simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature; and (2) analysis of the rule utilization shows that the model selects more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification.
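As a toy illustration of consulting a paraphrase-rule table during simplification (the rules below are made up, not entries from the Simple PPDB release), complex spans found in the input are mapped to simpler candidates that a model could prefer when decoding.

```python
# Toy paraphrase-rule lookup for sentence simplification.
SIMPLE_PPDB_RULES = {
    "utilize": "use",
    "approximately": "about",
    "in the event that": "if",
}


def candidate_simplifications(sentence, rules=SIMPLE_PPDB_RULES):
    """Return (complex span, simple span) rule matches for one sentence."""
    lowered = sentence.lower()
    return [(src, tgt) for src, tgt in rules.items() if src in lowered]


print(candidate_simplifications("We utilize approximately ten sensors."))
```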