AITopics | Zhang, Yuwei

Zhang, Yuwei

New Intent Discovery with Pre-training and Contrastive Learning

Zhang, Yuwei, Zhang, Haode, Zhan, Li-Ming, Lam, Albert Y. S., Wu, Xiao-Ming

arXiv.org Artificial Intelligence, Apr-6-2025

New intent discovery aims to uncover novel intent categories from user utterances to expand the set of supported intent classes. It is a critical task for the development and service expansion of a practical dialogue system. Despite its importance, this problem remains under-explored in the literature. Existing approaches typically rely on a large amount of labeled utterances and employ pseudo-labeling methods for representation learning and clustering, which are label-intensive, inefficient, and inaccurate. In this paper, we provide new solutions to two important research questions for new intent discovery: (1) how to learn semantic utterance representations and (2) how to better cluster utterances. Particularly, we first propose a multi-task pre-training strategy to leverage rich unlabeled data along with external labeled data for representation learning. Then, we design a new contrastive loss to exploit self-supervisory signals in unlabeled data for clustering. Extensive experiments on three intent recognition benchmarks demonstrate the high effectiveness of our proposed method, which outperforms state-of-the-art methods by a large margin in both unsupervised and semi-supervised scenarios. The source code will be available at https://github.com/zhang-yu-wei/MTP-CLNN.
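The abstract above does not spell out its contrastive loss, so the following is only a minimal sketch of a generic InfoNCE-style contrastive objective, not the authors' formulation. The function name `info_nce_loss`, the `temperature` value, and the in-batch-negatives setup are all assumptions made for illustration.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's loss).

    Row i of `positives` is treated as the positive for row i of `anchors`;
    every other row in the batch serves as a negative.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    # Subtract the row max for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct "class" for row i is column i.
    return float(-np.mean(np.diag(log_probs)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))
    print(info_nce_loss(x, x + 0.01 * rng.normal(size=x.shape)))
```

Embeddings whose positives are near-duplicates of the anchors yield a loss close to zero, while mismatched pairs are penalized heavily.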

artificial intelligence, computational linguistic, machine learning, (15 more...)

2205.12914

Country:

Asia (0.68)
North America > United States > Minnesota (0.28)
North America > United States > California (0.28)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)

Attention Reveals More Than Tokens: Training-Free Long-Context Reasoning with Attention-guided Retrieval

Zhang, Yuwei, Srinivasa, Jayanth, Liu, Gaowen, Shang, Jingbo

arXiv.org Artificial Intelligence, Mar-12-2025

Large Language Models (LLMs) often exhibit substantially shorter effective context lengths than their claimed capacities, especially when handling complex reasoning tasks that require integrating information from multiple parts of a long context and performing multi-step reasoning. Although Chain-of-Thought (CoT) prompting has shown promise in reducing task complexity, our empirical analysis reveals that it does not fully resolve this limitation. Through controlled experiments, we identify poor recall of implicit facts as the primary cause of failure, which significantly hampers reasoning performance. Interestingly, we observe that the internal attention weights from the generated CoT tokens can effectively ground implicit facts, even when these facts are not explicitly recalled. Building on this insight, we propose a novel training-free algorithm, Attrieval, which leverages attention weights to retrieve relevant facts from the long context and incorporates them into the reasoning process. Additionally, we find that selecting context tokens from CoT tokens further improves performance. Our results demonstrate that Attrieval enhances long-context reasoning capability notably on both synthetic and real-world QA datasets with various models.
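The idea of using attention weights from generated CoT tokens to retrieve facts can be illustrated with a toy sketch. This is not the Attrieval algorithm itself: the function `retrieve_by_attention`, the averaging over CoT tokens, and the sentence-span representation are hypothetical simplifications, and a real model would supply the attention matrix.

```python
import numpy as np

def retrieve_by_attention(attn, sentence_spans, top_k=2):
    """Rank context sentences by the attention they receive (toy sketch).

    attn: array of shape (num_cot_tokens, num_context_tokens).
    sentence_spans: list of (start, end) token ranges, end exclusive.
    Returns (indices of the top_k sentences in document order, all scores).
    """
    # Average attention over the generated CoT tokens.
    per_token = attn.mean(axis=0)
    # Score each sentence by the mean attention over its tokens.
    scores = np.array([per_token[s:e].mean() for s, e in sentence_spans])
    order = np.argsort(-scores, kind="stable")[:top_k]
    return sorted(order.tolist()), scores

if __name__ == "__main__":
    attn = np.full((3, 9), 0.05)
    attn[:, 3:6] = 0.5  # the CoT tokens attend mostly to the second sentence
    spans = [(0, 3), (3, 6), (6, 9)]
    print(retrieve_by_attention(attn, spans, top_k=1)[0])
```

The retrieved sentences would then be appended to the prompt before the model continues reasoning.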

large language model, machine learning, natural language, (17 more...)

2503.09819

Country:

Europe (0.28)
North America > United States > California (0.14)
Asia > Thailand (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Toward Multi-Session Personalized Conversation: A Large-Scale Dataset and Hierarchical Tree Framework for Implicit Reasoning

Li, Xintong, Bantupalli, Jalend, Dharmani, Ria, Zhang, Yuwei, Shang, Jingbo

arXiv.org Artificial Intelligence, Mar-10-2025

There has been a surge in the use of large language model (LLM) conversational agents to generate responses based on long-term history from multiple sessions. However, existing long-term open-domain dialogue datasets lack complex, real-world personalization and fail to capture implicit reasoning, where relevant information is embedded in subtle, syntactic, or semantically distant connections rather than explicit statements. In such cases, traditional retrieval methods fail to capture relevant context, and long-context modeling also becomes inefficient due to numerous complicated persona-related details. To address this gap, we introduce ImplexConv, a large-scale long-term dataset with 2,500 examples, each containing approximately 100 conversation sessions, designed to study implicit reasoning in personalized dialogues. Additionally, we propose TaciTree, a novel hierarchical tree framework that structures conversation history into multiple levels of summarization. Instead of brute-force searching all data, TaciTree enables an efficient, level-based retrieval process where models refine their search by progressively selecting relevant details. Our experiments demonstrate that TaciTree significantly improves the ability of LLMs to reason over long-term conversations with implicit contextual dependencies.
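Level-based retrieval over a tree of summaries can be sketched as follows. This is an illustration of the general idea only, not TaciTree: the `Node` class, the word-overlap scorer (standing in for an LLM's relevance judgment), and the `beam` parameter are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    summary: str
    children: List["Node"] = field(default_factory=list)

def overlap(query: str, text: str) -> int:
    """Hypothetical relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(root: Node, query: str, beam: int = 1) -> List[str]:
    """Descend the tree level by level, keeping the `beam` best nodes."""
    frontier = [root]
    while any(n.children for n in frontier):
        candidates = []
        for n in frontier:
            # Expand internal nodes; carry leaves forward unchanged.
            candidates.extend(n.children if n.children else [n])
        candidates.sort(key=lambda n: overlap(query, n.summary), reverse=True)
        frontier = candidates[:beam]
    return [n.summary for n in frontier]

if __name__ == "__main__":
    tree = Node("all sessions", [
        Node("hobbies and music", [Node("plays guitar every weekend"),
                                   Node("enjoys hiking trips")]),
        Node("work and schedule", [Node("works night shifts as a nurse")]),
    ])
    print(retrieve(tree, "music hobbies guitar"))
```

Only the children of the selected nodes are scored at each level, so the search avoids comparing the query against every stored detail.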

large language model, machine learning, natural language, (20 more...)

2503.07018

Country:

Europe (0.15)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry:

Media > Music (0.46)
Leisure & Entertainment > Sports > Basketball (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Recent Advances, Applications and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2024 Symposium

Adibi, Amin, Cao, Xu, Ji, Zongliang, Kaur, Jivat Neet, Chen, Winston, Healey, Elizabeth, Nuwagira, Brighton, Ye, Wenqian, Woollard, Geoffrey, Xu, Maxwell A, Cui, Hejie, Xi, Johnny, Chang, Trenton, Bikia, Vasiliki, Zhang, Nicole, Noori, Ayush, Xia, Yuan, Hossain, Md. Belal, Frank, Hanna A., Peluso, Alina, Pu, Yuan, Shen, Shannon Zejiang, Wu, John, Fallahpour, Adibvafa, Mahbub, Sazan, Duncan, Ross, Zhang, Yuwei, Cao, Yurui, Xu, Zuheng, Craig, Michael, Krishnan, Rahul G., Beheshti, Rahmatollah, Rehg, James M., Karim, Mohammad Ehsanul, Coffee, Megan, Celi, Leo Anthony, Fries, Jason Alan, Sadatsafavi, Mohsen, Shung, Dennis, McWeeney, Shannon, Dafflon, Jessica, Jabbour, Sarah

arXiv.org Artificial Intelligence, Feb-10-2025

The fourth Machine Learning for Health (ML4H) symposium was held in person on December 15th and 16th, 2024, in the traditional, ancestral, and unceded territories of the Musqueam, Squamish, and Tsleil-Waututh Nations in Vancouver, British Columbia, Canada. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. The organization of the research roundtables at the conference involved 13 senior and 27 junior chairs across 13 tables. Each roundtable session included an invited senior chair (with substantial experience in the field), junior chairs (responsible for facilitating the discussion), and attendees from diverse backgrounds with an interest in the session's topic.

data mining, machine learning, natural language, (17 more...)

2502.06693

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.24)
North America > United States > New York > New York County (0.14)
Europe > United Kingdom > England > Oxfordshire (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.92)
Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(15 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Object Detection for Medical Image Analysis: Insights from the RT-DETR Model

He, Weijie, Zhang, Yuwei, Xu, Ting, An, Tai, Liang, Yingbin, Zhang, Bo

arXiv.org Artificial Intelligence, Jan-27-2025

Deep learning has emerged as a transformative approach for solving complex pattern recognition and object detection challenges. This paper focuses on the application of a novel detection framework based on the RT-DETR model for analyzing intricate image data, particularly in areas such as diabetic retinopathy detection. Diabetic retinopathy, a leading cause of vision loss globally, requires accurate and efficient image analysis to identify early-stage lesions. The proposed RT-DETR model, built on a Transformer-based architecture, excels at processing high-dimensional and complex visual data with enhanced robustness and accuracy. Comparative evaluations with models such as YOLOv5, YOLOv8, SSD, and DETR demonstrate that RT-DETR achieves superior performance across precision, recall, mAP50, and mAP50-95 metrics, particularly in detecting small-scale objects and densely packed targets. This study underscores the potential of Transformer-based models like RT-DETR for advancing object detection tasks, offering promising applications in medical imaging and beyond.

artificial intelligence, detection, machine learning, (19 more...)

2501.16469

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Wu, Di, Wang, Hongwei, Yu, Wenhao, Zhang, Yuwei, Chang, Kai-Wei, Yu, Dong

arXiv.org Artificial Intelligence, Oct-14-2024

Recent large language model (LLM)-driven chat assistant systems have integrated memory components to track user-assistant chat histories, enabling more accurate and personalized responses. However, their long-term memory capabilities in sustained interactions remain underexplored. This paper introduces LongMemEval, a comprehensive benchmark designed to evaluate five core long-term memory abilities of chat assistants: information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and abstention. With 500 meticulously curated questions embedded within freely scalable user-assistant chat histories, LongMemEval presents a significant challenge to existing long-term memory systems, with commercial chat assistants and long-context LLMs showing a 30% accuracy drop on memorizing information across sustained interactions. We then present a unified framework that breaks down the long-term memory design into four design choices across the indexing, retrieval, and reading stages. Built upon key experimental insights, we propose several memory designs including session decomposition for optimizing value granularity, fact-augmented key expansion for enhancing the index structure, and time-aware query expansion for refining the search scope. Experimental results show that these optimizations greatly improve both memory recall and downstream question answering on LongMemEval. Overall, our study provides valuable resources and guidance for advancing the long-term memory capabilities of LLM-based chat assistants, paving the way toward more personalized and reliable conversational AI.

information, large language model, machine learning, (19 more...)

2410.10813

Country:

North America > United States (1.00)
Asia (1.00)
Europe (0.93)

Genre: Research Report > New Finding (0.87)

Industry:

Information Technology (0.68)
Health & Medicine (0.67)
Leisure & Entertainment > Sports (0.67)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction

Zhang, Yuwei, Xia, Tong, Saeed, Aaqib, Mascolo, Cecilia

arXiv.org Artificial Intelligence, Oct-7-2024

The high incidence and mortality rates associated with respiratory diseases underscore the importance of early screening. Machine learning models can automate clinical consultations and auscultation, offering vital support in this area. However, the data involved, spanning demographics, medical history, symptoms, and respiratory audio, are heterogeneous and complex. Existing approaches are insufficient and lack generalizability, as they typically rely on limited training data, basic fusion techniques, and task-specific models. In this paper, we propose RespLLM, a novel multimodal large language model (LLM) framework that unifies text and audio representations for respiratory health prediction. RespLLM leverages the extensive prior knowledge of pretrained LLMs and enables effective audio-text fusion through cross-modal attention. Instruction tuning is employed to integrate diverse data from multiple sources, ensuring generalizability and versatility of the model. Experiments on five real-world datasets demonstrate that RespLLM outperforms leading baselines by an average of 4.6% on trained tasks and 7.9% on unseen datasets, and facilitates zero-shot predictions for new tasks. Our work lays the foundation for multimodal models that can perceive, listen to, and understand heterogeneous data, paving the way for scalable respiratory health diagnosis.

large language model, machine learning, natural language, (20 more...)

2410.05361

Country:

Europe > Portugal (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.96)
Health & Medicine > Diagnostic Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Jump Diffusion-Informed Neural Networks with Transfer Learning for Accurate American Option Pricing under Data Scarcity

Sun, Qiguo, Huang, Hanyue, Yang, XiBei, Zhang, Yuwei

arXiv.org Artificial Intelligence, Sep-26-2024

Option pricing models, essential in financial mathematics and risk management, have been extensively studied and recently advanced by AI methodologies. However, American option pricing remains challenging due to the complexity of determining optimal exercise times and modeling non-linear payoffs resulting from stochastic paths. Moreover, the prevalent use of the Black-Scholes formula in hybrid models fails to accurately capture the discontinuity in the price process, limiting model performance, especially under scarce data conditions. To address these issues, this study presents a comprehensive framework for American option pricing consisting of six interrelated modules, which combine nonlinear optimization algorithms, analytical and numerical models, and neural networks to improve pricing performance. Additionally, to handle the scarce data challenge, this framework integrates transfer learning through numerical data augmentation and a physically constrained, jump diffusion process-informed neural network to capture the leptokurtosis of the log return distribution. To increase training efficiency, a warm-up period using Bayesian optimization is designed to provide optimal data loss and physical loss coefficients. Experimental results of six case studies demonstrate the accuracy, convergence, physical effectiveness, and generalization of the framework. Moreover, the proposed model shows superior performance in pricing deep out-of-the-money options.

Introduction. Options are fundamental financial derivatives widely employed for risk management. The movement of option prices follows a stochastic process influenced by various factors such as the price process of the underlying assets (S_t), the strike price (K), the time-to-maturity (T), the option type (American or European; Put (P) or Call (C) options), and numerous macroeconomic and market factors.

artificial intelligence, machine learning, option pricing, (16 more...)

2409.18168

Country:

North America > United States > New York (0.14)
Europe > Germany > Bavaria (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Banking & Finance > Trading (1.00)
Energy > Oil & Gas > Trading (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Non-stationary BERT: Exploring Augmented IMU Data For Robust Human Activity Recognition

Sun, Ning, Wang, Yufei, Zhang, Yuwei, Wan, Jixiang, Wang, Shenyue, Liu, Ping, Zhang, Xudong

arXiv.org Artificial Intelligence, Sep-25-2024

Human Activity Recognition (HAR) has gained great attention from researchers due to the popularity of mobile devices and the need to observe users' daily activity data for better human-computer interaction. In this work, we collect a human activity recognition dataset called OPPOHAR consisting of phone IMU data. To facilitate the deployment of HAR systems on mobile phones and to achieve user-specific activity recognition, we propose a novel lightweight network called Non-stationary BERT with a two-stage training method. We also propose a simple yet effective data augmentation method to explore the deeper relationship between the accelerometer and gyroscope data from the IMU. The network achieves state-of-the-art performance on various activity recognition datasets, and the data augmentation method demonstrates wide applicability.

data mining, machine learning, recognition, (16 more...)

2409.1673

Country: Asia > China (0.31)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.68)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications > Mobile (0.89)
Information Technology > Human Computer Interaction (0.87)
Information Technology > Data Science > Data Mining (0.69)

Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

Wang, Zilong, Wang, Zifeng, Le, Long, Zheng, Huaixiu Steven, Mishra, Swaroop, Perot, Vincent, Zhang, Yuwei, Mattapalli, Anush, Taly, Ankur, Shang, Jingbo, Lee, Chen-Yu, Pfister, Tomas

arXiv.org Artificial Intelligence, Jul-11-2024

Retrieval augmented generation (RAG) combines the generative abilities of large language models (LLMs) with external knowledge sources to provide more accurate and up-to-date responses. Recent RAG advancements focus on improving retrieval outcomes through iterative LLM refinement or self-critique capabilities acquired through additional instruction tuning of LLMs. In this work, we introduce Speculative RAG, a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM. Each draft is generated from a distinct subset of retrieved documents, offering diverse perspectives on the evidence while reducing input token counts per draft. This approach enhances comprehension of each subset and mitigates potential position bias over long context. Our method accelerates RAG by delegating drafting to the smaller specialist LM, with the larger generalist LM performing a single verification pass over the drafts. Extensive experiments demonstrate that Speculative RAG achieves state-of-the-art performance with reduced latency on TriviaQA, MuSiQue, PubHealth, and ARC-Challenge benchmarks. It notably enhances accuracy by up to 12.97% while reducing latency by 51% compared to conventional RAG systems on PubHealth.
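The draft-then-verify flow described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the authors' implementation: `speculative_rag`, the round-robin split into subsets, and the `drafter` and `verifier` callables are hypothetical stand-ins for the specialist and generalist LMs.

```python
from typing import Callable, List, Sequence, Tuple

def speculative_rag(
    question: str,
    documents: Sequence[str],
    drafter: Callable[[str, Sequence[str]], str],
    verifier: Callable[[str, List[str]], List[float]],
    num_subsets: int = 2,
) -> Tuple[str, float]:
    """Draft one answer per document subset, then keep the best-scored draft."""
    if not documents:
        raise ValueError("at least one retrieved document is required")
    # Assumed partitioning: a simple round-robin split into distinct subsets.
    subsets = [documents[i::num_subsets] for i in range(num_subsets)]
    subsets = [s for s in subsets if s]
    # The small drafter sees only its own subset (independent calls that
    # could run in parallel).
    drafts = [drafter(question, s) for s in subsets]
    # The large verifier scores all drafts in a single call.
    scores = verifier(question, drafts)
    best = max(range(len(drafts)), key=lambda i: scores[i])
    return drafts[best], scores[best]

if __name__ == "__main__":
    docs = ["Paris is the capital of France.", "Bananas are yellow.",
            "France is in Europe.", "Cats purr."]
    toy_drafter = lambda q, subset: " ".join(subset)
    toy_verifier = lambda q, drafts: [float(d.count("Paris")) for d in drafts]
    print(speculative_rag("What is the capital of France?", docs,
                          toy_drafter, toy_verifier))
```

Because each draft is conditioned on a fraction of the retrieved documents, the per-draft prompt stays short, which is where the latency saving in such a design would come from.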

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2407.08223

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment (0.93)
Media > Film (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
