AITopics | Xu, Xiaofeng

Xu, Xiaofeng

Group Preference Alignment: Customized LLM Response Generation from In-Situ Conversations

Mondal, Ishani, Stokes, Jack W., Jauhar, Sujay Kumar, Yang, Longqi, Wan, Mengting, Xu, Xiaofeng, Song, Xia, Neville, Jennifer

arXiv.org Artificial Intelligence, Mar-11-2025

LLMs often fail to meet the specialized needs of distinct user groups due to their one-size-fits-all training paradigm (Lucy et al., 2024), and there is limited research on which personalization aspects each group expects. To address these limitations, we propose a group-aware personalization framework, Group Preference Alignment (GPA), that identifies context-specific variations in conversational preferences across user groups and then steers LLMs to address those preferences. Our approach consists of two steps: (1) Group-Aware Preference Extraction, where maximally divergent user-group preferences are extracted from real-world conversation logs and distilled into interpretable rubrics, and (2) Tailored Response Generation, which leverages these rubrics through two methods: a) Context-Tuned Inference (GPA-CT), which dynamically adjusts responses via context-dependent prompt instructions, and b) Rubric-Finetuning Inference (GPA-FT), which uses the rubrics to generate contrastive synthetic data for personalization of group-specific models via alignment. Experiments demonstrate that our framework significantly improves alignment of the output with respect to user preferences and outperforms baseline methods, while maintaining robust performance on standard benchmarks.

large language model, machine learning, natural language

2503.08035

Country:

Asia (1.00)
North America > United States > Maryland (0.14)
Europe > Middle East > Malta (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation

He, Jie, Neville, Jennifer, Wan, Mengting, Yang, Longqi, Liu, Hui, Xu, Xiaofeng, Song, Xia, Pan, Jeff Z., Zhou, Pei

arXiv.org Artificial Intelligence, Feb-26-2025

Large Language Models (LLMs) can enhance their capabilities as AI assistants by integrating external tools, allowing them to access a wider range of information. While recent LLMs are typically fine-tuned with tool usage examples during supervised fine-tuning (SFT), questions remain about their ability to develop robust tool-usage skills and to generalize effectively to unseen queries and tools. In this work, we present GenTool, a novel training framework that prepares LLMs for diverse generalization challenges in tool utilization. Our approach addresses two fundamental dimensions critical for real-world applications: Zero-to-One Generalization, enabling the model to address queries initially lacking a suitable tool by adopting and utilizing one when it becomes available, and Weak-to-Strong Generalization, allowing models to leverage enhanced versions of existing tools to solve queries. To achieve this, we develop synthetic training data simulating these two dimensions of tool usage and introduce a two-stage fine-tuning approach: optimizing tool ranking, then refining tool selection. Through extensive experiments across four generalization scenarios, we demonstrate that our method significantly enhances the tool-usage capabilities of LLMs ranging from 1B to 8B parameters, achieving performance that surpasses GPT-4o. Furthermore, our analysis provides valuable insights into the challenges LLMs encounter in tool generalization.

large language model, machine learning, natural language

2502.18990

Country: North America > Mexico > Mexico City (0.14)

Genre: Research Report (0.82)

Industry:

Banking & Finance (0.68)
Transportation > Passenger (0.67)
Consumer Products & Services > Travel (0.46)
Transportation > Ground (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A High Energy-Efficiency Multi-core Neuromorphic Architecture for Deep SNN Training

Li, Mingjing, Zhou, Huihui, Xu, Xiaofeng, Zhong, Zhiwei, Quan, Puli, Zhu, Xueke, Lin, Yanyu, Lin, Wenjie, Guo, Hongyu, Zhang, Junchao, Ma, Yunhao, Wang, Wei, Meng, Qingyan, Ma, Zhengyu, Li, Guoqi, Cui, Xiaoxin, Tian, Yonghong

arXiv.org Artificial Intelligence, Dec-29-2024

There is a growing need for edge training that adapts to dynamically changing environments. Neuromorphic computing represents a significant pathway for high-efficiency intelligent computation at energy-constrained edges, but existing neuromorphic architectures lack the ability to directly train spiking neural networks (SNNs) based on backpropagation. We develop a multi-core neuromorphic architecture with Feedforward-Propagation, Back-Propagation, and Weight-Gradient engines in each core, supporting highly efficient parallel computing at both the engine and core levels. It combines various data flows and sparse computation optimization by fully leveraging the sparsity in SNN training, obtaining a high energy efficiency of 1.05 TFLOPS/W at FP16 at 28 nm and a 55 to 85% reduction in DRAM access compared to an A100 GPU in SNN training, and demonstrating 20-core deep SNN training and 5-worker federated learning on FPGAs. Our study develops the first multi-core neuromorphic architecture supporting direct SNN training, facilitating neuromorphic computing in edge-learnable applications.

artificial intelligence, deep learning, machine learning

2412.05302

Country: Asia > China (0.28)

Genre: Research Report (0.50)

Industry:

Information Technology (0.93)
Energy > Oil & Gas (0.50)
Education > Educational Setting > Online (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models

Lin, Ying-Chun, Neville, Jennifer, Stokes, Jack W., Yang, Longqi, Safavi, Tara, Wan, Mengting, Counts, Scott, Suri, Siddharth, Andersen, Reid, Xu, Xiaofeng, Gupta, Deepak, Jauhar, Sujay Kumar, Song, Xia, Buscher, Georg, Tiwary, Saurabh, Hecht, Brent, Teevan, Jaime

arXiv.org Artificial Intelligence, Jun-8-2024

Accurate and interpretable user satisfaction estimation (USE) is critical for understanding, evaluating, and continuously improving conversational systems. Users express their satisfaction or dissatisfaction with diverse conversational patterns in both general-purpose (ChatGPT and Bing Copilot) and task-oriented (customer service chatbot) conversational systems. Existing approaches based on featurized ML models or text embeddings fall short in extracting generalizable patterns and are hard to interpret. In this work, we show that LLMs can extract interpretable signals of user satisfaction from users' natural language utterances more effectively than embedding-based approaches. Moreover, an LLM can be tailored for USE via an iterative prompting framework using supervision from labeled examples. The resulting method, Supervised Prompting for User satisfaction Rubrics (SPUR), not only has higher accuracy but is also more interpretable, as it scores user satisfaction via learned rubrics with a detailed breakdown.

large language model, machine learning, natural language

2403.12388

Country:

Europe (1.00)
North America > United States (0.46)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Target-Independent Active Learning via Distribution-Splitting

Cao, Xiaofeng, Tsang, Ivor W., Xu, Xiaofeng, Xu, Guandong

arXiv.org Machine Learning, Sep-28-2018

To reduce the label complexity in Agnostic Active Learning (the A^2 algorithm), volume-splitting splits the hypothesis edges to reduce the Vapnik-Chervonenkis (VC) dimension in version space. However, the effectiveness of volume-splitting critically depends on the initial hypothesis, a problem known as the target-dependent label complexity gap. This paper attempts to minimize this gap by introducing a novel notion of number density, which provides a more natural and direct way to describe the hypothesis distribution than volume. By discovering the connections between the hypothesis and input distributions, we map the volume of version space into the number density and propose a target-independent distribution-splitting strategy with the following advantages: 1) it provides theoretical guarantees on reducing label complexity and error rate, as volume-splitting does; 2) it breaks the curse of the initial hypothesis; 3) it provides model guidance for a target-independent AL algorithm in real AL tasks. With these guarantees, for AL applications, we then split the input distribution into more near-optimal spheres and develop an application algorithm called Distribution-based A^2 (DA^2). Experiments further verify the effectiveness of the halving and querying abilities of DA^2.

active learning, artificial intelligence, machine learning

1809.10962

Country: Oceania > Australia (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
