Xi, Xiangyu
SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity
Xi, Xiangyu, Kong, Deyang, Yang, Jian, Yang, Jiawei, Chen, Zhengyu, Wang, Wei, Wang, Jingang, Cai, Xunliang, Zhang, Shikun, Ye, Wei
Existing pretraining data mixing methods for large language models (LLMs) typically follow a domain-wise methodology, a top-down process that first determines domain weights and then performs uniform data sampling within each domain. However, these approaches neglect significant inter-domain overlaps and commonalities and fail to control the global diversity of the constructed training dataset. Furthermore, uniform sampling within domains ignores fine-grained sample-specific features, potentially leading to a suboptimal data distribution. To address these shortcomings, we propose SampleMix, a novel sample-wise data mixing approach based on a bottom-up paradigm. This method performs global cross-domain sampling by systematically evaluating the quality and diversity of each sample, thereby dynamically determining the optimal domain distribution. Comprehensive experiments across multiple downstream tasks and perplexity assessments demonstrate that SampleMix surpasses existing domain-based methods. Meanwhile, baselines require 1.4x to 2.1x more training steps to achieve SampleMix's performance, highlighting the substantial potential of SampleMix to optimize pre-training data.
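To make the sample-wise idea concrete, here is a minimal Python sketch of turning per-sample quality and diversity scores into sampling probabilities over the whole candidate pool rather than per-domain quotas. The function names, the linear combination, and the coefficient alpha are illustrative assumptions, not the paper's actual scoring or sampling procedure.

    # Minimal sketch of sample-wise data mixing (hypothetical names; the
    # paper's actual scoring and sampling procedure may differ).
    import numpy as np

    def sample_weights(quality, diversity, alpha=0.5):
        """Combine per-sample quality and diversity scores into sampling weights.

        quality, diversity: arrays in [0, 1], one entry per candidate sample.
        alpha: assumed trade-off coefficient between the two signals.
        """
        score = alpha * quality + (1.0 - alpha) * diversity
        # Normalize into a probability distribution over the whole candidate
        # pool, so sampling is global rather than confined to domain quotas.
        return score / score.sum()

    rng = np.random.default_rng(0)
    quality = rng.random(10)      # e.g., scores from a quality classifier
    diversity = rng.random(10)    # e.g., clustering-based novelty scores
    probs = sample_weights(quality, diversity)
    chosen = rng.choice(10, size=5, replace=False, p=probs)
    print(chosen)

The resulting per-domain proportions then emerge from which samples are drawn, which is the bottom-up aspect described in the abstract.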
Dialog-to-Actions: Building Task-Oriented Dialogue System via Action-Level Generation
Hua, Yuncheng, Xi, Xiangyu, Jiang, Zheng, Zhang, Guanwei, Sun, Chaobo, Wan, Guanglu, Ye, Wei
End-to-end generation-based approaches have been investigated and applied in task-oriented dialogue systems. However, in industrial scenarios, existing methods face bottlenecks in controllability (e.g., domain-inconsistent responses, the repetition problem) and efficiency (e.g., long computation time). In this paper, we propose building a task-oriented dialogue system via action-level generation. Specifically, we first construct dialogue actions from large-scale dialogues and represent each natural language (NL) response as a sequence of dialogue actions. We then train a Sequence-to-Sequence model that takes the dialogue history as input and outputs a sequence of dialogue actions, and the generated dialogue actions are transformed into verbal responses. Experimental results show that our lightweight method achieves competitive performance while offering advantages in controllability and efficiency.
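A rough Python illustration of the action-level pipeline follows. The action inventory, templates, and the stub predictor (standing in for the trained Sequence-to-Sequence model) are hypothetical; only the overall flow matches the abstract: dialogue history in, a sequence of dialogue actions out, actions rendered into a verbal response.

    # Minimal sketch of action-level generation (illustrative only; actions,
    # templates, and the predictor are hypothetical stand-ins).
    from typing import List

    # Hypothetical mapping from dialogue actions to NL response templates.
    ACTION_TEMPLATES = {
        "greet": "Hello! How can I help you today?",
        "ask_order_id": "Could you tell me your order number?",
        "confirm_refund": "Your refund has been submitted.",
    }

    def predict_actions(dialogue_history: List[str]) -> List[str]:
        """Stand-in for the Seq2Seq model: history in, action sequence out.
        A trained model would generate these action tokens autoregressively."""
        if any("refund" in turn.lower() for turn in dialogue_history):
            return ["ask_order_id", "confirm_refund"]
        return ["greet"]

    def render(actions: List[str]) -> str:
        """Transform the generated action sequence into a verbal response."""
        return " ".join(ACTION_TEMPLATES[a] for a in actions)

    history = ["I want a refund for my noodles."]
    print(render(predict_actions(history)))

Generating short action sequences instead of full NL responses is what gives the controllability and efficiency advantages the abstract claims: the output vocabulary is small and every surface form is template-checked.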
MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts
Xi, Xiangyu, Lv, Jianwei, Liu, Shuaipeng, Ye, Wei, Yang, Fan, Wan, Guanglu
Event detection (ED) identifies and classifies event triggers from unstructured texts, serving as a fundamental task for information extraction. Despite the remarkable progress achieved in the past several years, most research efforts focus on detecting events from formal texts (e.g., news articles, Wikipedia documents, financial announcements). Moreover, the texts in each dataset are either from a single source or from multiple yet relatively homogeneous sources. With massive amounts of user-generated text accumulating on the Web and inside enterprises, identifying meaningful events in these informal texts, usually drawn from multiple heterogeneous sources, has become a problem of significant practical value. As a pioneering exploration that expands event detection to scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations from a leading e-commerce platform for food service. We carefully investigate the proposed dataset's textual informality and multi-source heterogeneity by inspecting data samples quantitatively and qualitatively. Extensive experiments with state-of-the-art event detection methods verify the unique challenges posed by these characteristics, indicating that multi-source informal event detection remains an open problem and requires further efforts. Our benchmark and code are released at \url{https://github.com/myeclipse/MUSIED}.
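For readers unfamiliar with the task setup, a small Python illustration of event detection framed as trigger-level BIO tagging follows; the tokens, labels, and event type are invented for illustration and are not drawn from the MUSIED release.

    # Minimal sketch of event detection as trigger-level token classification
    # (a common ED formulation; the example is hypothetical, not from MUSIED).
    tokens = ["外卖", "迟到", "了", "两", "小时"]
    labels = ["O", "B-DeliveryDelay", "O", "O", "O"]  # hypothetical event type

    def extract_triggers(tokens, labels):
        """Collect (trigger span, event type) pairs from BIO-style labels."""
        triggers, span, etype = [], [], None
        for tok, lab in zip(tokens, labels):
            if lab.startswith("B-"):
                if span:
                    triggers.append(("".join(span), etype))
                span, etype = [tok], lab[2:]
            elif lab.startswith("I-") and span:
                span.append(tok)
            else:
                if span:
                    triggers.append(("".join(span), etype))
                span, etype = [], None
        if span:
            triggers.append(("".join(span), etype))
        return triggers

    print(extract_triggers(tokens, labels))  # [('迟到', 'DeliveryDelay')]

The challenge the benchmark targets is that informal, multi-source text (reviews, chat logs, call transcripts) makes both trigger identification and type classification in this formulation considerably harder than on formal news-style corpora.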