Chen, Wenxiang
Better Process Supervision with Bi-directional Rewarding Signals
Chen, Wenxiang, He, Wei, Xi, Zhiheng, Guo, Honglin, Hong, Boyang, Zhang, Jiazheng, Zheng, Rui, Li, Nijun, Gui, Tao, Li, Yun, Zhang, Qi, Huang, Xuanjing
Process supervision, i.e., evaluating each step, is critical for complex large language model (LLM) reasoning and test-time search with increased inference compute. Existing approaches, represented by process reward models (PRMs), primarily focus on rewarding signals up to the current step, exhibiting a one-directional nature and lacking a mechanism to model the distance to the final target. To address this problem, we draw inspiration from the A* algorithm, which states that an effective supervisory signal should simultaneously consider the incurred cost and the estimated cost for reaching the target. Building on this key insight, we introduce BiRM, a novel process supervision model that not only evaluates the correctness of previous steps but also models the probability of future success. We conduct extensive experiments on mathematical reasoning tasks and demonstrate that BiRM provides more precise evaluations of LLM reasoning steps, achieving an improvement of 3.1% on Gaokao2023 over PRM under the Best-of-N sampling method. Moreover, in search-based strategies, BiRM provides more comprehensive guidance and outperforms ORM by 5.0% and PRM by 3.8% on MATH-500.
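As a rough illustration of the A*-style scoring the abstract describes, here is a minimal Python sketch that combines a backward-looking reward (correctness of the steps so far) with a forward-looking value (estimated probability of future success) and uses it for Best-of-N selection. The head names, the linear weighting, and `beta` are illustrative assumptions, not BiRM's actual architecture.

```python
# Sketch of a bidirectional step score: f(s) combines r(s), the reward for
# steps taken so far, with v(s), the estimated probability of reaching a
# correct final answer -- echoing A*'s f = g + h. Weighting is illustrative.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ScoredPath:
    steps: List[str]
    score: float

def birm_score(
    steps: List[str],
    reward_head: Callable[[List[str]], float],  # r(s): correctness so far
    value_head: Callable[[List[str]], float],   # v(s): est. prob. of success
    beta: float = 0.5,                          # trade-off between signals
) -> float:
    """f(s) = (1 - beta) * r(s) + beta * v(s)."""
    return (1.0 - beta) * reward_head(steps) + beta * value_head(steps)

def best_of_n(
    candidates: List[List[str]],
    reward_head: Callable[[List[str]], float],
    value_head: Callable[[List[str]], float],
) -> ScoredPath:
    """Best-of-N selection: keep the candidate with the highest combined score."""
    scored = [ScoredPath(c, birm_score(c, reward_head, value_head)) for c in candidates]
    return max(scored, key=lambda p: p.score)
```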
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Xi, Zhiheng, Chen, Wenxiang, Hong, Boyang, Jin, Senjie, Zheng, Rui, He, Wei, Ding, Yiwen, Liu, Shichun, Guo, Xin, Wang, Junzhe, Guo, Honglin, Shen, Wei, Fan, Xiaoran, Zhou, Yuhao, Dou, Shihan, Wang, Xiao, Zhang, Xinbo, Sun, Peng, Gui, Tao, Zhang, Qi, Huang, Xuanjing
In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models. The core challenge in applying RL to complex reasoning is to identify a sequence of actions that result in positive rewards and provide appropriate supervision for optimization. Outcome supervision provides sparse rewards for final results without identifying error locations, whereas process supervision offers step-wise rewards but requires extensive manual annotation. R$^3$ overcomes these limitations by learning from correct demonstrations. Specifically, R$^3$ progressively slides the start state of reasoning from a demonstration's end to its beginning, facilitating easier model exploration at all stages. Thus, R$^3$ establishes a step-wise curriculum, allowing outcome supervision to offer step-level signals and precisely pinpoint errors. Using Llama2-7B, our method surpasses the RL baseline on eight reasoning tasks by $4.1$ points on average. Notably, in program-based reasoning on GSM8K, it exceeds the baseline by $4.2$ points across three backbone models, and without any extra data, Codellama-7B + R$^3$ performs comparably to larger models or closed-source models.
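The sliding-start-state idea can be sketched directly: the model begins exploration from progressively earlier prefixes of a correct demonstration, so the sparse outcome reward effectively localizes errors stage by stage. The following minimal Python sketch only illustrates how such curriculum stages could be constructed; the function name and stage schedule are illustrative, not the paper's implementation.

```python
# Sketch of a reverse curriculum over a correct demonstration: start states
# slide from the demo's end back to its beginning, so each stage leaves the
# model a slightly larger suffix to explore under outcome-only rewards.
from typing import List

def reverse_curriculum_starts(demonstration: List[str]) -> List[List[str]]:
    """Yield start prefixes sliding from the demo's end to its beginning.

    Stage 0 hands the model all but the last step (easy: one step left to
    explore); the final stage hands it an empty prefix (hard: reason from
    scratch).
    """
    n = len(demonstration)
    return [demonstration[: n - k] for k in range(1, n + 1)]

# Usage: a 3-step demonstration yields 3 curriculum stages.
demo = ["parse the problem", "set up the equation", "solve for x"]
for stage, prefix in enumerate(reverse_curriculum_starts(demo)):
    print(f"stage {stage}: model starts after {len(prefix)} given step(s)")
```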
The Rise and Potential of Large Language Model Based Agents: A Survey
Xi, Zhiheng, Chen, Wenxiang, Guo, Xin, He, Wei, Ding, Yiwen, Hong, Boyang, Zhang, Ming, Wang, Junzhe, Jin, Senjie, Zhou, Enyu, Zheng, Rui, Fan, Xiaoran, Wang, Xiao, Xiong, Limao, Zhou, Yuhao, Wang, Weiran, Jiang, Changhao, Zou, Yicheng, Liu, Xiangyang, Yin, Zhangyue, Dou, Shihan, Weng, Rongxiang, Cheng, Wensen, Zhang, Qi, Qin, Wenjuan, Zheng, Yongyan, Qiu, Xipeng, Huang, Xuanjing, Gui, Tao
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancements in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. What the community lacks, however, is a general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Due to the versatile capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI), offering hope for building general AI agents. Many researchers have leveraged LLMs as the foundation to build AI agents and have achieved significant progress. In this paper, we perform a comprehensive survey on LLM-based agents. We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for agents. Building upon this, we present a general framework for LLM-based agents, comprising three main components: brain, perception, and action; the framework can be tailored for different applications. Subsequently, we explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. Following this, we delve into agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge from an agent society, and the insights they offer for human society. Finally, we discuss several key topics and open problems within the field. A repository of the related papers is available at https://github.com/WooooDyy/LLM-Agent-Paper-List.
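The brain-perception-action framework the survey proposes can be rendered as a plain agent loop. The Python sketch below is only a schematic reading of that decomposition; all interfaces and names are illustrative assumptions, not an API from the survey.

```python
# Sketch of the surveyed brain-perception-action decomposition as an agent
# loop: perception turns the environment into text, the brain (an LLM)
# decides, and the action module executes. Interfaces are illustrative.
from typing import Protocol

class Perception(Protocol):
    def observe(self) -> str: ...            # environment state -> text

class Brain(Protocol):
    def decide(self, observation: str) -> str: ...  # LLM reasoning step

class Action(Protocol):
    def execute(self, decision: str) -> None: ...   # act on the environment

def agent_loop(perception: Perception, brain: Brain, action: Action, steps: int) -> None:
    for _ in range(steps):
        obs = perception.observe()
        decision = brain.decide(obs)
        action.execute(decision)
```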
Stochastic Local Search over Minterms on Structured SAT Instances
Chen, Wenxiang, Whitley, Darrell, Howe, Adele, Goldman, Brian (Colorado State University)
We observed that Conjunctive Normal Form (CNF) encodings of structured SAT instances often have a set of consecutive clauses defined over a small number of Boolean variables. To exploit this pattern, we propose a transformation of CNF to an alternative representation, Conjunctive Minterm Canonical Form (CMCF). The transformation is a two-step process: CNF clauses are first partitioned into disjoint subsets such that each subset contains CNF clauses with shared Boolean variables. The CNF clauses in each subset are then replaced by their Minterm Canonical Form (i.e., a set of partial solutions), which is found by enumeration. We show empirically that a simple Stochastic Local Search (SLS) solver based on CMCF can consistently achieve a higher success rate using fewer evaluations than the SLS solver WalkSAT on two representative classes of structured SAT problems.
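The two-step transformation can be sketched concretely: group clauses that share variables into disjoint subsets, then enumerate each subset's satisfying partial assignments (its minterms). The Python sketch below assumes a DIMACS-style clause encoding (signed integers) and is an illustration of the stated procedure, not the paper's implementation.

```python
# Sketch of the CNF -> CMCF transformation: (1) partition clauses into
# disjoint subsets connected by shared variables, (2) enumerate each
# subset's satisfying assignments (minterms). Clauses use signed ints,
# e.g. [1, -2] means (x1 OR NOT x2).
from itertools import product
from typing import Dict, List

Clause = List[int]

def partition_by_shared_vars(clauses: List[Clause]) -> List[List[Clause]]:
    """Union-find over variables: clauses sharing a variable join one subset."""
    parent: Dict[int, int] = {}

    def find(v: int) -> int:
        while parent.setdefault(v, v) != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for clause in clauses:
        vars_ = [abs(lit) for lit in clause]
        for v in vars_[1:]:
            parent[find(v)] = find(vars_[0])

    groups: Dict[int, List[Clause]] = {}
    for clause in clauses:
        groups.setdefault(find(abs(clause[0])), []).append(clause)
    return list(groups.values())

def minterms(subset: List[Clause]) -> List[Dict[int, bool]]:
    """Enumerate all assignments over the subset's variables satisfying it."""
    vars_ = sorted({abs(lit) for c in subset for lit in c})
    solutions = []
    for bits in product([False, True], repeat=len(vars_)):
        assign = dict(zip(vars_, bits))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in subset):
            solutions.append(assign)
    return solutions
```

Enumeration is tractable here precisely because of the observed structure: each subset spans only a small number of Boolean variables.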