AITopics

Multi-objective reinforcement learning (MORL) plays a pivotal role in addressing multi-criteria decision-making problems in the real world. Multi-policy (MP) methods are widely used to obtain high-quality Pareto front approximations for MORL problems. However, traditional MP methods rely only on online reinforcement learning (RL) and adopt an evolutionary framework with a large policy population. This may lead to sample inefficiency and/or an overwhelming number of agent-environment interactions in practice. By forsaking the evolutionary framework, we propose the novel Multi-policy Pareto Front Tracking (MPFT) framework, which maintains no policy population and supports both online and offline MORL algorithms. The proposed MPFT framework includes four stages: Stage 1 approximates all the Pareto-vertex policies, whose mappings to the objective space fall on the vertices of the Pareto front. Stage 2 designs a new Pareto tracking mechanism to track the Pareto front, starting from each of the Pareto-vertex policies. Stage 3 identifies the sparse regions in the tracked Pareto front and introduces a new objective weight adjustment method to fill them. Finally, by combining all the policies tracked in Stages 2 and 3, Stage 4 approximates the Pareto front. Experiments are conducted on seven different continuous-action robotic control tasks with both online and offline MORL algorithms, and demonstrate the superior hypervolume performance of our proposed MPFT approach over the state-of-the-art benchmarks, with significantly reduced agent-environment interactions and hardware requirements.
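
As a rough illustration of the hypervolume metric named in the abstract above, the following sketch extracts the non-dominated points from a set of two-objective returns and computes the area they dominate relative to a reference point. It assumes both objectives are maximized and covers only the two-objective case; the function names `pareto_front` and `hypervolume_2d` are hypothetical and are not taken from the MPFT paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points (both objectives maximized), sorted."""
    front = []
    for p in points:
        # p is dominated if another point is at least as good in both objectives.
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the Pareto front and bounded below by `ref`."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective down; along a Pareto front the
    # second objective then increases, so each point adds one new strip.
    front.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    prev_y = ref[1]
    for x, y in front:
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


returns = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
print(pareto_front(returns))                 # (1.0, 1.0) is dominated by (2.0, 2.0)
print(hypervolume_2d(returns, (0.0, 0.0)))   # 6.0
```

A larger hypervolume means the approximated front covers more of the objective space, which is why it is commonly used to compare multi-policy methods.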

artificial intelligence, machine learning, reinforcement learning

2508.02217

Country: Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

A Survey on AgentOps: Categorization, Challenges, and Future Directions

Wang, Zexin, Li, Jingjing, Zhou, Quan, Si, Haotian, Liu, Yuanhao, Li, Jianhui, Xie, Gaogang, Sun, Fei, Pei, Dan, Pei, Changhua

As the reasoning capabilities of Large Language Models (LLMs) continue to advance, LLM-based agent systems offer advantages in flexibility and interpretability over traditional systems, garnering increasing attention. However, despite the widespread research interest and industrial application of agent systems, these systems, like their traditional counterparts, frequently encounter anomalies. These anomalies lead to instability and insecurity, hindering their further development. Therefore, a comprehensive and systematic approach to the operation and maintenance of agent systems is urgently needed. Unfortunately, current research on the operations of agent systems is sparse. To address this gap, we have undertaken a survey on agent system operations with the aim of establishing a clear framework for the field, defining the challenges, and facilitating further development. Specifically, this paper begins by systematically defining anomalies within agent systems, categorizing them into intra-agent anomalies and inter-agent anomalies. Next, we introduce a novel and comprehensive operational framework for agent systems, dubbed Agent System Operations (AgentOps). We provide detailed definitions and explanations of its four key stages: monitoring, anomaly detection, root cause analysis, and resolution.

artificial intelligence, machine learning, natural language

2508.02121

Country: Asia > China (0.68)

Genre: Workflow (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models

Yue, Linan, Du, Yichao, Wang, Yizhi, Gao, Weibo, Yao, Fangzhou, Wang, Li, Liu, Ye, Xu, Ziyu, Liu, Qi, Di, Shimin, Zhang, Min-Ling

Recently, Large Reasoning Models (LRMs) have gradually become a research hotspot due to their outstanding performance in handling complex tasks. Among them, DeepSeek R1 has garnered significant attention for its exceptional performance and open-source nature, driving advancements in the research of R1-style LRMs. Unlike traditional Large Language Models (LLMs), these models enhance logical deduction and decision-making capabilities during reasoning by incorporating mechanisms such as long chain-of-thought and self-reflection through reinforcement learning. However, with the widespread application of these models, the problem of overthinking has gradually emerged. Specifically, when generating answers, these models often construct excessively long reasoning chains with redundant or repetitive steps, which leads to reduced reasoning efficiency and may affect the accuracy of the final answer. To this end, various efficient reasoning methods have been proposed, aiming to reduce the length of reasoning paths without compromising model performance and reasoning capability. By systematically reviewing current research on efficient reasoning methods, we categorize existing works into two main directions through the lens of single-model optimization versus model collaboration: (1) Efficient Reasoning with Single Model, which focuses on improving the reasoning efficiency of individual models; and (2) Efficient Reasoning with Model Collaboration, which explores optimizing reasoning paths through collaboration among multiple models. In addition, we maintain a public GitHub repository that tracks the latest progress in efficient reasoning methods.

arxiv preprint arxiv, large language model, machine learning

2508.02120

Country: Europe > Austria (0.46)

Genre: Overview (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

Everyone Contributes! Incentivizing Strategic Cooperation in Multi-LLM Systems via Sequential Public Goods Games

Liang, Yunhao, Qu, Yuan, Yang, Jingyuan, Lin, Shaochong, Shen, Zuo-Jun Max

Coordinating multiple large language models (LLMs) to solve complex tasks collaboratively poses a fundamental trade-off between computation costs and collective performance compared with an individual model. We introduce a novel, game-theoretically grounded reinforcement learning (RL) framework, the Multi-Agent Cooperation Sequential Public Goods Game (MAC-SPGG), to systematically incentivize cooperation in multi-LLM ensembles. In MAC-SPGG, LLM agents move in sequence, observing predecessors' outputs and updating beliefs to condition their own contributions. By redesigning the public-goods reward, effortful contributions become the unique Subgame Perfect Nash Equilibrium (SPNE), which eliminates free-riding under traditional SPGG or PGG. Its sequential protocol replaces costly round-based information exchanges with a streamlined decision flow, cutting communication overhead while retaining strategic depth. We prove the existence and uniqueness of the SPNE under realistic parameters, and empirically show that MAC-SPGG-trained ensembles outperform single-agent baselines, chain-of-thought prompting, and other cooperative methods, even achieving comparable performance to large-scale models across reasoning, math, code generation, and NLP tasks. Our results highlight the power of structured, incentive-aligned MAC-SPGG cooperation for scalable and robust multi-agent language generation.
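
To make the free-riding problem mentioned above concrete, here is a minimal sketch of the textbook linear public goods game, not the redesigned MAC-SPGG reward (which the abstract does not specify). Each player keeps whatever it does not contribute, and the pooled contributions are multiplied and split equally. The function name `pgg_payoffs` and the numbers are illustrative assumptions.

```python
from typing import List


def pgg_payoffs(contributions: List[float], endowment: float, multiplier: float) -> List[float]:
    """Payoffs in a standard linear public goods game.

    Each player keeps (endowment - contribution) and receives an equal share
    of the multiplied pool.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]


# Four players, endowment 10, pool multiplied by 2 (so 1 < multiplier < n).
everyone_contributes = pgg_payoffs([10, 10, 10, 10], endowment=10, multiplier=2)
one_free_rider = pgg_payoffs([0, 10, 10, 10], endowment=10, multiplier=2)
nobody_contributes = pgg_payoffs([0, 0, 0, 0], endowment=10, multiplier=2)

print(everyone_contributes)  # [20.0, 20.0, 20.0, 20.0]
print(one_free_rider)        # [25.0, 15.0, 15.0, 15.0]
print(nobody_contributes)    # [10.0, 10.0, 10.0, 10.0]
```

With a multiplier between 1 and the number of players, withholding raises an individual's payoff (25 versus 20) even though universal contribution beats universal withholding (20 versus 10). This is the incentive gap that a redesigned reward would need to close.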

artificial intelligence, large language model, natural language

2508.02076

Country:

Asia > China (0.46)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

A Group Consensus-Driven Auction Algorithm for Cooperative Task Allocation Among Heterogeneous Multi-Agents

Wang, Gang, Han, Hongfang, Liu, Xiaowei, Jiang, Hanfeng, Zhang, Ming

In scenarios like automated warehouses, assigning tasks to robots presents a heterogeneous multi-task and multi-agent task allocation problem. However, existing task allocation studies ignore the integration of multi-task and multi-attribute agent task allocation with heterogeneous task allocation. In addition, current algorithms are limited by scenario constraints and can incur significant errors in specific contexts. Therefore, this study proposes a distributed heterogeneous multi-task and multi-agent task allocation algorithm with a time window, called group consensus-based heterogeneous auction (GCBHA). First, this method decomposes tasks that exceed the capability of a single agent into subtasks that can be completed by multiple independent agents. It then groups similar or adjacent tasks through a heuristic clustering method to reduce the time required to reach a consensus. Subsequently, the task groups are allocated to agents that meet the conditions through an auction process. Furthermore, the method evaluates the task path cost distance based on the scenario, which allows the task cost to be calculated more accurately. The experimental results demonstrate that GCBHA performs well in terms of task allocation time and solution quality, with a significant reduction in the error rate between predicted task costs and actual costs.
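
The auction step described above can be pictured with a generic sequential auction. This is only a sketch under simplifying assumptions and is not the GCBHA algorithm: it omits task decomposition, clustering, time windows, and the consensus phase. The function `auction_allocate`, the skill sets, and the cost table are all hypothetical.

```python
from typing import Dict, Set, Tuple


def auction_allocate(
    group_needs: Dict[str, Set[str]],
    agent_skills: Dict[str, Set[str]],
    cost: Dict[Tuple[str, str], float],
) -> Dict[str, str]:
    """Assign task groups to agents one at a time via lowest-bid auctions.

    An agent may bid on a group only if its skills cover the group's needs.
    A bid is the agent's accumulated load plus its cost for that group.
    """
    assignment: Dict[str, str] = {}
    load = {agent: 0.0 for agent in agent_skills}
    unassigned = set(group_needs)
    while unassigned:
        bids = [
            (load[agent] + cost[(agent, group)], agent, group)
            for group in sorted(unassigned)
            for agent in sorted(agent_skills)
            if group_needs[group] <= agent_skills[agent] and (agent, group) in cost
        ]
        if not bids:
            break  # the remaining groups have no eligible bidder
        bid, winner, group = min(bids)  # ties fall back to name order
        assignment[group] = winner
        load[winner] = bid
        unassigned.remove(group)
    return assignment


needs = {"g1": {"lift"}, "g2": {"lift", "scan"}, "g3": {"scan"}}
skills = {"r1": {"lift"}, "r2": {"lift", "scan"}}
costs = {("r1", "g1"): 2.0, ("r2", "g1"): 5.0, ("r2", "g2"): 4.0, ("r2", "g3"): 1.0}
print(auction_allocate(needs, skills, costs))
```

Adding the winner's accumulated load to later bids is one simple way to spread work across agents; a path-based cost such as the one the abstract mentions would replace the fixed cost table used here.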

agent, artificial intelligence, machine learning

2508.02015

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Distributed AI Agents for Cognitive Underwater Robot Autonomy

Buchholz, Markus, Carlucho, Ignacio, Grimaldi, Michele, Petillot, Yvan R.

Achieving robust cognitive autonomy in robots navigating complex, unpredictable environments remains a fundamental challenge in robotics. This paper presents Underwater Robot Self-Organizing Autonomy (UROSA), a groundbreaking architecture leveraging distributed Large Language Model AI agents integrated within the Robot Operating System 2 (ROS 2) framework to enable advanced cognitive capabilities in Autonomous Underwater Vehicles. UROSA decentralises cognition into specialised AI agents responsible for multimodal perception, adaptive reasoning, dynamic mission planning, and real-time decision-making. Central innovations include flexible agents dynamically adapting their roles, retrieval-augmented generation utilising vector databases for efficient knowledge management, reinforcement learning-driven behavioural optimisation, and autonomous on-the-fly ROS 2 node generation for runtime functional extensibility. Extensive empirical validation demonstrates UROSA's promising adaptability and reliability through realistic underwater missions in simulation and real-world deployments, showing significant advantages over traditional rule-based architectures in handling unforeseen scenarios, environmental uncertainties, and novel mission objectives. This work not only advances underwater autonomy but also establishes a scalable, safe, and versatile cognitive robotics framework capable of generalising to a diverse array of real-world applications.

large language model, machine learning, natural language

2507.23735

Country:

North America > United States (0.67)
Europe (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Collaborative Medical Triage under Uncertainty: A Multi-Agent Dynamic Matching Approach

Cheng, Hongyan, Yu, Chengzhang, Shi, Yanshu, Wang, Chiyue, Liu, Cong, Jin, Zhanpeng

The post-pandemic surge in healthcare demand, coupled with critical nursing shortages, has placed unprecedented pressure on medical triage systems, necessitating innovative AI-driven solutions. We present a multi-agent interactive intelligent system for medical triage that addresses three fundamental challenges in current AI-based triage systems: inadequate medical specialization leading to misclassification, heterogeneous department structures across healthcare institutions, and inefficient detail-oriented questioning that impedes rapid triage decisions. Our system employs three specialized agents--RecipientAgent, InquirerAgent, and DepartmentAgent--that collaborate through an Inquiry Guidance Mechanism and a Classification Guidance Mechanism to transform unstructured patient symptoms into accurate department recommendations. To ensure robust evaluation, we constructed a comprehensive Chinese medical triage dataset from "Ai Ai Yi Medical Network", comprising 3,360 real-world cases spanning 9 primary departments and 62 secondary departments. Experimental results demonstrate that our multi-agent system achieves 89.6% accuracy in primary department classification and 74.3% accuracy in secondary department classification after four rounds of patient interaction. The system's dynamic-matching-based guidance mechanisms enable efficient adaptation to diverse hospital configurations while maintaining high triage accuracy. We successfully developed this multi-agent triage system that not only adapts to organizational heterogeneity across healthcare institutions but also ensures clinically sound decision-making.

accuracy, artificial intelligence, machine learning

2507.22504

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

FJ-MM: The Friedkin-Johnsen Opinion Dynamics Model with Memory and Higher-Order Neighbors

Raineri, Roberta, Zino, Lorenzo, Proskurnikov, Anton

The Friedkin-Johnsen (FJ) model has been extensively explored and validated, spanning applications in social science, systems and control, game theory, and algorithmic research. In this paper, we introduce an advanced generalization of the FJ model, termed FJ-MM, which incorporates both memory effects and multi-hop (higher-order neighbor) influence. This formulation allows agents to naturally incorporate both current and previous opinions at each iteration stage. Our numerical results demonstrate that incorporating memory and multi-hop influence significantly reshapes the opinion landscape; for example, the final opinion profile can exhibit reduced polarization. We analyze the stability and equilibrium properties of the FJ-MM model, showing that these properties can be reduced to those of a comparison model--namely, the standard FJ model with a modified influence matrix. This reduction enables us to leverage established stability results from FJ dynamics. Additionally, we examine the convergence rate of the FJ-MM model and demonstrate that, as can be expected, the time lags introduced by memory and higher-order neighbor influences result in slower convergence.
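
For readers unfamiliar with the baseline, the sketch below iterates the standard FJ update, in which agent i's next opinion is lam[i] times the weighted average of its neighbors' current opinions plus (1 - lam[i]) times its own initial opinion. It does not implement the memory or multi-hop terms of FJ-MM, whose exact form the abstract does not give. The names `fj_step` and `fj_run` and the example values are illustrative assumptions.

```python
from typing import List


def fj_step(x: List[float], x0: List[float], W: List[List[float]], lam: List[float]) -> List[float]:
    """One standard Friedkin-Johnsen update.

    W is a row-stochastic influence matrix and lam[i] in [0, 1] is agent i's
    susceptibility to social influence; 1 - lam[i] anchors it to x0[i].
    """
    n = len(x)
    return [
        lam[i] * sum(W[i][j] * x[j] for j in range(n)) + (1.0 - lam[i]) * x0[i]
        for i in range(n)
    ]


def fj_run(x0: List[float], W: List[List[float]], lam: List[float], steps: int = 200) -> List[float]:
    """Iterate the update from the initial opinions for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        x = fj_step(x, x0, W, lam)
    return x


# Two agents with opposite initial opinions who weigh each other equally.
W = [[0.5, 0.5], [0.5, 0.5]]
lam = [0.5, 0.5]
print(fj_run([0.0, 1.0], W, lam))  # approaches [0.25, 0.75]
```

Because each agent stays partly anchored to its initial opinion, the opinions move closer together without reaching consensus, which is the persistent disagreement the FJ model is known for.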

artificial intelligence, fj model, social media

doi: 10.1016/j.ejcon.2025.101306

2504.06731

Country: Europe (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Communications > Social Media (0.69)

WebDS: An End-to-End Benchmark for Web-based Data Science

Hsu, Ethan, Yam, Hong Meng, Bouissou, Ines, John, Aaron Murali, Thota, Raj, Koe, Josh, Putta, Vivek Sarath, Dharesan, G K, Spangher, Alexander, Murty, Shikhar, Huang, Tenghao, Manning, Christopher D.

A large portion of real-world data science tasks are complex and require multi-hop web-based interactions: finding appropriate data available on the internet, synthesizing real-time data of various modalities from different locations, and producing summarized analyses. Existing web benchmarks often focus on simplistic interactions, such as form submissions or e-commerce transactions, and often do not demand the diverse tool-using capabilities required for web-based data science. Conversely, traditional data science benchmarks typically concentrate on static, often textually bound datasets and do not assess end-to-end workflows that encompass data acquisition, cleaning, analysis, and insight generation. In response, we introduce WebDS, the first end-to-end web-based data science benchmark. It comprises 870 web-based data science tasks across 29 diverse websites, from structured government data portals to unstructured news media, challenging agents to perform complex, multi-step operations requiring the use of tools and heterogeneous data formats that better reflect the realities of modern data analytics. Evaluations of current SOTA LLM agents indicate significant performance gaps in accomplishing these tasks. For instance, Browser Use, which accomplishes 80% of tasks on Web Voyager, successfully completes only 15% of tasks in WebDS, which our analysis suggests is due to new failure modes, such as poor information grounding, repetitive behavior, and shortcut-taking, that agents display when performing WebDS tasks. By providing a more robust and realistic testing ground, WebDS sets the stage for significant advances in the development of practically useful LLM-based data science.

benchmark, large language model, machine learning

2508.01222

Country: North America > United States > California (0.28)

Genre: Workflow (0.89)

Industry:

Health & Medicine (0.93)
Media > News (0.49)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

MIT Technology Review, Aug-4-2025, 15:00:13 GMT

These protocols will help AI agents navigate our messy lives

MCP attempts to standardize how AI agents interact with the world via various programs, and it's already very popular. One web aggregator for MCP servers (essentially, the portals for different programs or tools that agents can access) lists over 15,000 servers already. Working out how to govern how AI agents interact with each other is arguably an even steeper challenge, and it's one the Agent2Agent protocol (A2A), introduced by Google in April, tries to take on. Whereas MCP translates requests between words and code, A2A tries to moderate exchanges between agents, which is an "essential next step for the industry to move beyond single-purpose agents," Rao Surapaneni, who works with A2A at Google Cloud, wrote in an email to MIT Technology Review. Google says 150 companies have already partnered with it to develop and adopt A2A, including Adobe and Salesforce.

artificial intelligence, help ai agent navigate, protocol

MIT Technology Review

Industry: Information Technology > Services (0.99)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)