AITopics | agent system

Can Agents Fix Agent Issues?

Neural Information Processing Systems | Jun-16-2026, 22:51:26 GMT

LLM-based agent systems are emerging as a new software paradigm and have been widely adopted across diverse domains such as medicine, robotics, and programming. However, maintaining these systems requires substantial effort, as they are inevitably prone to bugs and continually evolve to meet changing external requirements. Therefore, automatically resolving agent issues (i.e., bug reports or feature requests) is a crucial and challenging task. While recent software engineering (SE) agents (e.g., SWE-agent) have shown promise in addressing issues in traditional software systems, it remains unclear how effectively they can resolve real-world issues in agent systems, which differ significantly from traditional software. To fill this gap, we first manually analyze 201 real-world agent issues and identify common categories of agent issues. We then spend 500 person-hours constructing AGENTISSUE-BENCH, a reproducible benchmark comprising 50 agent issue resolution tasks (each with an executable environment and failure-triggering tests). We further evaluate state-of-the-art SE agents on AGENTISSUE-BENCH and reveal their limited effectiveness (i.e., with only 0.67% - 4.67% resolution rates). These results underscore the unique challenges of maintaining agent systems compared to traditional software, highlighting the need for further research to develop advanced SE agents for resolving agent issues.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
North America > United States (0.46)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Can Agents Fix Agent Issues?

Neural Information Processing Systems | Jun-12-2026, 00:02:35 GMT

LLM-based agent systems are emerging as a new software paradigm and have been widely adopted across diverse domains such as medicine, robotics, and programming. However, maintaining these systems requires substantial effort, as they are inevitably prone to bugs and continually evolve to meet changing external requirements. Therefore, automatically resolving agent issues (i.e., bug reports or feature requests) is a crucial and challenging task. While recent software engineering (SE) agents (e.g., SWE-agent) have shown promise in addressing issues in traditional software systems, it remains unclear how effectively they can resolve real-world issues in agent systems, which differ significantly from traditional software. To fill this gap, we first manually analyze 201 real-world agent issues and identify common categories of agent issues. We then spend 500 person-hours constructing AgentIssue-Bench, a reproducible benchmark comprising 50 agent issue resolution tasks (each with an executable environment and failure-triggering tests). We further evaluate state-of-the-art SE agents on AgentIssue-Bench and reveal their limited effectiveness (i.e., with only 0.67% - 4.67% resolution rates). These results underscore the unique challenges of maintaining agent systems compared to traditional software, highlighting the need for further research to develop advanced SE agents for resolving agent issues.

agent issue, artificial intelligence, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (1.00)

GAIR: GUI Automation via Information-Joint Reasoning and Group Reflection

Wei, Zishu, Ma, Qixiang, Hu, Xavier, Liu, Yuhang, Zang, Hui, Zhao, Yudong, Wang, Tao, Zhang, Shengyu, Wu, Fei

arXiv.org Artificial Intelligence | Dec-11-2025

Building AI systems for GUI automation tasks has attracted remarkable research effort, with MLLMs leveraged to process user requirements and produce operations. However, GUI automation covers a wide range of tasks, from document processing to online shopping, from CAD to video editing. The diversity among these tasks requires MLLMs for GUI automation to have heterogeneous capabilities and to master multidimensional expertise, which makes constructing such a model difficult. To address this challenge, we propose GAIR: GUI Automation via Information-Joint Reasoning and Group Reflection, a novel MLLM-based GUI automation agent framework designed to integrate knowledge and combine capabilities from heterogeneous models in order to build GUI automation agent systems with higher performance. Since different GUI-specific MLLMs are trained on different datasets and thus have different strengths, GAIR introduces a general-purpose MLLM that jointly processes the information from multiple GUI-specific models, further enhancing the performance of the agent framework. The general-purpose MLLM also serves as the decision maker, trying to execute a reasonable operation based on previously gathered information. When the general-purpose model judges that there is not sufficient information for a reasonable decision, GAIR transitions into a group reflection state, in which the general-purpose model provides the GUI-specific models with different instructions and hints based on their strengths and weaknesses, driving them to gather more significant and accurate information that can support deeper reasoning and decision making. We evaluated the effectiveness and reliability of GAIR through extensive experiments on GUI benchmarks.
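The control flow described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `gair_step`, the callable interfaces for the GUI-specific models and the general-purpose model, and the `max_reflections` limit are all assumptions made for the example.

```python
def gair_step(screen, specialists, generalist, max_reflections=2):
    """One decision step: joint reasoning first, group reflection if needed.

    specialists: dict name -> callable(screen, hint) returning a text report
                 (stand-ins for GUI-specific MLLMs; hypothetical interface).
    generalist:  callable(screen, reports) returning a dict with keys
                 "action" (or None) and "hints" (per-specialist hints);
                 a stand-in for the general-purpose MLLM.
    """
    hints = {name: None for name in specialists}
    for _ in range(max_reflections + 1):
        # Gather information from every GUI-specific model.
        reports = {name: model(screen, hints[name])
                   for name, model in specialists.items()}
        # The general-purpose model fuses the reports and tries to decide.
        decision = generalist(screen, reports)
        if decision["action"] is not None:
            return decision["action"]
        # Not enough information: enter group reflection with targeted hints.
        hints = {**hints, **decision.get("hints", {})}
    return None  # no confident action within the reflection budget
```

Returning `None` when the budget runs out is a design choice of this sketch; the abstract does not say how the framework behaves in that case.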

artificial intelligence, information fusion, natural language, (15 more...)

arXiv.org Artificial Intelligence

2512.09396

Country:

Asia (0.93)
North America > United States (0.69)
Europe > Austria > Vienna (0.15)

Genre: Research Report (0.85)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.52)

Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks

Kar, Indrajit, Kumar, Kalathur Chenchu Kishore

arXiv.org Artificial Intelligence | Dec-10-2025

Large Language Models and multi-agent systems have shown promise in decomposing complex tasks, yet they struggle with long-horizon reasoning tasks and escalating computation cost. This work introduces a hierarchical multi-agent architecture that distributes reasoning across a 64*64 grid of lightweight agents, supported by a selective oracle. A spatial curriculum progressively expands the operational region of the grid, ensuring that agents master easier central tasks before tackling harder peripheral ones. To improve reliability, the system integrates Negative Log-Likelihood as a measure of confidence, allowing the curriculum to prioritize regions where agents are both accurate and well calibrated. A Thompson Sampling curriculum manager adaptively chooses training zones based on competence and NLL-driven reward signals. We evaluate the approach on a spatially grounded Tower of Hanoi benchmark, which mirrors the long-horizon structure of many robotic manipulation and planning tasks. Results demonstrate improved stability, reduced oracle usage, and stronger long-range reasoning from distributed agent cooperation.
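As a rough illustration of how a Thompson Sampling curriculum manager might choose training zones, the sketch below keeps one Beta posterior per zone and samples from it. The class name, the zone labels, and the `nll_reward` shaping (including the `nll_cap` value) are assumptions made for this example; the abstract does not specify the paper's reward or zone definitions.

```python
import random


class ThompsonCurriculum:
    """Beta-Bernoulli Thompson Sampling over training zones (illustrative)."""

    def __init__(self, zones, seed=0):
        self.rng = random.Random(seed)
        # One Beta(alpha, beta) posterior per zone, starting from a uniform prior.
        self.posterior = {z: [1.0, 1.0] for z in zones}

    def choose_zone(self):
        # Sample a plausible success rate per zone and pick the largest draw.
        draws = {z: self.rng.betavariate(a, b)
                 for z, (a, b) in self.posterior.items()}
        return max(draws, key=draws.get)

    def update(self, zone, reward):
        # reward lies in [0, 1]; fractional rewards update both pseudo-counts.
        a, b = self.posterior[zone]
        self.posterior[zone] = [a + reward, b + (1.0 - reward)]


def nll_reward(correct, nll, nll_cap=5.0):
    """Hypothetical reward: high only when the answer is correct AND confident.

    A correct answer with NLL 0 earns 1.0; the reward falls linearly to 0.0
    as the NLL approaches nll_cap. Incorrect answers earn 0.0.
    """
    if not correct:
        return 0.0
    return max(0.0, 1.0 - min(nll, nll_cap) / nll_cap)
```

In use, the manager would call `choose_zone()`, run the agents in that zone, and feed `nll_reward(...)` back through `update(...)`, so zones where agents are both accurate and well calibrated are sampled more often.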

agent, artificial intelligence, curriculum, (17 more...)

arXiv.org Artificial Intelligence

2512.08545

Country: Asia > Vietnam > Hanoi > Hanoi (0.26)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Can Agents Fix Agent Issues?

Rahardja, Alfin Wijaya, Liu, Junwei, Chen, Weitong, Chen, Zhenpeng, Lou, Yiling

arXiv.org Artificial Intelligence | Oct-27-2025

LLM-based agent systems are emerging as a new software paradigm and have been widely adopted across diverse domains such as medicine, robotics, and programming. However, maintaining these systems requires substantial effort, as they are inevitably prone to bugs and continually evolve to meet changing external requirements. Therefore, automatically resolving agent issues (i.e., bug reports or feature requests) is a crucial and challenging task. While recent software engineering (SE) agents (e.g., SWE-agent) have shown promise in addressing issues in traditional software systems, it remains unclear how effectively they can resolve real-world issues in agent systems, which differ significantly from traditional software. To fill this gap, we first manually analyze 201 real-world agent issues and identify common categories of agent issues. We then spend 500 person-hours constructing AgentIssue-Bench, a reproducible benchmark comprising 50 agent issue resolution tasks (each with an executable environment and failure-triggering tests). We further evaluate state-of-the-art SE agents on AgentIssue-Bench and reveal their limited effectiveness (i.e., with only 0.67% - 4.67% resolution rates). These results underscore the unique challenges of maintaining agent systems compared to traditional software, highlighting the need for further research to develop advanced SE agents for resolving agent issues. Data and code are available at https://github.com/alfin06/AgentIssue-Bench.
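A note on the reported rates: with 50 tasks, a single run can only yield multiples of 2%, so figures such as 0.67% and 4.67% suggest averaging over repeated attempts. The arithmetic below shows one reading that reproduces both numbers. The value `RUNS = 3` is an assumption for illustration; the abstract does not state the number of runs.

```python
from fractions import Fraction

TASKS = 50  # benchmark size stated in the abstract
RUNS = 3    # ASSUMPTION: repeated runs per task; not stated in the abstract


def resolution_rate(resolved, tasks=TASKS, runs=RUNS):
    """Percentage of task attempts resolved, rounded to two decimals."""
    return round(float(Fraction(resolved, tasks * runs) * 100), 2)


low = resolution_rate(1)    # 1 resolved attempt out of 150
high = resolution_rate(7)   # 7 resolved attempts out of 150
print(low, high)
```

Under this assumption the range corresponds to between 1 and 7 successful attempts out of 150, which gives a sense of how few issues the evaluated agents resolved.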

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.20749

Country:

Asia (1.00)
North America > United States (0.28)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

AUGUSTUS: An LLM-Driven Multimodal Agent System with Contextualized User Memory

Jain, Jitesh, Maheshwari, Shubham, Yu, Ning, Hwu, Wen-mei, Shi, Humphrey

arXiv.org Artificial Intelligence | Oct-20-2025

Riding on the success of LLMs with retrieval-augmented generation (RAG), there has been a growing interest in augmenting agent systems with external memory databases. However, the existing systems focus on storing text information in their memory, ignoring the importance of multimodal signals. Motivated by the multimodal nature of human memory, we present AUGUSTUS, a multimodal agent system aligned with the ideas of human memory in cognitive science. Technically, our system consists of 4 stages connected in a loop: (i) encode: understanding the inputs; (ii) store in memory: saving important information; (iii) retrieve: searching for relevant context from memory; and (iv) act: perform the task. Unlike existing systems that use vector databases, we propose conceptualizing information into semantic tags and associating the tags with their context to store them in a graph-structured multimodal contextual memory for efficient concept-driven retrieval. Our system outperforms the traditional multimodal RAG approach while being 3.5 times faster for ImageNet classification and outperforming MemGPT on the MSC benchmark.
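The four-stage loop and the tag-based memory can be pictured with a toy, text-only sketch. Everything here is an assumption made for illustration: the `TagGraphMemory` class, the word-length heuristic in `encode`, and the overlap-count ranking are placeholders, not the AUGUSTUS implementation, which is multimodal and model-driven.

```python
from collections import defaultdict


class TagGraphMemory:
    """Toy contextual memory: semantic tags link to the contexts they came from."""

    def __init__(self):
        self.contexts = []                        # stored context records
        self.tag_to_contexts = defaultdict(set)   # tag -> indices into contexts

    def store(self, tags, context):
        idx = len(self.contexts)
        self.contexts.append(context)
        for tag in tags:
            self.tag_to_contexts[tag].add(idx)
        return idx

    def retrieve(self, query_tags, top_k=3, exclude=None):
        # Rank contexts by how many query tags point at them.
        scores = defaultdict(int)
        for tag in query_tags:
            for idx in self.tag_to_contexts.get(tag, ()):
                if idx != exclude:
                    scores[idx] += 1
        ranked = sorted(scores, key=lambda i: (-scores[i], i))
        return [self.contexts[i] for i in ranked[:top_k]]


def encode(text):
    """Stand-in for the encode stage: keep lowercase words longer than three letters."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def step(memory, user_input):
    tags = encode(user_input)                          # (i) encode
    idx = memory.store(tags, user_input)               # (ii) store in memory
    recalled = memory.retrieve(tags, exclude=idx)      # (iii) retrieve
    return {"input": user_input, "recalled": recalled}  # (iv) act (placeholder)
```

Looking up contexts through shared tags, rather than through nearest-neighbour search over embeddings, is the concept-driven retrieval idea the abstract contrasts with vector databases.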

information, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.15261

Country: North America > United States (0.68)

Genre:

Research Report (0.50)
Workflow (0.46)

Industry:

Health & Medicine (0.70)
Government > Regional Government > North America Government > United States Government (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

8a75ee6d4b2eb0b777f549a32a5a5c28-Supplemental-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems | Oct-10-2025, 08:51:41 GMT

final answer, query, rtx 4070, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.67)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.67)

Industry:

Consumer Products & Services (0.67)
Government > Regional Government > North America Government > United States Government (0.46)
Transportation > Air (0.46)
Food & Agriculture > Agriculture (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Communications (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

A Study on the MCP x A2A Framework for Enhancing Interoperability of LLM-based Autonomous Agents

Jeong, Cheonsu

arXiv.org Artificial Intelligence | Oct-3-2025

This paper provides an in-depth technical analysis and implementation methodology of the open-source Agent-to-Agent (A2A) protocol developed by Google and the Model Context Protocol (MCP) introduced by Anthropic. While the evolution of LLM-based autonomous agents is rapidly accelerating, efficient interactions among these agents and their integration with external systems remain significant challenges. In modern AI systems, collaboration between autonomous agents and integration with external tools have become essential elements for building practical AI applications. A2A offers a standardized communication method that enables agents developed in heterogeneous environments to collaborate effectively, while MCP provides a structured I/O framework for agents to connect with external tools and resources. Prior studies have focused primarily on the features and applications of either A2A or MCP individually. In contrast, this study takes an integrated approach, exploring how the two protocols can complement each other to address interoperability issues and facilitate efficient collaboration within complex agent ecosystems.
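To make the "structured I/O" side concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and an arguments object. The sketch below only builds such a request. The helper `mcp_tool_call`, the tool name `get_exchange_rate`, and its arguments are hypothetical and do not come from the paper.

```python
import json
from itertools import count

_ids = count(1)  # simple monotonically increasing request ids


def mcp_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and arguments, for illustration only.
request = mcp_tool_call("get_exchange_rate", {"base": "USD", "quote": "KRW"})
wire = json.dumps(request)  # what would be sent over the transport
```

In an integrated setup like the one the paper studies, an agent reached through A2A could issue requests of this shape to its own MCP servers while handling a task delegated by another agent.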

agent, artificial intelligence, natural language, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.13088/jiis.2025.31.3.141

2506.01804

Genre:

Research Report (1.00)
Workflow (0.69)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Improving the Efficiency of LLM Agent Systems through Trajectory Reduction

Xiao, Yuan-An, Gao, Pengfei, Peng, Chao, Xiong, Yingfei

arXiv.org Artificial Intelligence | Sep-30-2025

Multi-turn agent systems based on Large Language Models (LLMs) have become increasingly popular for software engineering tasks. While LLM agents show decent effectiveness, the high computational cost of input tokens due to the ever-growing trajectory remains an efficiency concern for their applications. Efficiency is largely neglected in existing studies and agent products, and this paper fills the gap by introducing an inference-time trajectory reduction approach to reduce the cost of agents. By analyzing existing agent trajectories, we demonstrate that useless, redundant, and expired information is widespread in all trajectories and can be identified and reduced without harming the agent's performance. We then design a simple yet effective trajectory reduction approach, AgentDiet, which automatically removes such waste information. We implement AgentDiet on a top-performing coding agent, and the evaluation on two LLMs and two benchmarks shows that AgentDiet can reduce input tokens by 39.9% ~ 59.7%, or the final computational cost by 21.1% ~ 35.9%, while maintaining the same agent performance. This indicates that trajectory reduction is a promising direction for agent systems.
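The notion of "expired" information can be illustrated with a deliberately simple heuristic: if the same action on the same target is repeated later in the trajectory, the earlier observation is replaced by a short placeholder. This is only a toy stand-in and not AgentDiet's mechanism, which the abstract does not detail. The step schema (`action`, `target`, `observation`) and the whitespace token count are assumptions made for the example.

```python
PLACEHOLDER = "[superseded by a later step]"


def reduce_trajectory(steps):
    """Replace observations that a later identical (action, target) step supersedes.

    steps: list of dicts with keys "action", "target", "observation".
    Returns a new list; the input list and its dicts are left unmodified.
    """
    # Index of the last occurrence of each (action, target) pair.
    last_seen = {}
    for i, s in enumerate(steps):
        last_seen[(s["action"], s["target"])] = i

    reduced = []
    for i, s in enumerate(steps):
        if last_seen[(s["action"], s["target"])] != i:
            # An identical later step exists, so this observation is stale.
            reduced.append({**s, "observation": PLACEHOLDER})
        else:
            reduced.append(s)
    return reduced


def token_count(steps):
    """Crude proxy for input tokens: whitespace-separated words in observations."""
    return sum(len(s["observation"].split()) for s in steps)
```

Comparing `token_count` before and after reduction gives a rough feel for the kind of input-token savings the paper measures, though real savings depend on the tokenizer and on what the agent actually needs later.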

large language model, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2509.23586

Country: Asia (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Shapes of Cognition for Computational Cognitive Modeling

McShane, Marjorie, Nirenburg, Sergei, Oruganti, Sanjay, English, Jesse

arXiv.org Artificial Intelligence | Sep-17-2025

Shapes of cognition is a new conceptual paradigm for the computational cognitive modeling of Language-Endowed Intelligent Agents (LEIAs). Shapes are remembered constellations of sensory, linguistic, conceptual, episodic, and procedural knowledge that allow agents to cut through the complexity of real life the same way as people do: by expecting things to be typical, recognizing patterns, acting by habit, reasoning by analogy, satisficing, and generally minimizing cognitive load to the degree situations permit. Atypical outcomes are treated using shapes-based recovery methods, such as learning on the fly, asking a human partner for help, or seeking an actionable, even if imperfect, situational understanding. Although shapes is an umbrella term, it is not vague: shapes-based modeling involves particular objectives, hypotheses, modeling strategies, knowledge bases, and actual models of wide-ranging phenomena, all implemented within a particular cognitive architecture. Such specificity is needed both to vet our hypotheses and to achieve our practical aims of building useful agent systems that are explainable, extensible, and worthy of our trust, even in critical domains. However, although the LEIA example of shapes-based modeling is specific, the principles can be applied more broadly, giving new life to knowledge-based and hybrid AI.

artificial intelligence, expert system, simulation of human behavior, (20 more...)

arXiv.org Artificial Intelligence

2509.13288

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.87)
(2 more...)
