AITopics

We have seen remarkable progress in large language models (LLMs) empowered multi-agent systems solving complex tasks necessitating cooperation among experts with diverse skills. However, optimizing LLM-based multi-agent systems remains challenging. In this work, we perform an empirical case study on group optimization of role-based multi-agent systems utilizing natural language feedback for challenging software development tasks under various evaluation dimensions. We propose a two-step agent prompts optimization pipeline: identifying underperforming agents with their failure explanations utilizing textual feedback and then optimizing system prompts of identified agents utilizing failure explanations. We then study the impact of various optimization settings on system performance with two comparison groups: online against offline optimization and individual against group optimization. For group optimization, we study two prompting strategies: one-pass and multi-pass prompting optimizations. Overall, we demonstrate the effectiveness of our optimization method for role-based multi-agent systems tackling software development tasks evaluated on diverse evaluation dimensions, and we investigate the impact of diverse optimization settings on group behaviors of the multi-agent systems to provide practical insights for future development.

agent, artificial intelligence, optimization, (15 more...)

2505.16086

Country:

North America > Mexico (0.28)
Asia > Middle East (0.28)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)

Presenza, J. Francisco, Mas, Ignacio, Alvarez-Hamelin, J. Ignacio, Giribet, Juan I.

Subframework-based Bearing Rigidity Maintenance Control in Multirobot Networks

This work presents a novel approach for \textit{bearing rigidity} analysis and control in multi-robot networks with sensing constraints and dynamic topology. By decomposing the system's framework into \textit{subframeworks}, we express bearing rigidity -- a global property -- as a set of \textit{local} properties, with rigidity eigenvalues serving as natural \textit{local rigidity measures}. We propose a decentralized gradient-based controller to execute mission-specific commands using only bearing measurements. The controller preserves bearing rigidity by keeping the rigidity eigenvalues above a threshold, using only information exchanged within subframeworks. Simulations evaluate the scheme's effectiveness, underscoring its scalability and practicality.

artificial intelligence, rigidity, subframework, (14 more...)

doi: 10.1109/LCSYS.2025.3580768

2504.17103

Country: South America > Argentina (0.15)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

A Novel Architecture for Symbolic Reasoning with Decision Trees and LLM Agents

Kiruluta, Andrew

We propose a hybrid architecture that integrates decision tree-based symbolic reasoning with the generative capabilities of large language models (LLMs) within a coordinated multi-agent framework. Unlike prior approaches that loosely couple symbolic and neural modules, our design embeds decision trees and random forests as callable oracles within a unified reasoning system. Tree-based modules enable interpretable rule inference and causal logic, while LLM agents handle abductive reasoning, generalization, and interactive planning. A central orchestrator maintains belief state consistency and mediates communication across agents and external tools, enabling reasoning over both structured and unstructured inputs. The system achieves strong performance on reasoning benchmarks. On \textit{ProofWriter}, it improves entailment consistency by +7.2\% through logic-grounded tree validation. On GSM8k, it achieves +5.3\% accuracy gains in multistep mathematical problems via symbolic augmentation. On \textit{ARC}, it boosts abstraction accuracy by +6.0\% through integration of symbolic oracles. Applications in clinical decision support and scientific discovery show how the system encodes domain rules symbolically while leveraging LLMs for contextual inference and hypothesis generation. This architecture offers a robust, interpretable, and extensible solution for general-purpose neuro-symbolic reasoning.

large language model, machine learning, natural language, (14 more...)

2508.05311

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Degen, Peer-Benedikt, Asanov, Igor

Beyond Automation: Socratic AI, Epistemic Agency, and the Implications of the Emergence of Orchestrated Multi-Agent Learning Architectures

Generative AI is no longer a peripheral tool in higher education. It is rapidly evolving into a general-purpose infrastructure that reshapes how knowledge is generated, mediated, and validated. This paper presents findings from a controlled experiment evaluating a Socratic AI Tutor, a large language model designed to scaffold student research question development through structured dialogue grounded in constructivist theory. Conducted with 65 pre-service teacher students in Germany, the study compares interaction with the Socratic Tutor to engagement with an uninstructed AI chatbot. Students using the Socratic Tutor reported significantly greater support for critical, independent, and reflective thinking, suggesting that dialogic AI can stimulate metacognitive engagement and challenging recent narratives of de-skilling due to generative AI usage. These findings serve as a proof of concept for a broader pedagogical shift: the use of multi-agent systems (MAS) composed of specialised AI agents. To conceptualise this, we introduce the notion of orchestrated MAS, modular, pedagogically aligned agent constellations, curated by educators, that support diverse learning trajectories through differentiated roles and coordinated interaction. To anchor this shift, we propose an adapted offer-and-use model, in which students appropriate instructional offers from these agents. Beyond technical feasibility, we examine system-level implications for higher education institutions and students, including funding necessities, changes to faculty roles, curriculars, competencies and assessment practices. We conclude with a comparative cost-effectiveness analysis highlighting the scalability of such systems. In sum, this study contributes both empirical evidence and a conceptual roadmap for hybrid learning ecosystems that embed human-AI co-agency and pedagogical alignment.

artificial intelligence, machine learning, natural language, (18 more...)

2508.05116

Country:

North America > United States (0.28)
Europe > Germany (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.35)

Industry:

Education > Educational Setting > Higher Education (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.87)

JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering

Chen, Renmiao, Cui, Shiyao, Huang, Xuancheng, Pan, Chengwei, Huang, Victor Shea-Jay, Zhang, QingLin, Ouyang, Xuan, Zhang, Zhexin, Wang, Hongning, Huang, Minlie

Jailbreak attacks against multimodal large language Models (MLLMs) are a significant research focus. Current research predominantly focuses on maximizing attack success rate (ASR), often overlooking whether the generated responses actually fulfill the attacker's malicious intent. This oversight frequently leads to low-quality outputs that bypass safety filters but lack substantial harmful content. To address this gap, we propose JPS, \underline{J}ailbreak MLLMs with collaborative visual \underline{P}erturbation and textual \underline{S}teering, which achieves jailbreaks via corporation of visual image and textually steering prompt. Specifically, JPS utilizes target-guided adversarial image perturbations for effective safety bypass, complemented by "steering prompt" optimized via a multi-agent system to specifically guide LLM responses fulfilling the attackers' intent. These visual and textual components undergo iterative co-optimization for enhanced performance. To evaluate the quality of attack outcomes, we propose the Malicious Intent Fulfillment Rate (MIFR) metric, assessed using a Reasoning-LLM-based evaluator. Our experiments show JPS sets a new state-of-the-art in both ASR and MIFR across various MLLMs and benchmarks, with analyses confirming its efficacy. Codes are available at \href{https://github.com/thu-coai/JPS}{https://github.com/thu-coai/JPS}. \color{warningcolor}{Warning: This paper contains potentially sensitive contents.}

artificial intelligence, large language model, natural language, (18 more...)

doi: 10.1145/3746027.3754561

2508.05087

Country:

Europe > Austria (0.28)
North America > Mexico (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Meta-RAG on Large Codebases Using Code Summarization

Tawosi, Vali, Alamir, Salwa, Liu, Xiaomo, Veloso, Manuela

Large Language Model (LLM) systems have been at the forefront of applied Artificial Intelligence (AI) research in a multitude of domains. One such domain is software development, where researchers have pushed the automation of a number of code tasks through LLM agents. Software development is a complex ecosystem, that stretches far beyond code implementation and well into the realm of code maintenance. In this paper, we propose a multi-agent system to localize bugs in large pre-existing codebases using information retrieval and LLMs. Our system introduces a novel Retrieval Augmented Generation (RAG) approach, Meta-RAG, where we utilize summaries to condense codebases by an average of 79.8\%, into a compact, structured, natural language representation. We then use an LLM agent to determine which parts of the codebase are critical for bug resolution, i.e. bug localization. We demonstrate the usefulness of Meta-RAG through evaluation with the SWE-bench Lite dataset. Meta-RAG scores 84.67 % and 53.0 % for file-level and function-level correct localization rates, respectively, achieving state-of-the-art performance.

large language model, machine learning, natural language, (17 more...)

2508.02611

Genre: Research Report (0.64)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningAug-7-2025

Streaming Generated Gaussian Process Experts for Online Learning and Control

Yang, Zewen, Zhang, Dongfa, Dai, Xiaobing, Yu, Fengyi, Zhang, Chi, Huang, Bingkun, Sadeghian, Hamid, Haddadin, Sami

Gaussian Processes (GPs), as a nonparametric learning method, offer flexible modeling capabilities and calibrated uncertainty quantification for function approximations. Additionally, GPs support online learning by efficiently incorporating new data with polynomial-time computation, making them well-suited for safety-critical dynamical systems that require rapid adaptation. However, the inference and online updates of exact GPs, when processing streaming data, incur cubic computation time and quadratic storage memory complexity, limiting their scalability to large datasets in real-time settings. In this paper, we propose a streaming kernel-induced progressively generated expert framework of Gaussian processes (SkyGP) that addresses both computational and memory constraints by maintaining a bounded set of experts, while inheriting the learning performance guarantees from exact Gaussian processes. Furthermore, two SkyGP variants are introduced, each tailored to a specific objective, either maximizing prediction accuracy (SkyGP-Dense) or improving computational efficiency (SkyGP-Fast). The effectiveness of SkyGP is validated through extensive benchmarks and real-time control experiments demonstrating its superior performance compared to state-of-the-art approaches.

artificial intelligence, machine learning, modeling & simulation, (15 more...)

arXiv.org Machine Learning

2508.03679

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceAug-7-2025

HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and Decision in Embodied Agents

Liu, Yibin, Liang, Zhixuan, Chen, Zanxin, Chen, Tianxing, Hu, Mengkang, Dong, Wanxi, Xu, Congsheng, Han, Zhaoming, Qin, Yusen, Mu, Yao

Recent advances in multimodal large language models (MLLMs) have enabled richer perceptual grounding for code policy generation in embodied agents. However, most existing systems lack effective mechanisms to adaptively monitor policy execution and repair codes during task completion. In this work, we introduce HyCodePolicy, a hybrid language-based control framework that systematically integrates code synthesis, geometric grounding, perceptual monitoring, and iterative repair into a closed-loop programming cycle for embodied agents. Technically, given a natural language instruction, our system first decomposes it into subgoals and generates an initial executable program grounded in object-centric geometric primitives. The program is then executed in simulation, while a vision-language model (VLM) observes selected checkpoints to detect and localize execution failures and infer failure reasons. By fusing structured execution traces capturing program-level events with VLM-based perceptual feedback, HyCodePolicy infers failure causes and repairs programs. This hybrid dual feedback mechanism enables self-correcting program synthesis with minimal human supervision. Our results demonstrate that HyCodePolicy significantly improves the robustness and sample efficiency of robot manipulation policies, offering a scalable strategy for integrating multimodal reasoning into autonomous decision-making pipelines.

artificial intelligence, large language model, natural language, (16 more...)

2508.02629

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.66)

arXiv.org Artificial IntelligenceAug-7-2025

Finance as Extended Biology: Reciprocity as the Cognitive Substrate of Financial Behavior

Diau, Egil

A central challenge in economics and artificial intelligence is explaining how financial behaviors-such as credit, insurance, and trade-emerge without formal institutions. We argue that these functions are not products of institutional design, but structured extensions of a single behavioral substrate: reciprocity. Far from being a derived strategy, reciprocity served as the foundational logic of early human societies-governing the circulation of goods, regulation of obligation, and maintenance of long-term cooperation well before markets, money, or formal rules. Trade, commonly regarded as the origin of financial systems, is reframed here as the canonical form of reciprocity: simultaneous, symmetric, and partner-contingent. Building on this logic, we reconstruct four core financial functions-credit, insurance, token exchange, and investment-as expressions of the same underlying principle under varying conditions. By grounding financial behavior in minimal, simulateable dynamics of reciprocal interaction, this framework shifts the focus from institutional engineering to behavioral computation-offering a new foundation for modeling decentralized financial behavior in both human and artificial agents.

artificial intelligence, machine learning, reciprocity, (18 more...)

2506.00099

Country: Asia > Taiwan (0.14)

Genre: Research Report (0.83)

Industry: Banking & Finance > Insurance (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

arXiv.org Artificial IntelligenceAug-7-2025

Reciprocity as the Foundational Substrate of Society: How Reciprocal Dynamics Scale into Social Systems

Diau, Egil

Prevailing accounts in both multi-agent AI and the social sciences explain social structure through top-down abstractions-such as institutions, norms, or trust-yet lack simulateable models of how such structures emerge from individual behavior. Ethnographic and archaeological evidence suggests that reciprocity served as the foundational mechanism of early human societies, enabling economic circulation, social cohesion, and interpersonal obligation long before the rise of formal institutions. Modern financial systems such as credit and currency can likewise be viewed as scalable extensions of reciprocity, formalizing exchange across time and anonymity. Building on this insight, we argue that reciprocity is not merely a local or primitive exchange heuristic, but the scalable substrate from which large-scale social structures can emerge. We propose a three-stage framework to model this emergence: reciprocal dynamics at the individual level, norm stabilization through shared expectations, and the construction of durable institutional patterns. This approach offers a cognitively minimal, behaviorally grounded foundation for simulating how large-scale social systems can emerge from decentralized reciprocal interaction.

artificial intelligence, interaction, machine learning, (17 more...)

2505.08319

Country: Asia > Taiwan (0.14)

Genre: Research Report (0.64)

Industry:

Law (0.46)
Banking & Finance (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)