AITopics

Language-driven grasp detection has the potential to revolutionize human-robot interaction by allowing robots to understand and execute grasping tasks based on natural language commands. However, existing approaches face two key challenges. First, they often struggle to interpret complex text instructions or operate ineffectively in densely cluttered environments. Second, most methods require a training or finetuning step to adapt to new domains, limiting their generalization in real-world applications. In this paper, we introduce GraspMAS, a new multi-agent system framework for language-driven grasp detection. GraspMAS is designed to reason through ambiguities and improve decision-making in real-world scenarios. Our framework consists of three specialized agents: Planner, responsible for strategizing complex queries; Coder, which generates and executes source code; and Observer, which evaluates the outcomes and provides feedback. Extensive experiments on two large-scale datasets demonstrate that GraspMAS significantly outperforms existing baselines. Additionally, robot experiments conducted in both simulation and real-world settings further validate the effectiveness of our approach. Our project page is available at https://zquang2202.github.io/GraspMAS
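The Planner, Coder, and Observer roles described in the abstract suggest a feedback loop. The following is only a rough sketch of such a loop under stated assumptions: every function here is a hypothetical stub (no language model or vision model is called), and the grasp tuple is a placeholder value, not a result from the paper.

```python
def planner(query, feedback=None):
    """Hypothetical stub: turn a query (plus optional feedback) into a plan string."""
    plan = f"locate target for: {query}"
    if feedback:
        plan += f" | revise using: {feedback}"
    return plan


def coder(plan):
    """Hypothetical stub standing in for code generation and execution.

    To exercise the loop, it only "succeeds" once the plan has been revised.
    """
    succeeded = "revise using" in plan
    # Placeholder grasp rectangle (x, y, width, height, angle); not real data.
    grasp = (120, 80, 40, 20, 0.0) if succeeded else None
    return {"plan": plan, "grasp": grasp}


def observer(result):
    """Hypothetical stub: accept the result or return feedback for the planner."""
    if result["grasp"] is None:
        return False, "no grasp found; narrow the search region"
    return True, "ok"


def run(query, max_rounds=3):
    """Iterate plan -> code -> observe until accepted or the round budget is spent."""
    feedback = None
    for round_idx in range(1, max_rounds + 1):
        plan = planner(query, feedback)
        result = coder(plan)
        accepted, feedback = observer(result)
        if accepted:
            return result["grasp"], round_idx
    return None, max_rounds


grasp, rounds = run("grasp the mug left of the laptop")
print(grasp, rounds)
```

In this toy run the first round is rejected and the second is accepted, which illustrates how Observer feedback can drive replanning without any retraining step.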

artificial intelligence, grasp detection, machine learning, (19 more...)

2506.18448

Country: Asia > Japan (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SciSage: A Multi-Agent Framework for High-Quality Scientific Survey Generation

Shi, Xiaofeng, Kou, Qian, Li, Yuduo, Tang, Ning, Xie, Jinxin, Yu, Longbin, Wang, Songjing, Zhou, Hua

The rapid growth of scientific literature demands robust tools for automated survey-generation. However, current large language model (LLM)-based methods often lack in-depth analysis, structural coherence, and reliable citations. To address these limitations, we introduce SciSage, a multi-agent framework employing a reflect-when-you-write paradigm. SciSage features a hierarchical Reflector agent that critically evaluates drafts at outline, section, and document levels, collaborating with specialized agents for query interpretation, content retrieval, and refinement. We also release SurveyScope, a rigorously curated benchmark of 46 high-impact papers (2020-2025) across 11 computer science domains, with strict recency and citation-based quality controls. Evaluations demonstrate that SciSage outperforms state-of-the-art baselines (LLM x MapReduce-V2, AutoSurvey), achieving +1.73 points in document coherence and +32% in citation F1 scores. Human evaluations reveal mixed outcomes (3 wins vs. 7 losses against human-written surveys), but highlight SciSage's strengths in topical breadth and retrieval efficiency. Overall, SciSage offers a promising foundation for research-assistive writing tools.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2506.12689

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AutoGen Driven Multi Agent Framework for Iterative Crime Data Analysis and Prediction

Fatima, Syeda Kisaa, Zubair, Tehreem, Ahmed, Noman, Khan, Asifullah

Figure 4: Plot over 100 epochs with 3 agents.

F. Ablation Study - Impact of the LearningOptimizerAgent

To quantify the OptimizerAgent's effect on the system, we conducted an ablation study with two configurations:

Baseline (3-Agent Framework): CrimeAnalysisAssistant, FeedbackAgent, and CrimePredictorAgent.
Extended (4-Agent Framework): All of the above, plus the OptimizerAgent, which could oversee and control how the other agents worked.

Both settings were tested under the same protocol, using the same data for 100 epochs, and evaluated with the metrics described in Section V-B. Importantly, during the extended framework tests the OptimizerAgent did not have access to the ground truth, and its actions reflected those of a real-world supervisor trying to be efficient with resources. The main aim was to bring more stability and a better learning curve using our framework, LUCID-MA.

Table 2: 4-Agents Observed Improvement

Metric | Baseline (3 agents) | With OptimizerAgent | Improvement
CrimeAnalysisAssistant Final Score | 0.94 | 0.96 | +0.02
FeedbackAgent Final Score | 0.89 | 0.92 | +0.03
CrimePredictorAgent Final Score | 0.85 | 0.91 | +0.06
Avg. Redundancy Across Epochs | 14.2% | 6.8% | -7.4%

Using the OptimizerAgent resulted in a marked increase in the variety and quality of final system outputs.

Visual Result: The final plot demonstrates the effect of agent-level meta-control. As a result, the model exhibits higher consistency, greater variety in its results, and more reliable improvement over time, all accomplished without any need for further model fine-tuning.

Figure 5: Plot over 100 epochs with 4 agents.

In addition to standard performance comparison metrics, our system displayed advanced behavioral dynamics pointing to the presence of emergent intelligence capabilities, which we examine in detail in the next section.
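The Improvement column of Table 2 can be checked with simple arithmetic. This is a minimal sketch: the numbers are copied from the table, while the dictionary layout and the function name are chosen here for illustration only.

```python
# Values copied from Table 2: (baseline with 3 agents, with OptimizerAgent).
table2 = {
    "CrimeAnalysisAssistant Final Score": (0.94, 0.96),
    "FeedbackAgent Final Score": (0.89, 0.92),
    "CrimePredictorAgent Final Score": (0.85, 0.91),
    "Avg. Redundancy Across Epochs (%)": (14.2, 6.8),
}


def improvement(baseline, extended):
    """Absolute change from the 3-agent baseline to the 4-agent setup."""
    return round(extended - baseline, 2)


deltas = {name: improvement(b, e) for name, (b, e) in table2.items()}
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

Note that the redundancy change is a difference of two percentages, so it is a drop of 7.4 percentage points rather than a 7.4% relative reduction.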

artificial intelligence, deep learning, machine learning, (16 more...)

2506.11475

Country:

North America > United States (0.47)
Europe > United Kingdom > England (0.46)

Genre: Research Report (0.82)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Specification and Evaluation of Multi-Agent LLM Systems -- Prototype and Cybersecurity Applications

Härer, Felix

Recent advancements in LLMs indicate potential for novel applications, as evidenced by the reasoning capabilities in the latest OpenAI and DeepSeek models. To apply these models to domain-specific applications beyond text generation, LLM-based multi-agent systems can be utilized to solve complex tasks, particularly by combining reasoning techniques, code generation, and software execution across multiple, potentially specialized LLMs. However, while many evaluations are performed on LLMs, reasoning techniques, and applications individually, their joint specification and combined application are not well understood. Defined specifications for multi-agent LLM systems are required to explore their potential and suitability for specific applications, allowing for systematic evaluations of LLMs, reasoning techniques, and related aspects. This paper reports the results of exploratory research on (1) multi-agent specification by introducing an agent schema language and (2) the execution and evaluation of the specifications through a multi-agent system architecture and prototype. The specification language, system architecture, and prototype are first presented in this work, building on an LLM system from prior research. Test cases involving cybersecurity tasks indicate the feasibility of the architecture and evaluation approach. As a result, evaluations could be demonstrated for question answering, server security, and network security tasks completed correctly by agents with LLMs from OpenAI and DeepSeek.

large language model, machine learning, natural language, (17 more...)

2506.10467

Country:

North America > United States (0.14)
Europe > Switzerland (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)

From Mind to Machine: The Rise of Manus AI as a Fully Autonomous Digital Agent

Shen, Minjie, Li, Yanshu, Chen, Lulu, Yang, Qikai

Manus AI is a general-purpose AI agent introduced in early 2025, marking a significant advancement in autonomous artificial intelligence. Developed by the Chinese startup Monica.im, Manus is designed to bridge the gap between "mind" and "hand" - combining the reasoning and planning capabilities of large language models with the ability to execute complex, end-to-end tasks that produce tangible outcomes. This paper presents a comprehensive overview of Manus AI, exploring its core technical architecture, diverse applications across sectors such as healthcare, finance, manufacturing, robotics, and gaming, as well as its key strengths, current limitations, and future potential. Positioned as a preview of what lies ahead, Manus AI represents a shift toward intelligent agents that can translate high-level intentions into real-world actions, heralding a new era of human-AI collaboration.

large language model, machine learning, real time system, (20 more...)

2505.02024

Country: North America > United States (1.00)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Media (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(5 more...)

A Vision for Auto Research with LLM Agents

Liu, Chengwei, Wang, Chong, Cao, Jiayue, Ge, Jingquan, Wang, Kun, Zhang, Lyuye, Cheng, Ming-Ming, Zhao, Penghai, Li, Tianlin, Jia, Xiaojun, Li, Xiang, Li, Xingshuai, Liu, Yang, Feng, Yebo, Huang, Yihao, Xu, Yijia, Sun, Yuqiang, Zhou, Zhenhong, Xu, Zhengzi

This paper introduces Agent-Based Auto Research, a structured multi-agent framework designed to automate, coordinate, and optimize the full lifecycle of scientific research. Leveraging the capabilities of large language models (LLMs) and modular agent collaboration, the system spans all major research phases, including literature review, ideation, methodology planning, experimentation, paper writing, peer review response, and dissemination. By addressing issues such as fragmented workflows, uneven methodological expertise, and cognitive overload, the framework offers a systematic and scalable approach to scientific inquiry. Preliminary explorations demonstrate the feasibility and potential of Auto Research as a promising paradigm for self-improving, AI-driven research processes.

large language model, machine learning, natural language, (19 more...)

2504.18765

Genre:

Workflow (1.00)
Research Report > Promising Solution (1.00)
Overview (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Prabhakar, Akshara, Liu, Zuxin, Zhu, Ming, Zhang, Jianguo, Awalgaonkar, Tulika, Wang, Shiyu, Liu, Zhiwei, Chen, Haolin, Hoang, Thai, Niebles, Juan Carlos, Heinecke, Shelby, Yao, Weiran, Wang, Huan, Savarese, Silvio, Xiong, Caiming

Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. These blueprints are then transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models -- the xLAM-2-fc-r series with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on τ-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents. We open-source 5K synthetic data trajectories and the trained xLAM-2-fc-r models to advance research in AI agents. Models at https://huggingface.co/collections/Salesforce/xlam-2-67ef5be12949d8dcdae354c4; Dataset at https://huggingface.co/datasets/Salesforce/APIGen-MT-5k and Website at https://apigen-mt.github.io

arxiv preprint arxiv, large language model, machine learning, (20 more...)

2504.03601

Genre:

Research Report (0.50)
Workflow (0.46)

Industry: Information Technology (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

arXiv.org Artificial Intelligence, Jul-21-2025

Preference-based Multi-Objective Reinforcement Learning

Mu, Ni, Luan, Yao, Jia, Qing-Shan

Multi-objective reinforcement learning (MORL) is a structured approach for optimizing tasks with multiple objectives. However, it often relies on pre-defined reward functions, which can be hard to design for balancing conflicting goals and may lead to oversimplification. Preferences can serve as more flexible and intuitive decision-making guidance, eliminating the need for complicated reward design. This paper introduces preference-based MORL (Pb-MORL), which formalizes the integration of preferences into the MORL framework. We theoretically prove that preferences can derive policies across the entire Pareto frontier. To guide policy optimization using preferences, our method constructs a multi-objective reward model that aligns with the given preferences. We further provide theoretical proof to show that optimizing this reward model is equivalent to training the Pareto optimal policy. Extensive experiments in benchmark multi-objective tasks, a multi-energy management task, and an autonomous driving task on a multi-line highway show that our method performs competitively, surpassing the oracle method, which uses the ground truth reward function. This highlights its potential for practical applications in complex real-world systems.
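To make the idea of a reward model that aligns with preferences more concrete, here is a minimal sketch. It assumes a Bradley-Terry preference model over linearly scalarized multi-objective returns, which is a common choice in preference-based reinforcement learning; the abstract does not state that Pb-MORL uses this exact form, and all function names and numbers are illustrative.

```python
import math


def scalarize(returns, weights):
    """Weighted sum of per-objective returns (assumed linear scalarization)."""
    return sum(r * w for r, w in zip(returns, weights))


def preference_prob(returns_a, returns_b, weights):
    """Bradley-Terry probability that trajectory A is preferred over B."""
    diff = scalarize(returns_a, weights) - scalarize(returns_b, weights)
    return 1.0 / (1.0 + math.exp(-diff))


def preference_loss(pairs, weights):
    """Mean negative log-likelihood of observed preferences.

    Each pair is (returns_a, returns_b, label), with label 1 if A was preferred
    and 0 if B was preferred.
    """
    total = 0.0
    for returns_a, returns_b, label in pairs:
        p = preference_prob(returns_a, returns_b, weights)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(pairs)


# Two trajectories that trade off two objectives in opposite directions.
traj_a, traj_b = [2.0, 0.0], [0.0, 2.0]
p_balanced = preference_prob(traj_a, traj_b, weights=[0.5, 0.5])
p_first = preference_prob(traj_a, traj_b, weights=[1.0, 0.0])
loss_balanced = preference_loss([(traj_a, traj_b, 1)], weights=[0.5, 0.5])
print(p_balanced, p_first, loss_balanced)
```

With equal weights the two trajectories are indistinguishable, while weighting only the first objective makes trajectory A clearly preferred; fitting a reward model amounts to minimizing this loss over the collected preference pairs.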

artificial intelligence, machine learning, reinforcement learning, (16 more...)

doi: 10.1109/TASE.2025.3589271

2507.14066

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Automobiles & Trucks (0.88)
Energy > Power Industry (0.48)
Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

arXiv.org Artificial Intelligence, Jul-21-2025

Byzantine-resilient federated online learning for Gaussian process regression

Zhang, Xu, Yuan, Zhenyuan, Zhu, Minghui

In this paper, we study Byzantine-resilient federated online learning for Gaussian process regression (GPR). We develop a Byzantine-resilient federated GPR algorithm that allows a cloud and a group of agents to collaboratively learn a latent function and improve the learning performances where some agents exhibit Byzantine failures, i.e., arbitrary and potentially adversarial behavior. Each agent-based local GPR sends potentially compromised local predictions to the cloud, and the cloud-based aggregated GPR computes a global model by a Byzantine-resilient product of experts aggregation rule. Then the cloud broadcasts the current global model to all the agents. Agent-based fused GPR refines local predictions by fusing the received global model with that of the agent-based local GPR. Moreover, we quantify the learning accuracy improvements of the agent-based fused GPR over the agent-based local GPR. Experiments on a toy example and two medium-scale real-world datasets are conducted to demonstrate the performances of the proposed algorithm.
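The aggregation step can be illustrated with a small sketch. The first function is the standard product-of-experts fusion of Gaussian predictions (precision-weighted averaging). The second adds a simple trimming step to discard extreme reports; that trimming rule is an assumption made here for illustration and is not claimed to be the paper's Byzantine-resilient rule. All names and numbers are hypothetical.

```python
def poe_aggregate(means, variances):
    """Standard product-of-experts fusion of Gaussian predictions at one test point."""
    precisions = [1.0 / v for v in variances]
    fused_var = 1.0 / sum(precisions)
    fused_mean = fused_var * sum(p * m for p, m in zip(precisions, means))
    return fused_mean, fused_var


def trimmed_poe(means, variances, num_byzantine):
    """ASSUMED robust variant: drop the num_byzantine lowest and highest means, then fuse."""
    if len(means) <= 2 * num_byzantine:
        raise ValueError("need more than 2 * num_byzantine agents")
    order = sorted(range(len(means)), key=lambda i: means[i])
    kept = order[num_byzantine:len(order) - num_byzantine]
    return poe_aggregate([means[i] for i in kept], [variances[i] for i in kept])


# Four honest agents and one agent reporting an arbitrary, overconfident value.
means = [1.0, 1.1, 0.9, 1.0, 50.0]
variances = [0.1, 0.1, 0.1, 0.1, 0.001]

naive_mean, naive_var = poe_aggregate(means, variances)
robust_mean, robust_var = trimmed_poe(means, variances, num_byzantine=1)
print(naive_mean, robust_mean)
```

The plain product of experts is dominated by the single overconfident report, whereas the trimmed version stays close to the honest agents' predictions, which is the kind of behavior a Byzantine-resilient rule is meant to guarantee.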

agent, artificial intelligence, gpr, (15 more...)

2507.14021

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.70)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

arXiv.org Artificial Intelligence, Jul-21-2025

A Minimalist Controller for Autonomously Self-Aggregating Robotic Swarms: Enabling Compact Formations in Multitasking Scenarios

de Macedo, Maria Eduarda Silva, de Souza, Ana Paula Chiarelli, Rosso, Roberto Silvio Ubertino Jr., Lopes, Yuri Kaszubowski

The deployment of simple emergent behaviors in swarm robotics has been well studied in the literature. A recent study has shown how self-aggregation is possible in a multitask approach, where multiple self-aggregation task instances occur concurrently in the same environment. The multitask approach poses new challenges, in particular how the dynamics of each group impact the performance of the others. So far, multitask self-aggregation of robot groups either generates a circular formation that is not fully compact or is not fully autonomous. In this paper, we present a multitask self-aggregation behavior in which groups of homogeneous robots sort themselves into different compact clusters, relying solely on a line-of-sight sensor. The behavior scaled well and achieved a compact formation. We report scalability results from a series of simulation trials with different numbers of groups and robots per group. We improved the performance of multitask self-aggregation in terms of cluster compactness while keeping the proportion of clustered robots found in other studies.

artificial intelligence, bollard, robot, (15 more...)

2507.13969

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.91)