Goto

Collaborating Authors

 Agents


An Auditable Agent Platform For Automated Molecular Optimisation

arXiv.org Artificial Intelligence

Drug discovery frequently loses momentum when data, expertise, and tools are scattered, slowing design cycles. To shorten this loop we built a hierarchical, tool using agent framework that automates molecular optimisation. A Principal Researcher defines each objective, a Database agent retrieves target information, an AI Expert generates de novo scaffolds with a sequence to molecule deep learning model, a Medicinal Chemist edits them while invoking a docking tool, a Ranking agent scores the candidates, and a Scientific Critic polices the logic. Each tool call is summarised and stored causing the full reasoning path to remain inspectable. The agents communicate through concise provenance records that capture molecular lineage, to build auditable, molecule centered reasoning trajectories and reuse successful transformations via in context learning. Three cycle research loops were run against AKT1 protein using five large language models. After ranking the models by mean docking score, we ran 20 independent scale ups on the two top performers. We then compared the leading LLMs' binding affinity results across three configurations, LLM only, single agent, and multi agent. Our results reveal an architectural trade off, the multi agent setting excelled at focused binding optimization, improving average predicted binding affinity by 31%. In contrast, single agent runs generated molecules with superior drug like properties at the cost of less potent binding scores. Unguided LLM runs finished fastest, yet their lack of transparent tool signals left the validity of their reasoning paths unverified. These results show that test time scaling, focused feedback loops and provenance convert general purpose LLMs into auditable systems for molecular design, and suggest that extending the toolset to ADMET and selectivity predictors could push research workflows further along the discovery pipeline.


Agentic AI in 6G Software Businesses: A Layered Maturity Model

arXiv.org Artificial Intelligence

The emergence of agentic AI systems in 6G software businesses presents both strategic opportunities and significant challenges. While such systems promise increased autonomy, scalability, and intelligent decision-making across distributed environments, their adoption raises concerns regarding technical immaturity, integration complexity, organizational readiness, and performance-cost trade-offs. In this study, we conducted a preliminary thematic mapping to identify factors influencing the adoption of agentic software within the context of 6G. Drawing on a multivocal literature review and targeted scanning, we identified 29 motivators and 27 demotivators, which were further categorized into five high-level themes in each group. This thematic mapping offers a structured overview of the enabling and inhibiting forces shaping organizational readiness for agentic transformation. Positioned as a feasibility assessment, the study represents an early phase of a broader research initiative aimed at developing and validating a layered maturity model grounded in CMMI model with the software architectural three dimensions possibly Data, Business Logic, and Presentation. Ultimately, this work seeks to provide a practical framework to help software-driven organizations assess, structure, and advance their agent-first capabilities in alignment with the demands of 6G.


A Closed-Loop Multi-Agent Framework for Aerodynamics-Aware Automotive Styling Design

arXiv.org Artificial Intelligence

The core challenge in automotive exterior design is balancing subjective aesthetics with objective aerodynamic performance while dramatically accelerating the development cycle. To address this, we propose a novel, LLM-driven multi-agent framework that automates the end-to-end workflow from ambiguous requirements to 3D concept model performance validation. The workflow is structured in two stages: conceptual generation and performance validation. In the first stage, agents collaborate to interpret fuzzy design requirements, generate concept sketches, and produce photorealistic renderings using diffusion models. In the second stage, the renderings are converted to 3D point clouds, where a Drag Prediction Agent, built upon a lightweight surrogate model, provides near-instantaneous predictions of the drag coefficient and pressure fields, replacing time-consuming CFD simulations. The primary contribution of this work is the seamless integration of creative generation with a rapid engineering validation loop within a unified, automated system, which provides a new paradigm for efficiently balancing creative exploration with engineering constraints in the earliest stages of design.


Enhancing Joint Human-AI Inference in Robot Missions: A Confidence-Based Approach

arXiv.org Artificial Intelligence

Joint human-AI inference holds immense potential to improve outcomes in human-supervised robot missions. Current day missions are generally in the AI-assisted setting, where the human operator makes the final inference based on the AI recommendation. However, due to failures in human judgement on when to accept or reject the AI recommendation, complementarity is rarely achieved. We investigate joint human-AI inference where the inference made with higher confidence is selected. Through a user study with N = 100 participants on a representative simulated robot teleoperation task, specifically studying the inference of robots' control delays we show that: a) Joint inference accuracy is higher and its extent is regulated by the confidence calibration of the AI agent, and b) Humans change their inferences based on AI recommendations and the extent and direction of this change is also regulated by the confidence calibration of the AI agent. Interestingly, our results show that pairing poorly-calibrated AI-DSS with humans hurts performance instead of helping the team, reiterating the need for AI-based decision support systems with good metacognitive sensitivity. To the best of our knowledge, our study presents the first application of a maximum-confidence-based heuristic for joint human-AI inference within a simulated robot teleoperation task.


Strategic Hypothesis Testing

arXiv.org Artificial Intelligence

We examine hypothesis testing within a principal-agent framework, where a strategic agent, holding private beliefs about the effectiveness of a product, submits data to a principal who decides on approval. The principal employs a hypothesis testing rule, aiming to pick a p-value threshold that balances false positives and false negatives while anticipating the agent's incentive to maximize expected profitability. Building on prior work, we develop a game-theoretic model that captures how the agent's participation and reporting behavior respond to the principal's statistical decision rule. Despite the complexity of the interaction, we show that the principal's errors exhibit clear monotonic behavior when segmented by an efficiently computable critical p-value threshold, leading to an interpretable characterization of their optimal p-value threshold. We empirically validate our model and these insights using publicly available data on drug approvals. Overall, our work offers a comprehensive perspective on strategic interactions within the hypothesis testing framework, providing technical and regulatory insights.


Approximate Proportionality in Online Fair Division

arXiv.org Artificial Intelligence

We study the online fair division problem, where indivisible goods arrive sequentially and must be allocated immediately and irrevocably to agents. Prior work has established strong impossibility results for approximating classic fairness notions, such as envy-freeness and maximin share fairness, in this setting. In contrast, we focus on proportionality up to one good (PROP1), a natural relaxation of proportionality whose approximability remains unresolved. We begin by showing that three natural greedy algorithms fail to guarantee any positive approximation to PROP1 in general, against an adaptive adversary. This is surprising because greedy algorithms are commonly used in fair division and a natural greedy algorithm is known to be able to achieve PROP1 under additional information assumptions. This hardness result motivates the study of non-adaptive adversaries and the use of side-information, in the spirit of learning-augmented algorithms. For non-adaptive adversaries, we show that the simple uniformly random allocation can achieve a meaningful PROP1 approximation with high probability. Meanwhile, we present an algorithm that obtain robust approximation ratios against PROP1 when given predictions of the maximum item value (MIV). Interestingly, we also show that stronger fairness notions such as EF1, MMS, and PROPX remain inapproximable even with perfect MIV predictions.


Navigation Pixie: Implementation and Empirical Study Toward On-demand Navigation Agents in Commercial Metaverse

arXiv.org Artificial Intelligence

While commercial metaverse platforms offer diverse user-generated content, they lack effective navigation assistance that can dynamically adapt to users' interests and intentions. Although previous research has investigated on-demand agents in controlled environments, implementation in commercial settings with diverse world configurations and platform constraints remains challenging. We present Navigation Pixie, an on-demand navigation agent employing a loosely coupled architecture that integrates structured spatial metadata with LLM-based natural language processing while minimizing platform dependencies, which enables experiments on the extensive user base of commercial metaverse platforms. Our cross-platform experiments on commercial metaverse platform Cluster with 99 PC client and 94 VR-HMD participants demonstrated that Navigation Pixie significantly increased dwell time and free exploration compared to fixed-route and no-agent conditions across both platforms. Subjective evaluations revealed consistent on-demand preferences in PC environments versus context-dependent social perception advantages in VR-HMD. This research contributes to advancing VR interaction design through conversational spatial navigation agents, establishes cross-platform evaluation methodologies revealing environment-dependent effectiveness, and demonstrates empirical experimentation frameworks for commercial metaverse platforms.


Long Story Generation via Knowledge Graph and Literary Theory

arXiv.org Artificial Intelligence

The generation of a long story consisting of several thousand words is a sub-task in the field of long text generation~(LTG). Previous research has addressed this challenge through outline-based generation, which employs a multi-stage method for generating outlines into stories. However, this approach suffers from two common issues: almost inevitable theme drift caused by the loss of memory of previous outlines, and tedious plots with incoherent logic that are less appealing to human readers. In this paper, we propose the multi-agent Story Generator structure to improve the multi-stage method, using large language models~(LLMs) as the core components of agents. To avoid theme drift, we introduce a memory storage model comprising two components: a long-term memory storage that identifies the most important memories, thereby preventing theme drift; and a short-term memory storage that retains the latest outlines from each generation round. To incorporate engaging elements into the story, we design a story theme obstacle framework based on literary narratology theory that introduces uncertain factors and evaluation criteria to generate outline. This framework calculates the similarity of the former storyline and enhances the appeal of the story by building a knowledge graph and integrating new node content. Additionally, we establish a multi-agent interaction stage to simulate writer-reader interaction through dialogue and revise the story text according to feedback, to ensure it remains consistent and logical. Evaluations against previous methods demonstrate that our approach can generate higher-quality long stories.


Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS

arXiv.org Artificial Intelligence

Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines.


NANDA Adaptive Resolver: Architecture for Dynamic Resolution of AI Agent Names

arXiv.org Artificial Intelligence

AdaptiveResolver is a dynamic microservice architecture designed to address the limitations of static endpoint resolution for AI agent communication in distributed, heterogeneous environments. Unlike traditional DNS or static URLs, AdaptiveResolver enables context-aware, real-time selection of communication endpoints based on factors such as geographic location, system load, agent capabilities, and security threats. Agents advertise their Agent Name and context requirements through Agent Fact cards in an Agent Registry/Index. A requesting Agent discovers a Target Agent using the registry. The Requester Agent can then resolve the Target Agent Name to obtain a tailored communication channel to the agent based on actual environmental context between the agents. The architecture supports negotiation of trust, quality of service, and resource constraints, facilitating flexible, secure, and scalable agent-to-agent interactions that go beyond the classic client-server model. AdaptiveResolver provides a foundation for robust, future-proof agent communication that can evolve with increasing ecosystem complexity.