Goto

Collaborating Authors

 Agents


Market-Dependent Communication in Multi-Agent Alpha Generation

arXiv.org Artificial Intelligence

Multi-strategy hedge funds face a fundamental organizational choice: should analysts generating trading strategies communicate, and if so, how? We investigate this using 5-agent LLM-based trading systems across 450 experiments spanning 21 months, comparing five organizational structures from isolated baseline to collaborative and competitive conversation. We show that communication improves performance, but optimal communication design depends on market characteristics. Competitive conversation excels in volatile technology stocks, while collaborative conversation dominates stable general stocks. Finance stocks resist all communication interventions. Surprisingly, all structures, including isolated agents, converge to similar strategy alignments, challenging assumptions that transparency causes harmful diversity loss. Performance differences stem from behavioral mechanisms: competitive agents focus on stock-level allocation while collaborative agents develop technical frameworks. Conversation quality scores show zero correlation with returns. These findings demonstrate that optimal communication design must match market volatility characteristics, and sophisticated discussions don't guarantee better performance.


Multi-Agent Multimodal Large Language Model Framework for Automated Interpretation of Fuel Efficiency Analytics in Public Transportation

arXiv.org Artificial Intelligence

Enhancing fuel efficiency in public transportation requires the integration of complex multimodal data into interpretable, decision-relevant insights. However, traditional analytics and visualization methods often yield fragmented outputs that demand extensive human interpretation, limiting scalability and consistency. This study presents a multi-agent framework that leverages multimodal large language models (LLMs) to automate data narration and energy insight generation. The framework coordinates three specialized agents, including a data narration agent, an LLM-as-a-judge agent, and an optional human-in-the-loop evaluator, to iteratively transform analytical artifacts into coherent, stakeholder-oriented reports. The system is validated through a real-world case study on public bus transportation in Northern Jutland, Denmark, where fuel efficiency data from 4006 trips are analyzed using Gaussian Mixture Model clustering. Comparative experiments across five state-of-the-art LLMs and three prompting paradigms identify GPT-4.1 mini with Chain-of-Thought prompting as the optimal configuration, achieving 97.3% narrative accuracy while balancing interpretability and computational cost. The findings demonstrate that multi-agent orchestration significantly enhances factual precision, coherence, and scalability in LLM-based reporting. The proposed framework establishes a replicable and domain-adaptive methodology for AI-driven narrative generation and decision support in energy informatics.


Collective decision-making with higher-order interactions on $d$-uniform hypergraphs

arXiv.org Artificial Intelligence

Understanding how group interactions influence opinion dynamics is fundamental to the study of collective behavior. In this work, we propose and study a model of opinion dynamics on $d$-uniform hypergraphs, where individuals interact through group-based (higher-order) structures rather than simple pairwise connections. Each one of the two opinions $A$ and $B$ is characterized by a quality, $Q_A$ and $Q_B$, and agents update their opinions according to a general mechanism that takes into account the weighted fraction of agents supporting either opinion and the pooling error, $α$, a proxy for the information lost during the interaction. Through bifurcation analysis of the mean-field model, we identify two critical thresholds, $α_{\text{crit}}^{(1)}$ and $α_{\text{crit}}^{(2)}$, which delimit stability regimes for the consensus states. These analytical predictions are validated through extensive agent-based simulations on both random and scale-free hypergraphs. Moreover, the analytical framework demonstrates that the bifurcation structure and critical thresholds are independent of the underlying topology of the higher-order network, depending solely on the parameters $d$, i.e., the size of the interaction groups, and the quality ratio. Finally, we bring to the fore a nontrivial effect: the large sizes of the interaction groups, could drive the system toward the adoption of the worst option.


Glia: A Human-Inspired AI for Automated Systems Design and Optimization

arXiv.org Artificial Intelligence

Can an AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human experts? We present Glia, an AI architecture for networked systems design that uses large language models (LLMs) in a human-inspired, multi-agent workflow. Each agent specializes in reasoning, experimentation, and analysis, collaborating through an evaluation framework that grounds abstract reasoning in empirical feedback. Unlike prior ML-for-systems methods that optimize black-box policies, Glia generates interpretable designs and exposes its reasoning process. When applied to a distributed GPU cluster for LLM inference, it produces new algorithms for request routing, scheduling, and auto-scaling that perform at human-expert levels in significantly less time, while yielding novel insights into workload behavior. Our results suggest that by combining reasoning LLMs with structured experimentation, an AI can produce creative and understandable designs for complex systems problems.


An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence

arXiv.org Artificial Intelligence

We propose a Kardashev-inspired yet operational Autonomous AI (AAI) Scale that measures the progression from fixed robotic process automation (AAI-0) to full artificial general intelligence (AAI-4) and beyond. Unlike narrative ladders, our scale is multi-axis and testable. We define ten capability axes (Autonomy, Generality, Planning, Memory/Persistence, Tool Economy, Self-Revision, Sociality/Coordination, Embodiment, World-Model Fidelity, Economic Throughput) aggregated by a composite AAI-Index (a weighted geometric mean). We introduce a measurable Self-Improvement Coefficient $κ$ (capability growth per unit of agent-initiated resources) and two closure properties (maintenance and expansion) that convert ``self-improving AI'' into falsifiable criteria. We specify OWA-Bench, an open-world agency benchmark suite that evaluates long-horizon, tool-using, persistent agents. We define level gates for AAI-0\ldots AAI-4 using thresholds on the axes, $κ$, and closure proofs. Synthetic experiments illustrate how present-day systems map onto the scale and how the delegability frontier (quality vs.\ autonomy) advances with self-improvement. We also prove a theorem that AAI-3 agent becomes AAI-5 over time with sufficient conditions, formalizing "baby AGI" becomes Superintelligence intuition.


MedDCR: Learning to Design Agentic Workflows for Medical Coding

arXiv.org Artificial Intelligence

Medical coding converts free-text clinical notes into standardized diagnostic and procedural codes, which are essential for billing, hospital operations, and medical research. Unlike ordinary text classification, it requires multi-step reasoning: extracting diagnostic concepts, applying guideline constraints, mapping to hierarchical codebooks, and ensuring cross-document consistency. Recent advances leverage agentic LLMs, but most rely on rigid, manually crafted workflows that fail to capture the nuance and variability of real-world documentation, leaving open the question of how to systematically learn effective workflows. We present MedDCR, a closed-loop framework that treats workflow design as a learning problem. A Designer proposes workflows, a Coder executes them, and a Reflector evaluates predictions and provides constructive feedback, while a memory archive preserves prior designs for reuse and iterative refinement. On benchmark datasets, MedDCR outperforms state-of-the-art baselines and produces interpretable, adaptable workflows that better reflect real coding practice, improving both the reliability and trustworthiness of automated systems.


Grounded by Experience: Generative Healthcare Prediction Augmented with Hierarchical Agentic Retrieval

arXiv.org Artificial Intelligence

Accurate healthcare prediction is critical for improving patient outcomes and reducing operational costs. Bolstered by growing reasoning capabilities, large language models (LLMs) offer a promising path to enhance healthcare predictions by drawing on their rich parametric knowledge. However, LLMs are prone to factual inaccuracies due to limitations in the reliability and coverage of their embedded knowledge. While retrieval-augmented generation (RAG) frameworks, such as GraphRAG and its variants, have been proposed to mitigate these issues by incorporating external knowledge, they face two key challenges in the healthcare scenario: (1) identifying the clinical necessity to activate the retrieval mechanism, and (2) achieving synergy between the retriever and the generator to craft contextually appropriate retrievals. To address these challenges, we propose GHAR, a \underline{g}enerative \underline{h}ierarchical \underline{a}gentic \underline{R}AG framework that simultaneously resolves when to retrieve and how to optimize the collaboration between submodules in healthcare. Specifically, for the first challenge, we design a dual-agent architecture comprising Agent-Top and Agent-Low. Agent-Top acts as the primary physician, iteratively deciding whether to rely on parametric knowledge or to initiate retrieval, while Agent-Low acts as the consulting service, summarising all task-relevant knowledge once retrieval was triggered. To tackle the second challenge, we innovatively unify the optimization of both agents within a formal Markov Decision Process, designing diverse rewards to align their shared goal of accurate prediction while preserving their distinct roles. Extensive experiments on three benchmark datasets across three popular tasks demonstrate our superiority over state-of-the-art baselines, highlighting the potential of hierarchical agentic RAG in advancing healthcare systems.


LLM-based Multi-Agent System for Simulating Strategic and Goal-Oriented Data Marketplaces

arXiv.org Artificial Intelligence

Abstract--Data marketplaces, which mediate the purchase and exchange of data from third parties, have attracted growing attention for reducing the cost and effort of data collection while enabling the trading of diverse datasets. However, a systematic understanding of the interactions between market participants, data, and regulations remains limited. T o address this gap, we propose a Large Language Model-based Multi-Agent System (LLM-MAS) for data marketplaces. In our framework, buyer and seller agents powered by LLMs operate with explicit objectives and autonomously perform strategic actions, such as--planning, searching, purchasing, pricing, and updating data. These agents can reason about market dynamics, forecast future demand, and adapt their strategies accordingly. Unlike conventional model-based simulations, which are typically constrained to predefined rules, LLM-MAS supports broader and more adaptive behavior selection through natural language reasoning. We evaluated the framework via simulation experiments using three distribution-based metrics: (1) the number of purchases per dataset, (2) the number of purchases per buyer, and (3) the number of repeated purchases of the same dataset. The results demonstrate that LLM-MAS more faithfully reproduces trading patterns observed in real data marketplaces compared to traditional approaches, and further captures the emergence and evolution of market trends. Data have emerged as a tradable economic resource, and data marketplaces that mediate the purchase and exchange of datasets from third parties have rapidly expanded [1]. These marketplaces streamline data collection that previously required substantial cost and effort, while also providing organizations and researchers with access to diverse, high-quality datasets. As a result, they are increasingly recognized as critical infrastructures that accelerate innovation based on data that were closed within individual organizations [2]. Despite this progress, our understanding of how interactions among market participants, data, and regulations shape market dynamics remains limited. Smooth and efficient data transactions require well-designed and robust data marketplaces [3].


Informative Communication of Robot Plans

arXiv.org Artificial Intelligence

When a robot is asked to verbalize its plan it can do it in many ways. For example, a seemingly natural strategy is incremental, where the robot verbalizes its planned actions in plan order. However, an important aspect of this type of strategy is that it misses considerations on what is effectively informative to communicate, because not considering what the user knows prior to explanations. In this paper we propose a verbalization strategy to communicate robot plans informatively, by measuring the information gain that verbalizations have against a second-order theory of mind of the user capturing his prior knowledge on the robot. As shown in our experiments, this strategy allows to understand the robot's goal much quicker than by using strategies such as increasing or decreasing plan order. In addition, following our formulation we hint to what is informative and why when a robot communicates its plan.


Agent-Oriented Visual Programming for the Web of Things

arXiv.org Artificial Intelligence

In this paper we introduce and discuss an approach for multi-agent-oriented visual programming. This aims at enabling individuals without programming experience but with knowledge in specific target domains to design and (re)configure autonomous software. We argue that, compared to procedural programming, it should be simpler for users to create programs when agent abstractions are employed. The underlying rationale is that these abstractions, and specifically the belief-desire-intention architecture that is aligned with human practical reasoning, match more closely with people's everyday experience in interacting with other agents and artifacts in the real world. On top of this, we designed and implemented a visual programming system for agents that hides the technicalities of agent-oriented programming using a blocks-based visual development environment that is built on the JaCaMo platform. To further validate the proposed solution, we integrate the Web of Things (WoT) to let users create autonomous behaviour on top of physical mashups of devices, following the trends in industrial end-user programming. Finally, we report on a pilot user study where we verified that novice users are indeed able to make use of this development environment to create multi-agent systems to solve simple automation tasks.