Goto

Collaborating Authors

 Agents


An Adaptive Multi Agent Bitcoin Trading System

arXiv.org Artificial Intelligence

This paper presents a Multi Agent Bitcoin Trading system that utilizes Large Language Models (LLMs) for alpha generation and portfolio management in the cryptocurrencies market. Unlike equities, cryptocurrencies exhibit extreme volatility and are heavily influenced by rapidly shifting market sentiments and regulatory announcements, making them difficult to model using static regression models or neural networks trained solely on historical data. The proposed framework overcomes this by structuring LLMs into specialised agents for technical analysis, sentiment evaluation, decision-making, and performance reflection. The agents improve over time via a novel verbal feedback mechanism where a Reflect agent provides daily and weekly natural-language critiques of trading decisions. These textual evaluations are then injected into future prompts of the agents, allowing them to adjust allocation logic without weight updates or finetuning. Back-testing on Bitcoin price data from July 2024 to April 2025 shows consistent outperformance across market regimes: the Quantitative agent delivered over 30\% higher returns in bullish phases and 15\% overall gains versus buy-and-hold, while the sentiment-driven agent turned sideways markets from a small loss into a gain of over 100\%. Adding weekly feedback further improved total performance by 31\% and reduced bearish losses by 10\%. The results demonstrate that verbal feedback represents a new, scalable, and low-cost approach of tuning LLMs for financial goals.


GELATO: Multi-Instruction Trajectory Reshaping via Geometry-Aware Multiagent-based Orchestration

arXiv.org Artificial Intelligence

We present GELATO -- the first language-driven trajectory reshaping framework to embed geometric environment awareness and multi-agent feedback orchestration to support multi-instruction in human-robot interaction scenarios. Unlike prior learning-based methods, our approach automatically registers scene objects as 6D geometric primitives via a VLM-assisted multi-view pipeline, and an LLM translates free-form multiple instructions into explicit, verifiable geometric constraints. These are integrated into a geometric-aware vector field optimization to adapt initial trajectories while preserving smoothness, feasibility, and clearance. We further introduce a multi-agent orchestration with observer-based refinement to handle multi-instruction inputs and interactions among objectives -- increasing success rate without retraining. Simulation and real-world experiments demonstrate our method achieves smoother, safer, and more interpretable trajectory modifications compared to state-of-the-art baselines.


NetGent: Agent-Based Automation of Network Application Workflows

arXiv.org Artificial Intelligence

We present NetGent, an AI-agent framework for automating complex application workflows to generate realistic network traffic datasets. Developing generalizable ML models for networking requires data collection from network environments with traffic that results from a diverse set of real-world web applications. However, using existing browser automation tools that are diverse, repeatable, realistic, and efficient remains fragile and costly. NetGent addresses this challenge by allowing users to specify workflows as natural-language rules that define state-dependent actions. These abstract specifications are compiled into nondeterministic finite automata (NFAs), which a state synthesis component translates into reusable, executable code. This design enables deterministic replay, reduces redundant LLM calls through state caching, and adapts quickly when application interfaces change. In experiments, NetGent automated more than 50+ workflows spanning video-on-demand streaming, live video streaming, video conferencing, social media, and web scraping, producing realistic traffic traces while remaining robust to UI variability. By combining the flexibility of language-based agents with the reliability of compiled execution, NetGent provides a scalable foundation for generating the diverse, repeatable datasets needed to advance ML in networking.


Visual Document Understanding and Reasoning: A Multi-Agent Collaboration Framework with Agent-Wise Adaptive Test-Time Scaling

arXiv.org Artificial Intelligence

The dominant paradigm of monolithic scaling in Vision-Language Models (VLMs) is failing for understanding and reasoning in documents, yielding diminishing returns as it struggles with the inherent need of this domain for document-based procedural reasoning, cognitive complexity, and factual accuracy. T o this end, we introduce MACT, a Multi-Agent Collaboration framework with agent-wise adaptive Test-time scaling that pioneers a paradigm shift to procedural scaling, adapting dynamically to the functional entities of visual documents understanding and reasoning. MACT decomposes the visual document processing flow into four specialized agents, i.e., planning, execution, judgment, and answer, to resolve cognitive overload and introduce a critical self-correction loop for factual grounding. This collaborative architecture is amplified by an agent-wise adaptive test-time scaling strategy that intelligently allocates computational resources based on the complexity and redundancy of each functionality. Evaluated on multiple visual document understanding benchmarks, MACT achieves superior performance with a smaller parameter scale, adapting effectively to various document scenarios without compromising its general or mathematical reasoning capabilities. The three variants of MACT consistently attain top-three average performance rankings, with average performance enhancements of 9.9-11.5% over the base models. The source code will be released publicly.


Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games

arXiv.org Artificial Intelligence

We introduce Cautious Optimism, a framework for substantially faster regularized learning in general games. Cautious Optimism, as a variant of Optimism, adaptively controls the learning pace in a dynamic, non-monotone manner to accelerate no-regret learning dynamics. Cautious Optimism takes as input any instance of Follow-the-Regularized-Leader (FTRL) and outputs an accelerated no-regret learning algorithm (COFTRL) by pacing the underlying FTRL with minimal computational overhead. Importantly, it retains uncoupledness, that is, learners do not need to know other players' utilities. Cautious Optimistic FTRL (COFTRL) achieves near-optimal $O_T(\log T)$ regret in diverse self-play (mixing and matching regularizers) while preserving the optimal $O_T(\sqrt{T})$ regret in adversarial scenarios. In contrast to prior works (e.g., Syrgkanis et al. [2015], Daskalakis et al. [2021]), our analysis does not rely on monotonic step sizes, showcasing a novel route for fast learning in general games. Moreover, instances of COFTRL achieve new state-of-the-art regret minimization guarantees in general convex games, exponentially improving the dependence on the dimension of the action space $d$ over previous works [Farina et al., 2022a].