 dynamic strategy


More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents

Gao, Pengfei, Peng, Chao

arXiv.org Artificial Intelligence

LLM-powered coding agents, which operate in iterative loops (turns) to solve software engineering tasks, are becoming increasingly powerful. However, their practical deployment is hindered by significant and unpredictable costs. This challenge arises from a combination of factors: quadratically growing token counts with each turn, the high price of models, the large number of turns required for real-world tasks, and the tendency of agents to take inefficient or unnecessary actions. While existing research focuses on optimizing individual turns, the strategic control of the total number of turns remains an underexplored area for managing agent performance and cost. To address this gap, we conduct a comprehensive empirical study on SWE-bench using three state-of-the-art models and evaluate the impact of three distinct turn-control strategies: an unrestricted baseline, a fixed-turn limit with reminders, and a novel dynamic-turn strategy that grants extensions on-demand. Our findings first reveal a fundamental trade-off in the unrestricted setting, where no single model excels across performance, cost, and turn efficiency. We then show that a fixed-turn limit, specifically at the 75th percentile of the baseline, serves as a "sweet spot", substantially reducing costs (by 24%-68%) with minimal impact on solve rates. Most significantly, the dynamic-turn strategy consistently outperforms fixed-limit approaches, achieving comparable or better solve rates while further reducing costs by an additional 12%-24% by intelligently allocating resources only to tasks that need them. This work provides the first systematic analysis of turn-control strategies, offering simple yet effective guidelines for developers to balance cost and efficacy. We demonstrate that dynamic resource allocation is a superior, easy-to-implement approach for deploying powerful yet economically viable coding agents.
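The fixed-limit and dynamic-turn ideas described in this abstract can be illustrated with a minimal sketch. Everything named below is an assumption made for illustration and is not taken from the paper: the nearest-rank percentile, the `step` callback protocol ('done', 'continue', 'extend'), the extension size, and the toy agent.

```python
import math
from typing import Callable, List, Tuple


def percentile(values: List[int], q: float) -> int:
    """Nearest-rank percentile (0 < q <= 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def run_with_dynamic_turns(
    step: Callable[[int, int], str],
    initial_limit: int,
    extension: int,
    max_extensions: int,
) -> Tuple[bool, int]:
    """Run an agent loop under a turn budget that can be extended on request.

    `step(turn, remaining)` is a hypothetical callback that returns
    'done', 'continue', or 'extend' (a request for more turns).
    Returns (solved, turns_used).
    """
    limit = initial_limit
    extensions_used = 0
    turn = 0
    while turn < limit:
        turn += 1
        action = step(turn, limit - turn)
        if action == "done":
            return True, turn
        if action == "extend" and extensions_used < max_extensions:
            limit += extension
            extensions_used += 1
    return False, turn


def make_agent(turns_needed: int) -> Callable[[int, int], str]:
    """Toy agent: finishes at `turns_needed`, asks for more when out of turns."""
    def step(turn: int, remaining: int) -> str:
        if turn >= turns_needed:
            return "done"
        return "extend" if remaining == 0 else "continue"
    return step


# Fixed limit taken from the 75th percentile of (made-up) baseline turn counts.
baseline_turns = [10, 20, 30, 40]
limit = percentile(baseline_turns, 75)
```

With `max_extensions=0` the loop behaves like a plain fixed-turn limit, so the same function covers both strategies in this sketch.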


Mean-Variance Portfolio Selection in Long-Term Investments with Unknown Distribution: Online Estimation, Risk Aversion under Ambiguity, and Universality of Algorithms

Lam, Duy Khanh

arXiv.org Artificial Intelligence

The standard approach for constructing a Mean-Variance portfolio involves estimating parameters for the model using collected samples. However, since the distribution of future data may not resemble that of the training set, the out-of-sample performance of the estimated portfolio is worse than one derived with true parameters, which has prompted several innovations for better estimation. Instead of treating the data without a timing aspect as in the common training-backtest approach, this paper adopts a perspective where data are revealed gradually and continuously over time. The original model is recast into an online learning framework, which is free from any statistical assumptions, to propose a dynamic strategy of sequential portfolios such that its empirical utility, Sharpe ratio, and growth rate asymptotically achieve those of the true portfolio, derived with perfect knowledge of the future data. When the distribution of future data has a normal shape, the growth rate of wealth is shown to increase by lifting the portfolio along the efficient frontier through the calibration of risk aversion. Since risk aversion cannot be appropriately predetermined, another proposed algorithm updating this coefficient over time forms a dynamic strategy approaching the optimal empirical Sharpe ratio or growth rate associated with the true coefficient. The performance of these proposed strategies is universally guaranteed under specific stochastic markets. Furthermore, in stationary and ergodic markets, the so-called Bayesian strategy utilizing true conditional distributions, based on observed past market information during investment, almost surely does not perform better than the proposed strategies in terms of empirical utility, Sharpe ratio, or growth rate, which, in contrast, do not rely on conditional distributions.


Green AI in Action: Strategic Model Selection for Ensembles in Production

Nijkamp, Nienke, Sallou, June, van der Heijden, Niels, Cruz, Luís

arXiv.org Artificial Intelligence

Integrating Artificial Intelligence (AI) into software systems has significantly enhanced their capabilities while escalating energy demands. Ensemble learning, combining predictions from multiple models to form a single prediction, intensifies this problem due to cumulative energy consumption. This paper presents a novel approach to model selection that addresses the challenge of balancing the accuracy of AI models with their energy consumption in a live AI ensemble system. We explore how reducing the number of models or improving the efficiency of model usage within an ensemble during inference can reduce energy demands without substantially sacrificing accuracy. This study introduces and evaluates two model selection strategies, Static and Dynamic, for optimizing the performance of ensemble learning systems while minimizing energy usage. Our results demonstrate that the Static strategy improves the F1 score beyond the baseline, reducing average energy usage from 100% for the full ensemble to 62%. The Dynamic strategy further enhances F1 scores, using on average 76% of the energy of the full ensemble. Moreover, we propose an approach that balances accuracy with resource consumption, significantly reducing energy usage without substantially impacting accuracy. This method decreased the average energy usage of the Static strategy from approximately 62% to 14%, and for the Dynamic strategy, from around 76% to 57%. Our field study of Green AI using an operational AI system developed by a large professional services provider shows the practical applicability of adopting energy-conscious model selection strategies in live production environments.


LocalGCL: Local-aware Contrastive Learning for Graphs

Jiang, Haojun, Sun, Jiawei, Li, Jie, Wu, Chentao

arXiv.org Artificial Intelligence

Graph representation learning (GRL) makes considerable progress recently, which encodes graphs with topological structures into low-dimensional embeddings. Meanwhile, the time-consuming and costly process of annotating graph labels manually prompts the growth of self-supervised learning (SSL) techniques. As a dominant approach of SSL, Contrastive learning (CL) learns discriminative representations ...

The key idea of contrastive learning is to maximize the agreement between views generated from the same data instance, while minimizing the agreement between those from different instances. Given its potential in exploiting data relationships, recent efforts [10, 11, 12] have been devoted to advancing contrastive learning to obtain general graph representations. For example, GraphCL [10] builds a graph contrastive learning framework with augmentations to learn robust graph representations.
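The "maximize agreement within an instance, minimize it across instances" idea is often written as an InfoNCE-style loss. The form below is one common variant given only as an illustration; the symbols ($z_i$, $z_i'$ for the embeddings of two views of instance $i$, $\mathrm{sim}$ for cosine similarity, $\tau$ for a temperature, $N$ for the batch size) are assumptions and this is not necessarily the objective used in the listed paper.

```latex
\ell_i = -\log
  \frac{\exp\left(\mathrm{sim}(z_i, z_i') / \tau\right)}
       {\sum_{j=1}^{N} \exp\left(\mathrm{sim}(z_i, z_j') / \tau\right)}
```

The numerator rewards agreement between the two views of the same instance, and the denominator penalizes agreement with views of the other instances in the batch.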


Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting

Green, Riku, Stevens, Grant, Filho, Telmo de Menezes e Silva, Abdallah, Zahraa

arXiv.org Artificial Intelligence

Multi-step forecasting (MSF) in time-series, the ability to make predictions multiple time steps into the future, is fundamental to almost all temporal domains. To make such forecasts, one must assume the recursive complexity of the temporal dynamics. Such assumptions are referred to as the forecasting strategy used to train a predictive model. Previous work shows that it is not clear which forecasting strategy is optimal a priori to evaluating on unseen data. Furthermore, current approaches to MSF use a single (fixed) forecasting strategy. In this paper, we characterise the instance-level variance of optimal forecasting strategies and propose Dynamic Strategies (DyStrat) for MSF. We experiment using 10 datasets from different scales, domains, and lengths of multi-step horizons. When using a random-forest-based classifier, DyStrat outperforms the best fixed strategy, which is not knowable a priori, 94% of the time, with an average reduction in mean-squared error of 11%. Our approach typically triples the top-1 accuracy compared to current approaches. Notably, we show DyStrat generalises well for any MSF task.
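To make "forecasting strategy" concrete, here is a minimal sketch of two widely used strategies, recursive and direct, plus the per-instance labelling step that a strategy classifier would be trained on. The abstract does not list its candidate strategies, so the choice of these two, the through-the-origin linear model, and all function names are assumptions for illustration. The classifier itself (a random forest in the abstract) is omitted.

```python
from typing import Callable, Dict, List, Sequence

Strategy = Callable[[Sequence[float], int], List[float]]


def fit_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares slope through the origin, y ~ slope * x.

    Assumes xs is non-empty and not all zeros.
    """
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def recursive_forecast(series: Sequence[float], horizon: int) -> List[float]:
    """Fit one one-step model and feed its predictions back in."""
    a = fit_slope(series[:-1], series[1:])
    preds, x = [], series[-1]
    for _ in range(horizon):
        x = a * x
        preds.append(x)
    return preds


def direct_forecast(series: Sequence[float], horizon: int) -> List[float]:
    """Fit a separate model for each step ahead (requires horizon < len(series))."""
    preds = []
    for h in range(1, horizon + 1):
        b = fit_slope(series[:-h], series[h:])
        preds.append(b * series[-1])
    return preds


def mse(pred: Sequence[float], actual: Sequence[float]) -> float:
    """Mean-squared error between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)


def best_strategy(
    history: Sequence[float],
    future: Sequence[float],
    strategies: Dict[str, Strategy],
) -> str:
    """Label one instance with the strategy that has the lowest MSE (ties: first)."""
    return min(
        strategies,
        key=lambda name: mse(strategies[name](history, len(future)), future),
    )


STRATEGIES: Dict[str, Strategy] = {
    "recursive": recursive_forecast,
    "direct": direct_forecast,
}
```

The labels produced by `best_strategy` are only available in hindsight, which is why a classifier is needed to predict the strategy for unseen instances.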


An Opponent-Aware Reinforcement Learning Method for Team-to-Team Multi-Vehicle Pursuit via Maximizing Mutual Information Indicator

Wang, Qinwen, Li, Xinhang, Yuan, Zheng, Yang, Yiying, Xu, Chen, Zhang, Lin

arXiv.org Artificial Intelligence

The pursuit-evasion game in Smart City brings a profound impact on the Multi-vehicle Pursuit (MVP) problem, when police cars cooperatively pursue suspected vehicles. Existing studies on the MVP problems tend to set evading vehicles to move randomly or in a fixed prescribed route. The opponent modeling method has shown considerable promise in tackling the non-stationarity caused by the adversary agent. However, most of them focus on two-player competitive games and easy scenarios without the interference of environments. This paper considers a Team-to-Team Multi-vehicle Pursuit (T2TMVP) problem in the complicated urban traffic scene where the evading vehicles adopt the pre-trained dynamic strategies to execute decisions intelligently. To solve this problem, we propose an opponent-aware reinforcement learning via maximizing mutual information indicator (OARLM2I2) method to improve pursuit efficiency in the complicated environment. First, a sequential encoding-based opponents joint strategy modeling (SEOJSM) mechanism is proposed to generate evading vehicles' joint strategy model, which assists the multi-agent decision-making process based on deep Q-network (DQN). Then, we design a mutual information-united loss, simultaneously considering the reward fed back from the environment and the effectiveness of opponents' joint strategy model, to update pursuing vehicles' decision-making process. Extensive experiments based on SUMO demonstrate our method outperforms other baselines by 21.48% on average in reducing pursuit time. The code is available at https://github.com/ANT-ITS/OARLM2I2.


Dynamic Controllability of Controllable Conditional Temporal Problems with Uncertainty

Cui, Jing (The Australian National University and DATA61) | Haslum, Patrik (The Australian National University and DATA61)

AAAI Conferences

Dynamic Controllability (DC) of a Simple Temporal Problem with Uncertainty (STPU) uses a dynamic decision strategy, rather than a fixed schedule, to tackle temporal uncertainty. We extend this concept to the Controllable Conditional Temporal Problem with Uncertainty (CCTPU), which extends the STPU by conditioning temporal constraints on the assignment of controllable discrete variables. We define dynamic controllability of a CCTPU as the existence of a strategy that decides on both the values of discrete choice variables and the scheduling of controllable time points dynamically. This contrasts with previous work, which made a static assignment of choice variables and dynamic decisions over time points only. We propose an algorithm to find such a fully dynamic strategy. The algorithm computes the "envelope" of outcomes of temporal uncertainty in which a particular assignment of discrete variables is feasible, and aggregates these over all choices. When an aggregated envelope covers all uncertain situations of the CCTPU, the problem is dynamically controllable. However, the algorithm is not complete. Experiments on an existing set of CCTPU benchmarks show that there are cases in which the problem constraints can be satisfied when both discrete and temporal decisions are made dynamically, but not when the discrete variables are assigned statically.