Goto

Collaborating Authors

 marginal cost


Can Vibe Coding Beat Graduate CS Students? An LLM vs. Human Coding Tournament on Market-driven Strategic Planning

Danassis, Panayiotis, Goel, Naman

arXiv.org Artificial Intelligence

The rapid proliferation of Large Language Models (LLMs) has revolutionized AI-assisted code generation. This rapid development of LLMs has outpaced our ability to properly benchmark them. Prevailing benchmarks emphasize unit-test pass rates and syntactic correctness. Such metrics understate the difficulty of many real-world problems that require planning, optimization, and strategic interaction. We introduce a multi-agent reasoning-driven benchmark based on a real-world logistics optimization problem (Auction, Pickup, and Delivery Problem) that couples competitive auctions with capacity-constrained routing. The benchmark requires building agents that can (i) bid strategically under uncertainty and (ii) optimize planners that deliver tasks while maximizing profit. We evaluate 40 LLM-coded agents (by a wide range of state-of-the-art LLMs under multiple prompting methodologies, including vibe coding) against 17 human-coded agents developed before the advent of LLMs. Our results over 12 double all-play-all tournaments and $\sim 40$k matches demonstrate (i) a clear superiority of human(graduate students)-coded agents: the top 5 spots are consistently won by human-coded agents, (ii) the majority of LLM-coded agents (33 out of 40) are beaten by very simple baselines, and (iii) given the best human solution as an input and prompted to improve upon, the best performing LLM makes the solution significantly worse instead of improving it. Our results highlight a gap in LLMs' ability to produce code that works competitively in the real-world, and motivate new evaluations that emphasize reasoning-driven code synthesis in real-world scenarios.


Many-Eyes and Sentinels in Selfish and Cooperative Groups

Pilgrim, Charlie, Bate, Andrew M, Sigalou, Anna, Aellen, Mélisande, Morford, Joe, Warren, Elizabeth, Krupenye, Christopher, Biro, Dora, Mann, Richard P

arXiv.org Artificial Intelligence

Collective vigilance describes how animals in groups benefit from the predator detection efforts of others. Empirical observations typically find either a many-eyes strategy with all (or many) group members maintaining a low level of individual vigilance, or a sentinel strategy with one (or a few) individuals maintaining a high level of individual vigilance while others do not. With a general analytical treatment that makes minimal assumptions, we show that these two strategies are alternate solutions to the same adaptive problem of balancing the costs of predation and vigilance. Which strategy is preferred depends on how costs scale with the level of individual vigilance: many-eyes strategies are preferred where costs of vigilance rise gently at low levels but become steeper at higher levels (convex; e.g. an open field); sentinel strategies are preferred where costs of vigilance rise steeply at low levels and then flatten out (concave; e.g. environments with vantage points). This same dichotomy emerges whether individuals act selfishly to optimise their own fitness or cooperatively to optimise group fitness. The model is extended to explain discrete behavioural switching between strategies and differential levels of vigilance such as edge effects.


Sovereign AI: Rethinking Autonomy in the Age of Global Interdependence

Singh, Shalabh Kumar, Sengupta, Shubhashis

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is emerging as a foundational general-purpose technology, raising new dilemmas of sovereignty in an interconnected world. While governments seek greater control over it, the very foundations of AI--global data pipelines, semiconductor supply chains, open-source ecosystems, and international standards--resist enclosure. This paper develops a conceptual and formal framework for understanding sovereign AI as a continuum rather than a binary condition, balancing autonomy with interdependence. Drawing on classical theories, historical analogies, and contemporary debates on networked autonomy, we present a planner's model that identifies two policy heuristics: equalizing marginal returns across the four sovereignty pillars and setting openness where global benefits equal exposure risks. We apply the model to India, highlighting sovereign footholds in data, compute, and norms but weaker model autonomy. The near-term challenge is integration via coupled Data x Compute investment, lifecycle governance (ModelOps), and safeguarded procurement. We then apply the model to the Middle East (Saudi Arabia and the UAE), where large public investment in Arabic-first models and sovereign cloud implies high sovereignty weights, lower effective fiscal constraints, and strong Data x Compute complementarities. An interior openness setting with guardrails emerges as optimal. Across contexts, the lesson is that sovereignty in AI needs managed interdependence, not isolation.


HOB: A Holistically Optimized Bidding Strategy under Heterogeneous Auction Mechanisms with Organic Traffic

Li, Qi, Huang, Wendong, Ye, Qichen, Xu, Wutong, Wang, Cheems, Bai, Rongquan, Yuan, Wei, Wang, Guan, Yu, Chuan, Xu, Jian

arXiv.org Artificial Intelligence

The E-commerce advertising platforms typically sell commercial traffic through either second-price auction (SPA) or first-price auction (FPA). SPA was historically prevalent due to its dominant strategy incentive-compatible (DSIC) for bidders with quasi-linear utilities, especially when budgets are not a binding constraint, while FPA has gained more prominence for offering higher revenue potential to publishers and avoiding the possibility for discriminatory treatment in personalized reserve prices. Meanwhile, on the demand side, advertisers are increasingly adopting platform-wide marketing solutions akin to QuanZhanTui, shifting from spending budgets solely on commercial traffic to bidding on the entire traffic for the purpose of maximizing overall sales. For automated bidding systems, such a trend poses a critical challenge: determining optimal strategies across heterogeneous auction channels to fulfill diverse advertiser objectives, such as maximizing return (MaxReturn) or meeting target return on ad spend (TargetROAS). To overcome this challenge, this work makes two key contributions. First, we derive an efficient solution for optimal bidding under FPA channels, which takes into account the presence of organic traffic - traffic can be won for free. Second, we introduce a marginal cost alignment (MCA) strategy that provably secures bidding efficiency across heterogeneous auction mechanisms. To validate performance of our developed framework, we conduct comprehensive offline experiments on public datasets and large-scale online A/B testing, which demonstrate consistent improvements over existing methods.


Autonomous vehicles need social awareness to find optima in multi-agent reinforcement learning routing games

Psarou, Anastasia, Gorczyca, Łukasz, Gaweł, Dominik, Kucharski, Rafał

arXiv.org Artificial Intelligence

Previous work has shown that when multiple selfish Autonomous Vehicles (AVs) are introduced to future cities and start learning optimal routing strategies using Multi-Agent Reinforcement Learning (MARL), they may destabilize traffic systems, as they would require a significant amount of time to converge to the optimal solution, equivalent to years of real-world commuting. We demonstrate that moving beyond the selfish component in the reward significantly relieves this issue. If each AV, apart from minimizing its own travel time, aims to reduce its impact on the system, this will be beneficial not only for the system-wide performance but also for each individual player in this routing game. By introducing an intrinsic reward signal based on the marginal cost matrix, we significantly reduce training time and achieve convergence more reliably. Marginal cost quantifies the impact of each individual action (route-choice) on the system (total travel time). Including it as one of the components of the reward can reduce the degree of non-stationarity by aligning agents' objectives. Notably, the proposed counterfactual formulation preserves the system's equilibria and avoids oscillations. Our experiments show that training MARL algorithms with our novel reward formulation enables the agents to converge to the optimal solution, whereas the baseline algorithms fail to do so. We show these effects in both a toy network and the real-world network of Saint-Arnoult. Our results optimistically indicate that social awareness (i.e., including marginal costs in routing decisions) improves both the system-wide and individual performance of future urban systems with AVs.




LLMs for Supply Chain Management

Wang, Haojie, Jiang, Jiuyun, Hong, L. Jeff, Jiang, Guangxin

arXiv.org Artificial Intelligence

The development of large language models (LLMs) has provided new tools for research in supply chain management (SCM). In this paper, we introduce a retrieval-augmented generation (RAG) framework that dynamically integrates external knowledge into the inference process, and develop a domain-specialized SCM LLM, which demonstrates expert-level competence by passing standardized SCM examinations and beer game tests. We further employ the use of LLMs to conduct horizontal and vertical supply chain games, in order to analyze competition and cooperation within supply chains. Our experiments show that RAG significantly improves performance on SCM tasks. Moreover, game-theoretic analysis reveals that the LLM can reproduce insights from the classical SCM literature, while also uncovering novel behaviors and offering fresh perspectives on phenomena such as the bullwhip effect. This paper opens the door for exploring cooperation and competition for complex supply chain network through the lens of LLMs.


How competitive are pay-as-bid auction games?

Vanelli, Martina, Como, Giacomo, Fagnani, Fabio

arXiv.org Artificial Intelligence

We study the pay-as-bid auction game, a supply function model with discriminatory pricing and asymmetric firms. In this game, strategies are non-decreasing supply functions relating pric to quantity and the exact choice of the strategy space turns out to be a crucial issue: when it includes all non-decreasing continuous functions, pure-strategy Nash equilibria often fail to exist. To overcome this, we restrict the strategy space to the set of Lipschitz-continuous functions and we prove that Nash equilibria always exist (under standard concavity assumptions) and consist of functions that are affine on their own support and have slope equal to the maximum allowed Lipschitz constant. We further show that the Nash equilibrium is unique up to the market-clearing price when the demand is affine and the asymmetric marginal production costs are homogeneous in zero. For quadratic production costs, we derive a closed-form expression and we compute the limit as the allowed Lipschitz constant grows to infinity. Our results show that in the limit the pay-as-bid auction game achieves perfect competition with efficient allocation and induces a lower market-clearing price compared to supply function models based on uniform price auctions.


Strategic Collusion of LLM Agents: Market Division in Multi-Commodity Competitions

Lin, Ryan Y., Ojha, Siddhartha, Cai, Kevin, Chen, Maxwell F.

arXiv.org Artificial Intelligence

Machine-learning technologies are seeing increased deployment in real-world market scenarios. In this work, we explore the strategic behaviors of large language models (LLMs) when deployed as autonomous agents in multi-commodity markets, specifically within Cournot competition frameworks. We examine whether LLMs can independently engage in anti-competitive practices such as collusion or, more specifically, market division. Our findings demonstrate that LLMs can effectively monopolize specific commodities by dynamically adjusting their pricing and resource allocation strategies, thereby maximizing profitability without direct human input or explicit collusion commands. These results pose unique challenges and opportunities for businesses looking to integrate AI into strategic roles and for regulatory bodies tasked with maintaining fair and competitive markets. The study provides a foundation for further exploration into the ramifications of deferring high-stakes decisions to LLM-based agents.