Agents
Cloud Infrastructure Management in the Age of AI Agents
Yang, Zhenning, Bhatnagar, Archit, Qiu, Yiming, Miao, Tongyuan, Kon, Patrick Tser Jern, Xiao, Yunming, Huang, Yibo, Casado, Martin, Chen, Ang
Cloud infrastructure is the cornerstone of the modern IT industry. However, managing this infrastructure effectively requires considerable manual effort from the DevOps engineering team. We make a case for developing AI agents powered by large language models (LLMs) to automate cloud infrastructure management tasks. In a preliminary study, we investigate the potential for AI agents to use different cloud/user interfaces such as software development kits (SDKs), command line interfaces (CLIs), Infrastructure-as-Code (IaC) platforms, and web portals. We report takeaways on their effectiveness on different management tasks, and identify research challenges and potential solutions.
A Collaborative Process Parameter Recommender System for Fleets of Networked Manufacturing Machines -- with Application to 3D Printing
Wang, Weishi, Guo, Sicong, Jiang, Chenhuan, Elidrisi, Mohamed, Lee, Myungjin, Madhyastha, Harsha V., Kontar, Raed Al, Okwudire, Chinedum E.
Fleets of networked manufacturing machines of the same type, whether collocated or geographically distributed, are growing in popularity. An excellent example is the rise of 3D printing farms, which consist of multiple networked 3D printers operating in parallel, enabling faster production and efficient mass customization. However, optimizing process parameters across a fleet of manufacturing machines, even of the same type, remains a challenge due to machine-to-machine variability. Traditional trial-and-error approaches are inefficient, requiring extensive testing to determine optimal process parameters for an entire fleet. In this work, we introduce a machine learning-based collaborative recommender system that optimizes process parameters for each machine in a fleet by modeling the problem as a sequential matrix completion task. We validate our method using a mini 3D printing farm consisting of ten 3D printers for which we optimize acceleration and speed settings to maximize print quality and productivity. Our approach achieves significantly faster convergence to optimal process parameters compared to non-collaborative matrix completion. Author note: these authors contributed equally to this work as lead authors. Introduction: Manufacturing firms increasingly deploy fleets of machines (e.g., machine tools, industrial robots, or 3D printers) of the same type (i.e., the same make and model) that are connected using a computer network [1]. The machines could be collocated or geographically dispersed.
Reversing the Paradigm: Building AI-First Systems with Human Guidance
Spera, Cosimo, Agrawal, Garima
The relationship between humans and artificial intelligence is no longer science fiction -- it's a growing reality reshaping how we live and work. AI has moved beyond research labs into everyday life, powering customer service chats, personalizing travel, aiding doctors in diagnosis, and supporting educators. What makes this moment particularly compelling is AI's increasing collaborative nature. Rather than replacing humans, AI augments our capabilities -- automating routine tasks, enhancing decisions with data, and enabling creativity in fields like design, music, and writing. The future of work is shifting toward AI agents handling tasks autonomously, with humans as supervisors, strategists, and ethical stewards. This flips the traditional model: instead of humans using AI as a tool, intelligent agents will operate independently within constraints, managing everything from scheduling and customer service to complex workflows. Humans will guide and fine-tune these agents to ensure alignment with goals, values, and context. This shift offers major benefits -- greater efficiency, faster decisions, cost savings, and scalability. But it also brings risks: diminished human oversight, algorithmic bias, security flaws, and a widening skills gap. To navigate this transition, organizations must rethink roles, invest in upskilling, embed ethical principles, and promote transparency. This paper examines the technological and organizational changes needed to enable responsible adoption of AI-first systems -- where autonomy is balanced with human intent, oversight, and values.
EconGym: A Scalable AI Testbed with Diverse Economic Tasks
Mi, Qirui, Yang, Qipeng, Fan, Zijun, Fan, Wentian, Ma, Heyang, Ma, Chengdong, Xia, Siyu, An, Bo, Wang, Jun, Zhang, Haifeng
Artificial intelligence (AI) has become a powerful tool for economic research, enabling large-scale simulation and policy optimization. However, applying AI effectively requires simulation platforms for scalable training and evaluation -- yet existing environments remain limited to simplified, narrowly scoped tasks, falling short of capturing complex economic challenges such as demographic shifts, multi-government coordination, and large-scale agent interactions. To address this gap, we introduce EconGym, a scalable and modular testbed that connects diverse economic tasks with AI algorithms. Grounded in rigorous economic modeling, EconGym implements 11 heterogeneous role types (e.g., households, firms, banks, governments), their interaction mechanisms, and agent models with well-defined observations, actions, and rewards. Users can flexibly compose economic roles with diverse agent algorithms to simulate rich multi-agent trajectories across 25+ economic tasks for AI-driven policy learning and analysis. Experiments show that EconGym supports diverse and cross-domain tasks -- such as coordinating fiscal, pension, and monetary policies -- and enables benchmarking across AI, economic methods, and hybrids. Results indicate that richer task composition and algorithm diversity expand the policy space, while AI agents guided by classical economic methods perform best in complex settings. EconGym also scales to 10k agents with high realism and efficiency.
Wanting to Be Understood Explains the Meta-Problem of Consciousness
Fernando, Chrisantha, Banarse, Dylan, Osindero, Simon
Because we are highly motivated to be understood, we created public external representations -- mime, language, art -- to externalise our inner states. We argue that such external representations are a pre-condition for access consciousness, the global availability of information for reasoning. Yet the bandwidth of access consciousness is tiny compared with the richness of 'raw experience', so no external representation can reproduce that richness in full. Ordinarily an explanation of experience need only let an audience 'grasp' the relevant pattern, not relive the phenomenon. But our drive to be understood is so strong, and our low-level sensorimotor capacities for 'grasping' so rich, that the demand for an explanation of the feel of experience cannot be met in a 'satisfactory' way. That inflated epistemic demand (the preeminence of our expectation that we could be perfectly understood by another or ourselves) -- rather than an irreducible metaphysical gulf -- keeps the hard problem of consciousness alive. But on the plus side, it seems we will simply never give up creating new ways to communicate and think about our experiences. In this view, to be consciously aware is to strive to have one's agency understood by oneself and others.
Modeling Earth-Scale Human-Like Societies with One Billion Agents
Guan, Haoxiang, He, Jiyan, Fan, Liyang, Ren, Zhenzhen, He, Shaobin, Yu, Xin, Chen, Yuan, Zheng, Shuxin, Liu, Tie-Yan, Liu, Zhen
Understanding how complex societal behaviors emerge from individual cognition and interactions requires both high-fidelity modeling of human behavior and large-scale simulations. Traditional agent-based models (ABMs) have been employed to study these dynamics for decades, but are constrained by simplified agent behaviors that fail to capture human complexity. Recent advances in large language models (LLMs) offer new opportunities by enabling agents to exhibit sophisticated social behaviors that go beyond rule-based logic, yet LLM-powered agents face significant scaling challenges. Here we present Light Society, an agent-based simulation framework that advances both fronts, efficiently modeling human-like societies at planetary scale powered by LLMs. Light Society formalizes social processes as structured transitions of agent and environment states, governed by a set of LLM-powered simulation operations, and executed through an event queue. This modular design supports both independent and joint component optimization, enabling efficient simulation of societies with over one billion agents. Large-scale simulations of trust games and opinion propagation -- spanning up to one billion agents -- demonstrate Light Society's high fidelity and efficiency in modeling social trust and information diffusion, while revealing scaling laws whereby larger simulations yield more stable and realistic emergent behaviors.
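The event-queue execution model described above can be illustrated with a minimal sketch. Everything in it is hypothetical and is not taken from Light Society: the `Agent` class, the `stub_operation` function (a fixed rule standing in for an LLM-powered operation), and the three-agent chain are assumptions chosen only to show state transitions being driven by a time-ordered queue.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Agent:
    """Hypothetical agent state: one scalar opinion plus neighbor ids."""
    opinion: float
    neighbors: list = field(default_factory=list)


def stub_operation(sender: Agent, receiver: Agent) -> float:
    """Stand-in for an LLM-powered operation (assumption, not the real system):
    the receiver moves halfway toward the sender's opinion."""
    return receiver.opinion + 0.5 * (sender.opinion - receiver.opinion)


def run(agents, initial_events, max_events=100):
    """Process (time, sender, receiver) events in time order.

    Each event is a state transition on the receiver; the receiver then
    schedules follow-up events to its other neighbors one time step later.
    """
    tie = count()  # tie-breaker so the heap never compares beyond (time, seq)
    queue = [(t, next(tie), s, r) for t, s, r in initial_events]
    heapq.heapify(queue)
    processed = 0
    while queue and processed < max_events:
        time, _, s, r = heapq.heappop(queue)
        agents[r].opinion = stub_operation(agents[s], agents[r])
        processed += 1
        for n in agents[r].neighbors:
            if n != s:  # do not echo straight back to the sender
                heapq.heappush(queue, (time + 1, next(tie), r, n))
    return processed


# A three-agent chain 0 - 1 - 2; agent 0 starts with opinion 1.0.
agents = {
    0: Agent(1.0, [1]),
    1: Agent(0.0, [0, 2]),
    2: Agent(0.0, [1]),
}
processed = run(agents, [(0, 0, 1)])
print(processed, [agents[i].opinion for i in range(3)])
```

In this toy run the opinion propagates down the chain and the queue drains after two events. A real system would replace `stub_operation` with model calls and would need batching and scheduling far beyond this sketch.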
EMERGENT: Efficient and Manipulation-resistant Matching using GFlowNets
Tasnim, Mayesha, Acar, Erman, Ghebreab, Sennay
The design of fair and efficient algorithms for allocating public resources, such as school admissions, housing, or medical residency, has a profound social impact. In one-sided matching problems, where individuals are assigned to items based on ranked preferences, a fundamental trade-off exists between efficiency and strategyproofness. Existing algorithms like Random Serial Dictatorship (RSD), Probabilistic Serial (PS), and Rank Minimization (RM) capture only one side of this trade-off: RSD is strategyproof but inefficient, while PS and RM are efficient but incentivize manipulation. We propose EMERGENT, a novel application of Generative Flow Networks (GFlowNets) to one-sided matching, leveraging their ability to sample diverse, high-reward solutions. In our approach, efficient and manipulation-resistant matches emerge naturally: high-reward solutions yield efficient matches, while the stochasticity of GFlowNets-based outputs reduces incentives for manipulation. Experiments show that EMERGENT outperforms RSD in rank efficiency while significantly reducing strategic vulnerability compared to matches produced by RM and PS. Our work highlights the potential of GFlowNets for applications involving social choice mechanisms, where it is crucial to balance efficiency and manipulability.
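For readers unfamiliar with the RSD baseline mentioned above, here is a minimal sketch of Random Serial Dictatorship for one-sided matching. It illustrates only the baseline, not EMERGENT itself; the agent names, item names, and the `average_rank` helper are hypothetical and chosen for illustration.

```python
import random


def random_serial_dictatorship(preferences, items, rng=None):
    """Assign items by letting agents pick in a uniformly random order.

    preferences: dict mapping agent -> list of items, most preferred first.
    Each agent in turn receives its highest-ranked item still available.
    """
    rng = rng or random.Random()
    order = list(preferences)
    rng.shuffle(order)  # the random "dictator" ordering
    remaining = set(items)
    assignment = {}
    for agent in order:
        for item in preferences[agent]:
            if item in remaining:
                assignment[agent] = item
                remaining.remove(item)
                break
    return assignment


def average_rank(assignment, preferences):
    """Mean 1-based rank of each agent's assigned item (lower is better)."""
    ranks = [preferences[a].index(i) + 1 for a, i in assignment.items()]
    return sum(ranks) / len(ranks)


# Hypothetical example with three agents and three items.
prefs = {
    "alice": ["a", "b", "c"],
    "bob": ["a", "c", "b"],
    "carol": ["b", "a", "c"],
}
match = random_serial_dictatorship(prefs, ["a", "b", "c"], random.Random(0))
print(match, average_rank(match, prefs))
```

Because each agent simply takes its best remaining item when its turn comes, misreporting preferences cannot help it, which is the strategyproofness property the abstract refers to. The outcome depends on the random order, so average rank varies from draw to draw.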
Towards Full-Scenario Safety Evaluation of Automated Vehicles: A Volume-Based Method
Zhou, Hang, Ma, Chengyuan, Shen, Shiyu, Liang, Zhaohui, Li, Xiaopeng
With the rapid development of automated vehicles (AVs) in recent years, commercially available AVs are increasingly demonstrating high-level automation capabilities. However, most existing AV safety evaluation methods are primarily designed for simple maneuvers such as car-following and lane-changing. While suitable for basic tests, these methods are insufficient for assessing high-level automation functions deployed in more complex environments. First, these methods typically use crash rate as the evaluation metric, whose accuracy heavily depends on the quality and completeness of the naturalistic driving data used to estimate scenario probabilities. Such data is often difficult and expensive to collect. Second, when applied to diverse scenarios, these methods suffer from the curse of dimensionality, making large-scale evaluation computationally intractable. To address these challenges, this paper proposes a novel framework for full-scenario AV safety evaluation. A unified model is first introduced to standardize the representation of diverse driving scenarios. This modeling approach constrains the dimension of most scenarios to a regular highway setting with three lanes and six surrounding background vehicles, significantly reducing dimensionality. To further avoid the limitations of probability-based methods, we propose a volume-based evaluation method that quantifies the proportion of risky scenarios within the entire scenario space. For car-following scenarios, we prove that the set of safe scenarios is convex under specific settings, enabling exact volume computation. Experimental results validate the effectiveness of the proposed volume-based method using both AV behavior models from existing literature and six production AV models calibrated from field-test trajectory data in the Ultra-AV dataset. Code and data will be made publicly available upon acceptance of this paper.
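The idea of measuring the proportion of risky scenarios within a scenario space can be illustrated with a toy sketch. It is not the paper's method: the paper computes exact volumes for convex safe sets, whereas this sketch swaps in a Monte Carlo estimate, and the two-dimensional scenario space, the stopping-distance risk rule, and all parameter values (`TAU`, `DECEL`, the ranges) are assumptions made only for illustration.

```python
import random

# Hypothetical 2-D car-following scenario space (all values are assumptions):
#   gap   in [0, 120] m   : initial distance to a stopped lead vehicle
#   speed in [0, 30]  m/s : initial speed of the following vehicle
GAP_MAX = 120.0
SPEED_MAX = 30.0
TAU = 1.0    # assumed reaction time in seconds
DECEL = 6.0  # assumed braking deceleration in m/s^2


def is_risky(gap: float, speed: float) -> bool:
    """Toy risk rule: risky if the stopping distance exceeds the gap."""
    stopping_distance = speed * TAU + speed ** 2 / (2.0 * DECEL)
    return stopping_distance > gap


def risky_volume_fraction(n_samples: int, rng: random.Random) -> float:
    """Estimate the share of the scenario space that is risky by sampling
    scenarios uniformly; no scenario probabilities are needed."""
    hits = 0
    for _ in range(n_samples):
        gap = rng.uniform(0.0, GAP_MAX)
        speed = rng.uniform(0.0, SPEED_MAX)
        if is_risky(gap, speed):
            hits += 1
    return hits / n_samples


estimate = risky_volume_fraction(100_000, random.Random(0))
print(f"estimated risky fraction: {estimate:.3f}")
```

Under these assumed values the risky region has area 1200 (the integral of v + v^2/12 over [0, 30]) out of a total of 3600, so the exact fraction is 1/3 and the estimate should land close to it. The metric depends only on the geometry of the scenario space, which is why it sidesteps the need for naturalistic driving data to estimate scenario probabilities.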
Requirements for Recognition and Rapid Response to Unfamiliar Events Outside of Agent Design Scope
Wray, Robert E., Jones, Steven J., Laird, John E.
Regardless of past learning, an agent in an open world will face unfamiliar events outside of prior experience, existing models, or policies. Further, the agent will sometimes lack relevant knowledge and/or sufficient time to assess the situation and evaluate response options. How can an agent respond reasonably to situations that are outside of its original design scope? How can it recognize such situations sufficiently quickly and reliably to determine reasonable, adaptive courses of action? We identify key characteristics needed for solutions, review the state-of-the-art, and outline a proposed, novel approach that combines domain-general meta-knowledge (inspired by human cognition) and metareasoning. This approach offers potential for fast, adaptive responses to unfamiliar situations, more fully meeting the performance characteristics required for open-world, general agents.
Towards a Formal Specification for Self-organized Shape Formation in Swarm Robotics
The self-organization of robots for the formation of structures and shapes is a stimulating application of swarm robotic systems. It involves a large number of autonomous robots with heterogeneous behavior, coordination among them, and their interaction with a dynamic environment. This process of complex structure formation can be regarded as a complex system, which needs to be modeled using a suitable modeling approach. Although the formal specification approach, along with other formal methods, has been used to model the behavior of robots in a swarm, to the best of our knowledge it has not been used to model the self-organization process in swarm robotic systems for shape formation. In this paper, we use a formal specification approach to model the shape formation task of swarm robots. We use the Z (Zed) formal specification language, which is a state-based language, to model the states of the entities of the system. We demonstrate the effectiveness of Z for self-organized shape formation. The presented formal specification model gives an outline for designing and implementing a swarm robotic system for the formation of complex shapes and structures. It also provides a foundation for modeling the complex shape formation process for swarm robotics using a multi-agent system in a simulation-based environment. Keywords: Swarm robotics, Self-organization, Formal specification, Complex systems