shaper
Opponent Shaping in LLM Agents
Segura, Marta Emili Garcia, Hailes, Stephen, Musolesi, Mirco
Large Language Models (LLMs) are increasingly being deployed as autonomous agents in real-world environments. As these deployments scale, multi-agent interactions become inevitable, making it essential to understand strategic behavior in such systems. A central open question is whether LLM agents, like reinforcement learning agents, can shape the learning dynamics and influence the behavior of others through interaction alone. In this paper, we present the first investigation of opponent shaping (OS) with LLM-based agents. Existing OS algorithms cannot be directly applied to LLMs, as they require higher-order derivatives, face scalability constraints, or depend on architectural components that are absent in transformers. To address this gap, we introduce ShapeLLM, an adaptation of model-free OS methods tailored for transformer-based agents. Using ShapeLLM, we examine whether LLM agents can influence co-players' learning dynamics across diverse game-theoretic environments. We demonstrate that LLM agents can successfully guide opponents toward exploitable equilibria in competitive games (Iterated Prisoner's Dilemma, Matching Pennies, and Chicken) and promote coordination and improve collective welfare in cooperative games (Iterated Stag Hunt and a cooperative version of the Prisoner's Dilemma). Our findings show that LLM agents can both shape and be shaped through interaction, establishing opponent shaping as a key dimension of multi-agent LLM research.
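As a rough illustration of why influencing a co-player can matter in the Iterated Prisoner's Dilemma named above, the sketch below plays fixed, hand-written strategies against each other. It is not ShapeLLM and involves no learning or LLMs. The payoff values (5, 3, 1, 0) are the conventional textbook ones, and the names `play`, `always_defect`, and `tit_for_tat` are hypothetical helpers introduced only for this example.

```python
# Toy Iterated Prisoner's Dilemma with fixed strategies (illustrative only).
# Payoff values are the conventional T=5, R=3, P=1, S=0, assumed here.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def always_defect(opponent_history):
    """Myopic play: defect every round regardless of the co-player."""
    return "D"


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the co-player's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Return the total payoffs (a, b) after the given number of rounds."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each side sees only the other's past moves
        move_b = strategy_b(history_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(always_defect, tit_for_tat))  # (14, 9)
    print(play(tit_for_tat, tit_for_tat))    # (30, 30)
```

Against a co-player whose behavior reacts to ours, always defecting wins the first round but earns less overall than sustained mutual cooperation. Opponent shaping studies the more general case in which the co-player is itself learning.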
Vibration Damping in Underactuated Cable-suspended Artwork -- Flying Belt Motion Control
Goubej, Martin, Clarke, Lauria, Hrabačka, Martin, Tolar, David
This paper presents a comprehensive refurbishment of the interactive robotic art installation Standards and Double Standards by Rafael Lozano-Hemmer. The installation features an array of belts suspended from the ceiling, each actuated by stepper motors and dynamically oriented by a vision-based tracking system that follows the movements of exhibition visitors. The original system was limited by oscillatory dynamics, resulting in torsional and pendulum-like vibrations that constrained rotational speed and reduced interactive responsiveness. To address these challenges, the refurbishment involved significant upgrades to both hardware and motion control algorithms. A detailed mathematical model of the flying belt system was developed to accurately capture its dynamic behavior, providing a foundation for advanced control design. An input shaping method, formulated as a convex optimization problem, was implemented to effectively suppress vibrations, enabling smoother and faster belt movements. Experimental results demonstrate substantial improvements in system performance and audience interaction. This work exemplifies the integration of robotics, control engineering, and interactive art, offering new solutions to technical challenges in real-time motion control and vibration damping for large-scale kinetic installations.
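For readers unfamiliar with input shaping, the sketch below shows the classic two-impulse Zero Vibration (ZV) shaper for a single lightly damped mode. This is a simpler, closed-form technique swapped in for illustration, not the convex-optimization formulation the abstract describes. The function names and the numeric values for natural frequency and damping ratio are assumptions chosen for the example, not parameters of the installation.

```python
import math


def zv_shaper(omega_n, zeta):
    """Two-impulse Zero Vibration shaper as [(time_s, amplitude), ...].

    omega_n: undamped natural frequency in rad/s (assumed known).
    zeta:    damping ratio, 0 <= zeta < 1 (assumed known).
    """
    if omega_n <= 0.0 or not 0.0 <= zeta < 1.0:
        raise ValueError("need omega_n > 0 and 0 <= zeta < 1")
    root = math.sqrt(1.0 - zeta ** 2)
    omega_d = omega_n * root                 # damped natural frequency
    k = math.exp(-zeta * math.pi / root)     # decay over half a damped period
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]


def residual_vibration(impulses, omega_n, zeta):
    """Residual vibration relative to a single unit impulse (0 = cancelled)."""
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)
    c = sum(a * math.exp(zeta * omega_n * t) * math.cos(omega_d * t)
            for t, a in impulses)
    s = sum(a * math.exp(zeta * omega_n * t) * math.sin(omega_d * t)
            for t, a in impulses)
    t_last = impulses[-1][0]
    return math.exp(-zeta * omega_n * t_last) * math.hypot(c, s)


if __name__ == "__main__":
    omega_n = 2.0 * math.pi * 0.5   # hypothetical 0.5 Hz pendulum-like mode
    zeta = 0.02                     # hypothetical light damping
    shaper = zv_shaper(omega_n, zeta)
    print("impulses:", shaper)
    print("residual:", residual_vibration(shaper, omega_n, zeta))
```

Convolving a motion command with these two impulses delays it by half a damped period, and in exchange the second impulse cancels the oscillation excited by the first. An optimization-based shaper can trade off additional constraints, such as robustness to modelling error or multiple modes, that this closed form does not address.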
Opponent Shaping for Antibody Development
Towers, Sebastian, Kalisz, Aleksandra, Robert, Philippe A., Higueruelo, Alicia, Vianello, Francesca, Tsai, Ming-Han Chloe, Steel, Harrison, Foerster, Jakob N.
Anti-viral therapies are typically designed or evolved towards the current strains of a virus. In learning terms, this corresponds to a myopic best response, i.e., not considering the possible adaptive moves of the opponent. However, therapy-induced selective pressures act on viral antigens to drive the emergence of mutated strains, against which initial therapies have reduced efficacy. To motivate our work, we consider antibody designs that target not only the current viral strains but also the wide range of possible future variants that the virus might evolve into under the evolutionary pressure exerted by said antibodies. Building on a computational model of binding between antibodies and viral antigens (the Absolut! framework), we design and implement a genetic simulation of the viral evolutionary escape. Crucially, this allows our antibody optimisation algorithm to consider and influence the entire escape curve of the virus, i.e., to guide (or "shape") the viral evolution. This is inspired by opponent shaping which, in general-sum learning, accounts for the adaptation of the co-player rather than playing a myopic best response. Hence we call the optimised antibodies shapers. Within our simulations, we demonstrate that our shapers target both current and simulated future viral variants, outperforming the antibodies chosen in a myopic way. Furthermore, we show that shapers exert specific evolutionary pressure on the virus compared to myopic antibodies. Altogether, shapers modify the evolutionary trajectories of viral strains and minimise the viral escape compared to their myopic counterparts. While this is a simple model, we hope that our proposed paradigm will enable the discovery of better long-lived vaccines and antibody therapies in the future, enabled by rapid advancements in the capabilities of simulation tools.
Scaling Opponent Shaping to High Dimensional Games
Khan, Akbir, Willi, Timon, Kwan, Newton, Tacchetti, Andrea, Lu, Chris, Grefenstette, Edward, Rocktäschel, Tim, Foerster, Jakob
In multi-agent settings with mixed incentives, methods developed for zero-sum games have been shown to lead to detrimental outcomes. To address this issue, opponent shaping (OS) methods explicitly learn to influence the learning dynamics of co-players and empirically lead to improved individual and collective outcomes. However, OS methods have only been evaluated in low-dimensional environments due to the challenges associated with estimating higher-order derivatives or scaling model-free meta-learning. Alternative methods that scale to more complex settings either converge to undesirable solutions or rely on unrealistic assumptions about the environment or co-players. In this paper, we successfully scale an OS-based approach to general-sum games with temporally-extended actions and long time horizons for the first time. After analysing the representations of the meta-state and history used by previous algorithms, we propose a simplified version called Shaper. We show empirically that Shaper leads to improved individual and collective outcomes in a range of challenging settings from the literature. We further formalize a technique previously implicit in the literature, and analyse its contribution to opponent shaping. We show empirically that this technique is helpful for the functioning of prior methods in certain environments. Lastly, we show that previous environments, such as the CoinGame, are inadequate for analysing temporally-extended general-sum interactions.
Leading the Pack: N-player Opponent Shaping
Souly, Alexandra, Willi, Timon, Khan, Akbir, Kirk, Robert, Lu, Chris, Grefenstette, Edward, Rocktäschel, Tim
Reinforcement learning solutions have had great success in the 2-player general-sum setting. In this setting, the paradigm of Opponent Shaping (OS), in which agents account for the learning of their co-players, has led to agents which are able to avoid collectively bad outcomes, whilst also maximizing their reward. These methods have so far been limited to 2-player games. However, the real world involves interactions with many more agents, with interactions on both local and global scales. In this paper, we extend Opponent Shaping (OS) methods to environments involving multiple co-players and multiple shaping agents. We evaluate on over 4 different environments, varying the number of players from 3 to 5, and demonstrate that model-based OS methods converge to equilibrium with better global welfare than naive learning. However, we find that when playing with a large number of co-players, the relative performance of OS methods declines, suggesting that in the limit OS methods may not perform well. Finally, we explore scenarios where more than one OS method is present, noticing that within games requiring a majority of cooperating agents, OS methods converge to outcomes with poor global welfare.
Ingenious Power Tool Uses Machine Vision to Make Perfect Cuts
Peek into any of the commercial garages dotting San Francisco's Mission District and you'll find a mix of auto body shops and startups working on gleaming black-and-silver gizmos. On an intensely sunny afternoon in April, Shaper's staff rolled up its garage door to reveal a cluster of workbenches--all made by Shaper's handheld woodcutting tool, Origin, which goes on sale today. You may not think of yourself as a woodworker, but the founders of Shaper can change that. The tool is built to take the mystery--and most of the skill--out of cutting even complex shapes from a piece of wood. Grab Origin by the handles, place it on a piece of wood, and start tracing along the edges of the shape on Origin's touchscreen.
A computer-boosted power tool for craftsmen and creators
In a tidy workshop in San Francisco's Mission District, Joe Hebenstreit is surrounded by a curvy wooden chair, a carbon-fiber drone chassis, a beanbag-toss game, a copper bracelet, a carefully cut slab of kitchen counter top -- and the machine his company used to make all of them. Hebenstreit, formerly the lead design engineer of Google Glass computerized eyewear, now is chief executive of Shaper. The nine-person startup on Monday night began sales of a camera-enhanced, computer-guided cutting tool called the Origin. It tracks its own location so it can handle high-precision positioning as it guides you through a job. "It's like autocorrect for your hands," Hebenstreit said.