Milzman, Jesse
Policies with Sparse Inter-Agent Dependencies in Dynamic Games: A Dynamic Programming Approach
Liu, Xinjie, Li, Jingqi, Fotiadis, Filippos, Karabag, Mustafa O., Milzman, Jesse, Fridovich-Keil, David, Topcu, Ufuk
Common feedback strategies in multi-agent dynamic games require all players' state information to compute control strategies. However, in real-world scenarios, sensing and communication limitations between agents make full state feedback expensive or impractical, and such strategies can become fragile when state information from other agents is inaccurate. To this end, we propose a regularized dynamic programming approach for finding sparse feedback policies that selectively depend on the states of a subset of agents in dynamic games. The proposed approach solves convex adaptive group Lasso problems to compute sparse policies approximating Nash equilibrium solutions. We prove the regularized solutions' asymptotic convergence to a neighborhood of Nash equilibrium policies in linear-quadratic (LQ) games. We extend the proposed approach to general non-LQ games via an iterative algorithm. Empirical results in multi-robot interaction scenarios show that the proposed approach effectively computes feedback policies with varying sparsity levels. When agents have noisy observations of other agents' states, simulation results indicate that the proposed regularized policies consistently achieve lower costs than standard Nash equilibrium policies by up to 77% for all interacting agents whose costs are coupled with other agents' states.
Smooth Information Gathering in Two-Player Noncooperative Games
Palafox, Fernando, Milzman, Jesse, Lee, Dong Ho, Park, Ryan, Fridovich-Keil, David
We present a mathematical framework for modeling two-player noncooperative games in which one player (the defender) is uncertain of the costs of the game and the second player's (the attacker's) intention but can preemptively allocate information-gathering resources to reduce this uncertainty. We obtain the defender's decisions by solving a two-stage problem. In Stage 1, the defender allocates information-gathering resources, and in Stage 2, the information-gathering resources output a signal that informs the defender about the costs of the game and the attacker's intent, and then both players play a noncooperative game. We provide a gradient-based algorithm to solve the two-stage game and apply this framework to a tower-defense game which can be interpreted as a variant of a Colonel Blotto game with smooth payoff functions and uncertainty over battlefield valuations. Finally, we analyze how optimal decisions shift with changes in information-gathering allocations and perturbations in the cost functions.
Measuring Multi-Source Redundancy in Factor Graphs
Milzman, Jesse, Harrison, Andre, Nieto-Granda, Carlos, Rogers, John
Factor graphs are a ubiquitous tool for multi-source inference in robotics and multi-sensor networks. They allow for heterogeneous measurements from many sources to be concurrently represented as factors in the state posterior distribution, so that inference can be conducted via sparse graphical methods. Adding measurements from many sources can supply robustness to state estimation, as seen in distributed pose graph optimization. However, adding excessive measurements to a factor graph can also quickly degrade their performance as more cycles are added to the graph. In both situations, the relevant quality is the redundancy of information. Drawing on recent work in information theory on partial information decomposition (PID), we articulate two potential definitions of redundancy in factor graphs, both within a common axiomatic framework for redundancy in factor graphs. This is the first application of PID to factor graphs, and only one of a few presenting quantitative measures of redundancy for them.
Decentralized core-periphery structure in social networks accelerates cultural innovation in agent-based model
Milzman, Jesse, Moser, Cody
Drawing on differing notions of core-periphery structure From a broad perspective, innovation is understood as a form of from [21] and [2], we distinguish decentralized core-periphery, collective problem-solving. For this and other reasons, the process centralized core-periphery, and affinity network structure. We generate of innovation is understood as a social process, as social collectives networks of these three classes from stochastic block models are capable of, and in some cases optimized for, both retaining (SBMs), and use them to run an agent-based model (ABM) of collective the knowledge of previous generations while building upon this cultural innovation, in which agents can only directly interact knowledge for subsequent innovations, a phenemonon we refer to with their network neighbors. In order to discover the highestscoring as "cumulative" culture [18, 19]. Human social networks tend to innovation, agents must discover and combine the highest exhibit core-periphery structures, whereby a'core' population is innovations from two completely parallel technology trees. We find heavily inter-connected, and connected in turn to more'peripheral' that decentralized core-periphery networks outperform both centralized individuals and subcommunities [2]. Prior work on the structure core-periphery networks and affinity networks, in terms of of human networks has suggested that innovation emerges at the mean crossover time for this final innovation. We hypothesize that boundary between the core and periphery of creative networks [4, decentralized core-periphery network structure provides a more 6]. Individual innovators are often in an intermediate position with fruitful environment for collective problem-solving, by allowing many core and peripheral connections, and successfully innovative for the relative shielding of periphery nodes from the optimal innovations teams tend to include both core and peripheral individuals [4].