Park, Shinkyu
Optimal Strategy Revision in Population Games: A Mean Field Game Theory Perspective
Barreiro-Gomez, Julian, Park, Shinkyu
This paper investigates the design of optimal strategy revision in Population Games (PG) by establishing its connection to finite-state Mean Field Games (MFG). Specifically, by linking Evolutionary Dynamics (ED) -- which models agent decision-making in PG -- to the MFG framework, we demonstrate that optimal strategy revision can be derived by solving the forward Fokker-Planck (FP) equation and the backward Hamilton-Jacobi (HJ) equation, both central components of the MFG framework. Furthermore, we show that the resulting optimal strategy revision satisfies two key properties: positive correlation and Nash stationarity, which are essential for ensuring convergence to the Nash equilibrium. This convergence is then rigorously analyzed and established. Additionally, we discuss how different design objectives for the optimal strategy revision can recover existing ED models previously reported in the PG literature. Numerical examples are provided to illustrate the effectiveness and improved convergence properties of the optimal strategy revision design.
Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation
Bezerra, Lucas C. D., Santos, Ataíde M. G. dos, Park, Shinkyu
We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA). Our approach extends Multi-Agent Proximal Policy Optimization (MAPPO) by incorporating spatial action maps, robot motion control, task allocation revision, and intention sharing to enable effective coalition formation. Extensive simulations demonstrate that our model significantly outperforms existing methods, including a market-based baseline. Furthermore, we assess the scalability and generalizability of the proposed framework, highlighting its ability to handle large robot populations and adapt to diverse task allocation environments.
TAB-Fields: A Maximum Entropy Framework for Mission-Aware Adversarial Planning
Puthumanaillam, Gokul, Song, Jae Hyuk, Yesmagambet, Nurzhan, Park, Shinkyu, Ornik, Melkior
Autonomous agents operating in adversarial scenarios face a fundamental challenge: while they may know their adversaries' high-level objectives, such as reaching specific destinations within time constraints, the exact policies these adversaries will employ remain unknown. Traditional approaches address this challenge by treating the adversary's state as a partially observable element, leading to a formulation as a Partially Observable Markov Decision Process (POMDP). However, the induced belief-space dynamics in a POMDP require knowledge of the system's transition dynamics, which, in this case, depend on the adversary's unknown policy. Our key observation is that while an adversary's exact policy is unknown, their behavior is necessarily constrained by their mission objectives and the physical environment, allowing us to characterize the space of possible behaviors without assuming specific policies. In this paper, we develop Task-Aware Behavior Fields (TAB-Fields), a representation that captures adversary state distributions over time by computing the most unbiased probability distribution consistent with known constraints. We construct TAB-Fields by solving a constrained optimization problem that minimizes additional assumptions about adversary behavior beyond mission and environmental requirements. We integrate TAB-Fields with standard planning algorithms by introducing TAB-conditioned POMCP, an adaptation of Partially Observable Monte Carlo Planning. Through experiments in simulation with underwater robots and hardware implementations with ground robots, we demonstrate that our approach achieves superior performance compared to baselines that either assume specific adversary policies or neglect mission constraints altogether. Evaluation videos and code are available at https://tab-fields.github.io.
Subassembly to Full Assembly: Effective Assembly Sequence Planning through Graph-based Reinforcement Learning
Shu, Chang, Kim, Anton, Park, Shinkyu
This paper proposes an assembly sequence planning framework, named Subassembly to Assembly (S2A). The framework is designed to enable a robotic manipulator to assemble multiple parts in a prespecified structure by leveraging object manipulation actions. The primary technical challenge lies in the exponentially increasing complexity of identifying a feasible assembly sequence as the number of parts grows. To address this, we introduce a graph-based reinforcement learning approach, where a graph attention network is trained using a delayed reward assignment strategy. In this strategy, rewards are assigned only when an assembly action contributes to the successful completion of the assembly task. We validate the framework's performance through physics-based simulations, comparing it against various baselines to emphasize the significance of the proposed reward assignment approach. Additionally, we demonstrate the feasibility of deploying our framework in a real-world robotic assembly scenario.
Learning Equilibrium with Estimated Payoffs in Population Games
Park, Shinkyu
We study a multi-agent decision problem in population games, where agents select from multiple available strategies and continually revise their selections based on the payoffs associated with these strategies. Unlike conventional population game formulations, we consider a scenario where agents must estimate the payoffs through local measurements and communication with their neighbors. By employing task allocation games -- dynamic extensions of conventional population games -- we examine how errors in payoff estimation by individual agents affect the convergence of the strategy revision process. Our main contribution is an analysis of how estimation errors impact the convergence of the agents' strategy profile to equilibrium. Based on the analytical results, we propose a design for a time-varying strategy revision rate to guarantee convergence. Simulation studies illustrate how the proposed method for updating the revision rate facilitates convergence to equilibrium.
Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning
Abugurain, Mohammed, Park, Shinkyu
This paper presents a framework that can interpret humans' navigation commands containing temporal elements and directly translate their natural language instructions into robot motion planning. Central to our framework is utilizing Large Language Models (LLMs). To enhance the reliability of LLMs in the framework and improve user experience, we propose methods to resolve the ambiguity in natural language instructions and capture user preferences. The process begins with an ambiguity classifier, identifying potential uncertainties in the instructions. Ambiguous statements trigger a GPT-4-based mechanism that generates clarifying questions, incorporating user responses for disambiguation. Also, the framework assesses and records user preferences for non-ambiguous instructions, enhancing future interactions. The last part of this process is the translation of disambiguated instructions into a robot motion plan using Linear Temporal Logic. This paper details the development of this framework and the evaluation of its performance in various test scenarios.
Threshold Decision-Making Dynamics Adaptive to Physical Constraints and Changing Environment
Amorim, Giovanna, Santos, María, Park, Shinkyu, Franci, Alessio, Leonard, Naomi Ehrich
We propose a threshold decision-making framework for controlling the physical dynamics of an agent switching between two spatial tasks. Our framework couples a nonlinear opinion dynamics model that represents the evolution of an agent's preference for a particular task with the physical dynamics of the agent. We prove the bifurcation that governs the behavior of the coupled dynamics. We show by means of the bifurcation behavior how the coupled dynamics are adaptive to the physical constraints of the agent. We also show how the bifurcation can be modulated to allow the agent to switch tasks based on thresholds adaptive to environmental conditions. We illustrate the benefits of the approach through a decentralized multi-robot task allocation application for trash collection.
Distributed Multi-Robot Multi-Target Tracking Using Heterogeneous Limited-Range Sensors
Chen, Jun, Abugurain, Mohammed, Dames, Philip, Park, Shinkyu
This paper presents a cooperative multi-robot multi-target tracking framework aimed at enhancing the efficiency of the heterogeneous sensor network and, consequently, improving overall target tracking accuracy. The concept of normalized unused sensing capacity is introduced to quantify the information a sensor is currently gathering relative to its theoretical maximum. This measurement can be computed using entirely local information and is applicable to various sensor models, distinguishing it from previous literature on the subject. It is then utilized to develop a distributed coverage control strategy for a heterogeneous sensor network, adaptively balancing the workload based on each sensor's current unused capacity. The algorithm is validated through a series of ROS and MATLAB simulations, demonstrating superior results compared to standard approaches that do not account for heterogeneity or current usage rates.
Proactive Opinion-Driven Robot Navigation around Human Movers
Cathcart, Charlotte, Santos, María, Park, Shinkyu, Leonard, Naomi Ehrich
Abstract-- We propose, analyze, and experimentally verify a new proactive approach for robot social navigation driven by the robot's "opinion" for which way and by how much to pass human movers crossing its path. The robot forms an opinion over time according to nonlinear dynamics that depend on the robot's observations of human movers and its level of attention to these social cues. For these dynamics, it is guaranteed that when the robot's attention is greater than a critical value, deadlock in decision making is broken, and the robot rapidly forms a strong opinion, passing each human A robot using opinion-driven navigation to pass two humans. Autonomous mobile robots are increasingly being used Once the robot passes a human, its opinion with respect to for tasks in settings such as warehouses and open public that human is no longer relevant; the opinion quickly returns spaces where they will encounter human movers. In order to its neutral value, allowing the robot to continue towards to accomplish their tasks in these settings, the robots need its destination. Likewise, the robot's attention also goes to to reliably and gracefully navigate around human movers.