Hu, Zechen
Deconfounding Time Series Forecasting
Gao, Wentao, Yang, Feiyu, Hong, Mengze, Du, Xiaojing, Hu, Zechen, Chen, Xiongren, Xu, Ziqi
Time series forecasting is a critical task in various domains, where accurate predictions can drive informed decision-making. Traditional forecasting methods often rely on current observations of variables to predict future outcomes, typically overlooking the influence of latent confounders, unobserved variables that simultaneously affect both the predictors and the target outcomes. This oversight can introduce bias and degrade the performance of predictive models. In this study, we address this challenge by proposing an enhanced forecasting approach that incorporates representations of latent confounders derived from historical data. By integrating these confounders into the predictive process, our method aims to improve the accuracy and robustness of time series forecasts. The proposed approach is demonstrated through its application to climate science data, showing significant improvements over traditional methods that do not account for confounders.
Learning Coordinated Maneuver in Adversarial Environments
Hu, Zechen, Limbu, Manshi, Shishika, Daigo, Xiao, Xuesu, Wang, Xuan
This paper aims to solve the coordination of a team of robots traversing a route in the presence of adversaries with random positions. Our goal is to minimize the overall cost of the team, which is determined by (i) the accumulated risk when robots stay in adversary-impacted zones and (ii) the mission completion time. During traversal, robots can reduce their speed and act as a `guard' (the slower, the better), which will decrease the risks certain adversary incurs. This leads to a trade-off between the robots' guarding behaviors and their travel speeds. The formulated problem is highly non-convex and cannot be efficiently solved by existing algorithms. Our approach includes a theoretical analysis of the robots' behaviors for the single-adversary case. As the scale of the problem expands, solving the optimal solution using optimization approaches is challenging, therefore, we employ reinforcement learning techniques by developing new encoding and policy-generating methods. Simulations demonstrate that our learning methods can efficiently produce team coordination behaviors. We discuss the reasoning behind these behaviors and explain why they reduce the overall team cost.
Bi-CL: A Reinforcement Learning Framework for Robots Coordination Through Bi-level Optimization
Hu, Zechen, Shishika, Daigo, Xiao, Xuesu, Wang, Xuan
In multi-robot systems, achieving coordinated missions remains a significant challenge due to the coupled nature of coordination behaviors and the lack of global information for individual robots. To mitigate these challenges, this paper introduces a novel approach, Bi-level Coordination Learning (Bi-CL), that leverages a bi-level optimization structure within a centralized training and decentralized execution paradigm. Our bi-level reformulation decomposes the original problem into a reinforcement learning level with reduced action space, and an imitation learning level that gains demonstrations from a global optimizer. Both levels contribute to improved learning efficiency and scalability. We note that robots' incomplete information leads to mismatches between the two levels of learning models. To address this, Bi-CL further integrates an alignment penalty mechanism, aiming to minimize the discrepancy between the two levels without degrading their training efficiency. We introduce a running example to conceptualize the problem formulation and apply Bi-CL to two variations of this example: route-based and graph-based scenarios. Simulation results demonstrate that Bi-CL can learn more efficiently and achieve comparable performance with traditional multi-agent reinforcement learning baselines for multi-robot coordination.
Scaling Team Coordination on Graphs with Reinforcement Learning
Limbu, Manshi, Hu, Zechen, Wang, Xuan, Shishika, Daigo, Xiao, Xuesu
This paper studies Reinforcement Learning (RL) techniques to enable team coordination behaviors in graph environments with support actions among teammates to reduce the costs of traversing certain risky edges in a centralized manner. While classical approaches can solve this non-standard multi-agent path planning problem by converting the original Environment Graph (EG) into a Joint State Graph (JSG) to implicitly incorporate the support actions, those methods do not scale well to large graphs and teams. To address this curse of dimensionality, we propose to use RL to enable agents to learn such graph traversal and teammate supporting behaviors in a data-driven manner. Specifically, through a new formulation of the team coordination on graphs with risky edges problem into Markov Decision Processes (MDPs) with a novel state and action space, we investigate how RL can solve it in two paradigms: First, we use RL for a team of agents to learn how to coordinate and reach the goal with minimal cost on a single EG. We show that RL efficiently solves problems with up to 20/4 or 25/3 nodes/agents, using a fraction of the time needed for JSG to solve such complex problems; Second, we learn a general RL policy for any $N$-node EGs to produce efficient supporting behaviors. We present extensive experiments and compare our RL approaches against their classical counterparts.
Team Coordination on Graphs with State-Dependent Edge Cost
Oughourli, Sara, Limbu, Manshi, Hu, Zechen, Wang, Xuan, Xiao, Xuesu, Shishika, Daigo
This paper studies a team coordination problem in a graph environment. Specifically, we incorporate "support" action which an agent can take to reduce the cost for its teammate to traverse some edges that have higher costs otherwise. Due to this added feature, the graph traversal is no longer a standard multi-agent path planning problem. To solve this new problem, we propose a novel formulation by posing it as a planning problem in the joint state space: the joint state graph (JSG). Since the edges of JSG implicitly incorporate the support actions taken by the agents, we are able to now optimize the joint actions by solving a standard single-agent path planning problem in JSG. One main drawback of this approach is the curse of dimensionality in both the number of agents and the size of the graph. To improve scalability in graph size, we further propose a hierarchical decomposition method to perform path planning in two levels. We provide complexity analysis as well as a statistical analysis to demonstrate the efficiency of our algorithm.