Xu, Ruitu
Taming Equilibrium Bias in Risk-Sensitive Multi-Agent Reinforcement Learning
Fei, Yingjie, Xu, Ruitu
Recent advances in reinforcement learning research have spurred much development in multi-agent reinforcement learning (MARL). However, most existing works focus on risk-neutral agents, which may not be suitable for modeling the real world. For example, in investment activities, different investors have different risk preferences depending on their roles in the market: some act as speculators and are risk-seeking, while others are bound by regulatory constraints and are thus risk-averse. Another example is multi-player online role-playing games, where each player can be considered an agent. Whereas some (risk-seeking) players enjoy exploring uncharted regions of the game, others (risk-averse players) prefer to play in areas that are well explored and come with less uncertainty. In the above examples, modeling every agent as uniformly risk-neutral is clearly inappropriate. This naturally calls for a more sophisticated modeling framework that takes into account the heterogeneous risk preferences of agents. In this paper, we study the problem of risk-sensitive MARL under the setting of general-sum Markov games (MGs), a more realistic multi-agent model in which the agents may have different risk preferences.
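The abstract does not spell out a particular risk measure, but a standard choice in risk-sensitive RL is the entropic risk (exponential utility), under which the same return distribution is valued differently by agents with different risk parameters. The Python sketch below is a minimal illustration under that assumption; the risk parameter beta and the Gaussian return samples are illustrative and not taken from the paper.

```python
import numpy as np

def entropic_risk(returns, beta):
    """Entropic risk value of sampled returns: (1/beta) * log E[exp(beta * G)].

    beta > 0 models a risk-seeking agent, beta < 0 a risk-averse one, and
    beta -> 0 recovers the risk-neutral mean. Uses a log-sum-exp shift for
    numerical stability.
    """
    if abs(beta) < 1e-12:
        return float(np.mean(returns))
    x = beta * np.asarray(returns, dtype=float)
    m = x.max()
    return float((m + np.log(np.mean(np.exp(x - m)))) / beta)

# Two agents evaluating the same stochastic return distribution (illustrative).
rng = np.random.default_rng(0)
returns = rng.normal(loc=1.0, scale=2.0, size=10_000)
print("risk-seeking (beta=+1):", entropic_risk(returns, 1.0))
print("risk-neutral (beta=0): ", entropic_risk(returns, 0.0))
print("risk-averse (beta=-1): ", entropic_risk(returns, -1.0))
```

For a Gaussian return N(mu, sigma^2) the entropic value is mu + beta * sigma^2 / 2, so the three printed numbers bracket the risk-neutral mean from above and below, which is the kind of heterogeneity in valuation that the paper's framework is meant to capture.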
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning
Xu, Ruitu, Min, Yifei, Wang, Tianhao, Wang, Zhaoran, Jordan, Michael I., Yang, Zhuoran
We study a heterogeneous-agent macroeconomic model with an infinite number of households and firms competing in a labor market. Each household earns income and consumes at each time step, aiming to maximize a concave utility subject to the underlying market conditions. The households seek the optimal saving strategy that maximizes their discounted cumulative utility given the market conditions, while the firms determine those market conditions by maximizing corporate profit given the behavior of the household population. The model captures a wide range of applications in macroeconomic studies, and we propose a data-driven reinforcement learning framework that finds the regularized competitive equilibrium of the model. The proposed algorithm enjoys theoretical guarantees of convergence to the market equilibrium at a sub-linear rate.
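The paper's contribution is a data-driven RL framework with convergence guarantees; as a point of reference only, the sketch below shows the classical household/firm fixed-point loop that a competitive equilibrium closes, in a heavily simplified Aiyagari-style economy. Log utility, the Cobb-Douglas firm, the two-state income process, and all parameter values are illustrative assumptions, and the regularization and RL components of the paper are omitted.

```python
import numpy as np

# Simplified Aiyagari-style equilibrium sketch (NOT the paper's algorithm).
alpha, delta, beta = 0.36, 0.08, 0.96        # capital share, depreciation, discount
a_grid = np.linspace(0.0, 30.0, 150)          # household asset grid
y_grid = np.array([0.8, 1.2])                 # idiosyncratic labor income states
P = np.array([[0.9, 0.1], [0.1, 0.9]])        # income transition matrix
L = y_grid.mean()                             # aggregate labor (stationary dist. is uniform here)

def prices(K):
    """Firm's first-order conditions for Cobb-Douglas production F(K, L)."""
    r = alpha * (K / L) ** (alpha - 1) - delta
    w = (1 - alpha) * (K / L) ** alpha
    return r, w

def solve_household(r, w, iters=1000, tol=1e-6):
    """Value iteration over (asset, income) states; returns savings policy indices."""
    V = np.zeros((len(a_grid), len(y_grid)))
    pol = np.zeros(V.shape, dtype=int)
    for _ in range(iters):
        EV = V @ P.T                                      # E[V(a', y') | current y]
        V_new = np.empty_like(V)
        for j, y in enumerate(y_grid):
            c = (1 + r) * a_grid[:, None] + w * y - a_grid[None, :]
            util = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -1e9)
            obj = util + beta * EV[:, j][None, :]
            pol[:, j] = np.argmax(obj, axis=1)
            V_new[:, j] = np.max(obj, axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return pol

def aggregate_assets(pol, iters=500):
    """Aggregate capital implied by the stationary distribution under the policy."""
    dist = np.full((len(a_grid), len(y_grid)), 1.0 / (len(a_grid) * len(y_grid)))
    for _ in range(iters):
        new = np.zeros_like(dist)
        for j in range(len(y_grid)):
            # mass at (a_i, y_j) moves to asset pol[i, j] and splits over next incomes
            np.add.at(new, pol[:, j], dist[:, j][:, None] * P[j, :][None, :])
        dist = new
    return float((dist.sum(axis=1) * a_grid).sum())

K = 6.0
for _ in range(30):                                       # damped fixed point on aggregate capital
    r, w = prices(K)
    K_implied = aggregate_assets(solve_household(r, w))
    if abs(K_implied - K) < 1e-3:
        break
    K = 0.8 * K + 0.2 * K_implied
print(f"approximate equilibrium: K = {K:.3f}, (r, w) = {prices(K)}")
```

The loop alternates the two sides of the market the abstract describes: households best-respond to prices, and prices are reset from the firm's first-order conditions at the implied aggregate capital, with damping to stabilize the fixed-point iteration.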
Meta Learning in the Continuous Time Limit
Xu, Ruitu, Chen, Lin, Karbasi, Amin
In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time view of the process eliminates the influence of the manually chosen step size of gradient descent and includes the existing gradient-descent training algorithm as the special case arising from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss for strongly convex task losses, even when the MAML loss itself is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that significantly reduces the computational burden of existing MAML training methods. To complement our theoretical findings, we perform empirical experiments that showcase the advantages of our proposed methods over existing work.
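To make the continuous-time view concrete, the sketch below integrates the MAML gradient flow dtheta/dt = -grad L_MAML(theta), with L_MAML(theta) = mean_i L_i(theta - a * grad L_i(theta)), using forward Euler, which is exactly the "specific discretization" that recovers ordinary gradient-descent MAML training. The quadratic (hence strongly convex) task losses and all step sizes are illustrative assumptions; this is not the proposed BI-MAML algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
c = rng.normal(size=8)            # task optima c_i of the quadratic losses L_i(x) = 0.5*(x - c_i)^2
a = 0.1                           # inner-loop adaptation step size (illustrative)

def maml_loss(theta):
    inner = theta - a * (theta - c)                 # one inner gradient step per task
    return 0.5 * np.mean((inner - c) ** 2)

def maml_grad(theta):
    # d/dtheta of 0.5 * mean((1 - a)^2 * (theta - c_i)^2) for the quadratic tasks
    return (1 - a) ** 2 * np.mean(theta - c)

theta, dt = 5.0, 0.05                               # forward Euler discretization of the ODE
for _ in range(2000):
    theta -= dt * maml_grad(theta)                  # theta(t + dt) = theta(t) - dt * grad L_MAML
print("theta* ~ mean of task optima:", theta, "vs", c.mean())
print("MAML loss at theta*:", maml_loss(theta))
```

For these quadratic tasks the MAML stationary point is the mean of the task optima, and the Euler iterates approach it at a linear rate, mirroring the linear-convergence guarantee stated for strongly convex task losses.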