Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for the Model-free LQR
Toso, Leonardo F., Zhan, Donglin, Anderson, James, Wang, Han
–arXiv.org Artificial Intelligence
One of the main successes of Reinforcement Learning (RL), for example in robotics, is its ability to learn control policies that rapidly adapt to different agents and environments (Wang et al., 2016; Duan et al., 2016; Rothfuss et al., 2018). This idea of learning a control policy that efficiently adapts to unseen RL tasks is referred to as meta-learning, or learning to learn. The most popular approach is Model-Agnostic Meta-Learning (MAML) (Finn et al., 2017, 2019). In the context of RL, the role of MAML is to exploit the diversity of tasks drawn from a common task distribution to learn, in a multi-task and heterogeneous setting, a control policy that is only a few policy-gradient (PG) steps away from the optimal policy of an unseen task. Despite its success in image classification and RL, the theoretical convergence guarantees of MAML remain poorly understood in both the model-based and the model-free setting.
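The MAML idea described above can be sketched concretely for a scalar LQR problem. The snippet below is a minimal illustration, not the paper's algorithm: it assumes a toy task distribution over scalar systems x_{t+1} = a x_t + b u_t, uses a two-point zeroth-order estimate as the "model-free" policy gradient, one inner PG step per task, and an outer loop that descends the average post-adaptation cost. All function names, step sizes, and the task distribution are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def lqr_cost(K, a, b, q=1.0, r=1.0, x0=1.0, T=30):
    """Finite-horizon LQR cost of u_t = -K x_t on x_{t+1} = a x_t + b u_t."""
    x, cost = x0, 0.0
    for _ in range(T):
        u = -K * x
        cost += q * x**2 + r * u**2
        x = a * x + b * u
    return cost

def zo_grad(f, K, delta=1e-3):
    """Two-point zeroth-order gradient estimate (stands in for a
    model-free policy-gradient oracle)."""
    return (f(K + delta) - f(K - delta)) / (2 * delta)

def adapt(K, task, alpha=0.01):
    """Inner loop: one policy-gradient step on a single task."""
    a, b = task
    return K - alpha * zo_grad(lambda k: lqr_cost(k, a, b), K)

# Hypothetical task distribution: scalar systems with varying (a, b).
tasks = [(rng.uniform(0.8, 1.2), rng.uniform(0.8, 1.2)) for _ in range(8)]

# Outer (meta) loop: descend the cost *after* one inner adaptation step,
# averaged over tasks, so the meta-policy is a few PG steps from optimal.
K_meta, beta = 0.5, 0.005
for _ in range(200):
    meta_obj = lambda k: np.mean([lqr_cost(adapt(k, t), *t) for t in tasks])
    K_meta -= beta * zo_grad(meta_obj, K_meta)
```

On an unseen task from the same distribution, `adapt(K_meta, task)` plays the role of the few-shot PG adaptation the abstract refers to; the paper's actual analysis covers the general matrix case with sample-based gradient estimates.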
Jan-25-2024