Optimistic Meta-Gradients

arXiv.org Artificial Intelligence

We study the connection between gradient-based meta-learning and convex optimisation. We observe that gradient descent with momentum is a special case of meta-gradients, and building on recent results in optimisation, we prove convergence rates for meta-learning in the single-task setting. While a meta-learned update rule can yield faster convergence up to a constant factor, it is not sufficient for acceleration. Instead, some form of optimism is required. We show that optimism in meta-learning can be captured through Bootstrapped Meta-Gradients (Flennerhag et al., 2022), providing deeper insight into its underlying mechanics.
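The claim that momentum is a special case of meta-gradients can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not necessarily the construction used in the paper: the update rule is assumed to be x_{t+1} = x_t + w_t with meta-parameter w, the objective is a hypothetical 1-D quadratic f(x) = 0.5 * a * x^2, and the function names are invented for this example. A plain meta-gradient step on w then reproduces heavy-ball iterates with momentum coefficient 1.

```python
def grad(x, a=1.0):
    # Gradient of the assumed toy objective f(x) = 0.5 * a * x**2.
    return a * x


def meta_gradient_iterates(x0, beta, steps, a=1.0):
    # Update rule (assumption): x_{t+1} = x_t + w_t, where w is the meta-parameter.
    # The meta-gradient of f(x_t + w) with respect to w is grad f(x_{t+1}),
    # so the meta-update is w_{t+1} = w_t - beta * grad f(x_{t+1}).
    xs, x, w = [x0], x0, 0.0
    for _ in range(steps):
        x = x + w
        w = w - beta * grad(x, a)
        xs.append(x)
    return xs


def heavy_ball_iterates(x0, beta, steps, a=1.0):
    # Heavy-ball momentum with coefficient 1:
    # x_{t+1} = x_t + (x_t - x_{t-1}) - beta * grad f(x_t).
    xs = [x0, x0]  # x_1 = x_0 because the meta-parameter starts at w_0 = 0
    for _ in range(steps - 1):
        x_prev, x = xs[-2], xs[-1]
        xs.append(x + (x - x_prev) - beta * grad(x, a))
    return xs


meta_xs = meta_gradient_iterates(x0=1.0, beta=0.1, steps=10)
hb_xs = heavy_ball_iterates(x0=1.0, beta=0.1, steps=10)
```

Substituting w_t = x_{t+1} - x_t into the meta-update gives the heavy-ball recursion directly, which is why the two sequences agree. A momentum coefficient below 1 would correspond to decaying w at each meta-step, which this sketch leaves out.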


DeepMind's Bootstrapped Meta-Learning Enables Meta Learners to Teach Themselves

#artificialintelligence

Learning how to learn is something most humans do well, leveraging previous experience to inform how they learn new tasks. Endowing AI systems with such abilities, however, remains challenging, as it requires machine learners to learn update rules that have typically been tuned manually for each task. The field of meta-learning studies how to enable machine learners to learn how to learn, and is a critical research area for improving the efficiency of AI agents. One approach is for a learner to learn an update rule by applying it over previous steps and then evaluating the resulting performance. To fully unlock the potential of meta-learning, it is necessary to overcome both the difficult meta-optimization problem and myopic meta-objectives.


Bootstrapped Meta-Learning

arXiv.org Machine Learning

Meta-learning empowers artificial intelligence to increase its efficiency by learning how to learn. Unlocking this potential involves overcoming a challenging meta-optimisation problem that often exhibits ill-conditioning, and myopic meta-objectives. We propose an algorithm that tackles these issues by letting the meta-learner teach itself. The algorithm first bootstraps a target from the meta-learner, then optimises the meta-learner by minimising the distance to that target under a chosen (pseudo-)metric. Focusing on meta-learning with gradients, we establish conditions that guarantee performance improvements and show that the improvement is related to the target distance. Thus, by controlling curvature, the distance measure can be used to ease meta-optimisation, for instance by reducing ill-conditioning. Further, the bootstrapping mechanism can extend the effective meta-learning horizon without requiring backpropagation through all updates. The algorithm is versatile and easy to implement. We achieve a new state of the art for model-free agents on the Atari ALE benchmark, improve upon MAML in few-shot learning, and demonstrate how our approach opens up new possibilities by meta-learning efficient exploration in a Q-learning agent.
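The two-stage procedure in the abstract (bootstrap a target, then minimise the distance to it) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the learner minimises a hypothetical 1-D quadratic f(x) = 0.5 * a * x^2, the meta-parameter is a scalar step size `eta`, the target is built by simply continuing with the same update rule for `extra_steps` further steps, and the distance is squared Euclidean. All names and constants are chosen for this example.

```python
def grad(x, a=1.0):
    # Gradient of the assumed toy objective f(x) = 0.5 * a * x**2.
    return a * x


def bmg_meta_step(x, eta, a=1.0, extra_steps=3, meta_lr=0.1):
    # Inner update: one gradient step using the meta-learned step size eta.
    x1 = x - eta * grad(x, a)

    # Bootstrap a target by unrolling a few more steps from x1.
    # The target is treated as a constant: no derivative flows through it.
    target = x1
    for _ in range(extra_steps):
        target = target - eta * grad(target, a)

    # Matching loss: 0.5 * (x1 - target)**2. Only x1 depends on eta here,
    # and d(x1)/d(eta) = -grad(x), so backpropagation covers one step only.
    dx1_deta = -grad(x, a)
    meta_grad = (x1 - target) * dx1_deta

    # Meta-update: gradient descent on the matching loss.
    new_eta = eta - meta_lr * meta_grad
    return x1, new_eta


x1, new_eta = bmg_meta_step(x=1.0, eta=0.1)
x1_big, new_eta_big = bmg_meta_step(x=1.0, eta=1.5)
```

With a small initial step size the target lies further along the descent path than `x1`, so the meta-update increases `eta`; with an overshooting step size it decreases `eta`. The target looks `extra_steps` updates ahead, while the meta-gradient only differentiates through the single inner step.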


On Markov Games Played by Bayesian and Boundedly-Rational Players

AAAI Conferences

We present a new game-theoretic framework in which Bayesian players with bounded rationality engage in a Markov game and each has private but incomplete information regarding other players' types. Instead of utilizing Harsanyi's abstract types and a common prior, we construct intentional player types whose structure is explicit and induces a finite-level belief hierarchy. We characterize an equilibrium in this game and establish the conditions for existence of the equilibrium. The computation of finding such equilibria is formalized as a constraint satisfaction problem and its effectiveness is demonstrated on two cooperative domains.


Bayesian Markov Games with Explicit Finite-Level Types

AAAI Conferences

In impromptu or ad hoc settings, participating players are precluded from precoordination. Consequently, each player's own model is private and includes some uncertainty about the others' types or behaviors. Harsanyi's formulation of a Bayesian game emphasizes this uncertainty, but each player plays exactly one turn. We propose a new game-theoretic framework where Bayesian players engage in a Markov game and each has private but imperfect information regarding other players' types. Instead of utilizing Harsanyi's abstract types and a common prior distribution, we construct player types whose structure is explicit and includes a finite-level belief hierarchy. We formalize this new framework and demonstrate its effectiveness on two standard ad hoc teamwork domains involving two or more ad hoc players.


Bayesian Markov Games with Explicit Finite-Level Types

AAAI Conferences

We present a new game-theoretic framework where Bayesian players engage in a Markov game and each has private but imperfect information regarding other players' types. Instead of utilizing Harsanyi's abstract types and a common prior distribution, we construct player types whose structure is explicit and induces a finite-level belief hierarchy. We characterize equilibria in this game and formalize the computation of finding such equilibria as a constraint satisfaction problem. The effectiveness of the new framework is demonstrated on two ad hoc teamwork domains.