Local Information Opponent Modelling Using Variational Autoencoders

Georgios Papoudakis, Filippos Christianos, Stefano V. Albrecht

arXiv.org Machine Learning 

Modelling the behaviours of other agents (opponents) is essential for understanding how agents interact and for making effective decisions. Existing methods for opponent modelling commonly assume knowledge of the local observations and chosen actions of the modelled opponents, which can significantly limit their applicability. We propose a new modelling technique based on variational autoencoders, which are trained to reconstruct the local actions and observations of the opponent from embeddings that depend only on the local observations of the modelling agent (its observed world state, chosen actions, and received rewards). The embeddings are used to augment the modelling agent's decision policy, which is trained via deep reinforcement learning; thus the policy does not require access to opponent observations. We provide a comprehensive evaluation and ablation study in diverse multi-agent tasks, showing that our method achieves performance comparable to an ideal baseline which has full access to the opponent's information, and significantly higher returns than a baseline method which does not use the learned embeddings.

An important aspect of autonomous decision-making agents is the ability to reason about the unknown intentions and behaviours of other agents. Much research has been devoted to this opponent modelling problem [2], with recent works focused on the use of deep learning architectures for opponent modelling and reinforcement learning (RL) [15, 27, 11, 26]. A common assumption in existing methods is that the modelling agent has access to the local trajectory of the modelled agents [2], which may include their local observations of the environment state, their past actions, and possibly their received rewards.
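The encoder-decoder arrangement described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' architecture: the single-step linear encoder, the diagonal Gaussian posterior with a standard normal prior, the unweighted sum of loss terms, the use of the posterior mean as the policy embedding, and all names and dimensions are assumptions made for illustration. The point it shows is the data flow: the encoder sees only the modelling agent's own observation, action, and reward, while the opponent's observation and action appear only as reconstruction targets.

```python
import math
import random

def init_linear(rng, n_in, n_out):
    """Small random weights and zero bias for a hypothetical linear layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def linear(weights, bias, x):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mean, log_var))

def forward_loss(params, own_input, opp_obs, opp_action, rng):
    # Encoder input is the modelling agent's local information only.
    mean = linear(*params["enc_mean"], own_input)
    log_var = linear(*params["enc_log_var"], own_input)
    # Reparameterisation: z = mean + std * noise.
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, log_var)]
    # Two decoder heads reconstruct the opponent's observation and action.
    obs_hat = linear(*params["dec_obs"], z)
    action_probs = softmax(linear(*params["dec_act"], z))
    obs_loss = sum((a - b) ** 2 for a, b in zip(obs_hat, opp_obs))
    act_loss = -math.log(action_probs[opp_action])
    # Assumption: the three terms are summed with unit weights.
    loss = obs_loss + act_loss + kl_to_standard_normal(mean, log_var)
    return loss, mean

rng = random.Random(0)
OWN_DIM, OPP_OBS_DIM, N_OPP_ACTIONS, Z_DIM = 6, 4, 3, 2  # hypothetical sizes
params = {
    "enc_mean": init_linear(rng, OWN_DIM, Z_DIM),
    "enc_log_var": init_linear(rng, OWN_DIM, Z_DIM),
    "dec_obs": init_linear(rng, Z_DIM, OPP_OBS_DIM),
    "dec_act": init_linear(rng, Z_DIM, N_OPP_ACTIONS),
}

# Own observation (3 values), one-hot own action (2 values), own reward (1 value).
own_input = [0.1, -0.2, 0.3, 1.0, 0.0, 0.5]
own_obs = own_input[:3]
# Opponent data is needed only as a training target, never as encoder input.
loss, embedding = forward_loss(params, own_input, [0.0, 1.0, 0.5, -0.5], 2, rng)

# The policy input is the own observation augmented with the embedding.
policy_input = own_obs + embedding
```

In this sketch the policy input never contains opponent observations or actions, which mirrors the claim that the trained policy does not need access to them.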
