Collaborating Authors

 Shock, Jonathan


Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation

arXiv.org Artificial Intelligence

Offline multi-agent reinforcement learning (MARL) is an emerging field with great promise for real-world applications. Unfortunately, the current state of research in offline MARL is plagued by inconsistencies in baselines and evaluation protocols, which ultimately makes it difficult to accurately assess progress, to trust newly proposed innovations, and for researchers to build upon prior work. In this paper, we firstly identify significant shortcomings in existing methodologies for measuring the performance of novel algorithms through a representative study of published offline MARL work. Secondly, by directly comparing to this prior work, we demonstrate that simple, well-implemented baselines can achieve state-of-the-art (SOTA) results across a wide range of tasks. Specifically, we show that on 35 out of 47 datasets used in prior work (almost 75% of cases), we match or surpass the performance of the current purported SOTA. Strikingly, our baselines often substantially outperform these more sophisticated algorithms. Finally, we correct for the shortcomings highlighted in this prior work by introducing a straightforward standardised methodology for evaluation and by providing our baseline implementations, with statistically robust results across several scenarios, for comparison in future work. Our proposal includes simple, sensible steps that are easy to adopt and which, in combination with solid baselines and comparative results, could substantially improve the overall rigour of empirical science in offline MARL moving forward.
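
The standardised methodology argued for here rests on reporting statistically robust aggregates over many independent training runs rather than single best seeds. Below is a minimal sketch of that idea in Python, assuming per-seed final returns have already been collected; the specific numbers and the percentile-bootstrap confidence interval are illustrative, not the paper's exact protocol:

```python
import numpy as np

def bootstrap_ci(returns, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean return."""
    rng = np.random.default_rng(seed)
    # Resample the per-seed returns with replacement and record each mean.
    samples = rng.choice(returns, size=(n_boot, len(returns)), replace=True)
    means = samples.mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return returns.mean(), (lo, hi)

# Illustrative: mean episode return from 10 independent seeds on one dataset.
returns_per_seed = np.array(
    [12.1, 10.8, 11.5, 12.9, 11.0, 12.4, 10.2, 11.8, 12.0, 11.3]
)
mean, (lo, hi) = bootstrap_ci(returns_per_seed)
print(f"mean return {mean:.2f}, 95% bootstrap CI [{lo:.2f}, {hi:.2f}]")
```

Reporting the interval alongside the mean makes comparisons between algorithms meaningful even with the small numbers of seeds typical in offline MARL papers.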


Off-the-Grid MARL: Datasets with Baselines for Offline Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

Being able to harness the power of large datasets for developing cooperative multi-agent controllers promises to unlock enormous value for real-world applications. Many important industrial systems are multi-agent in nature and are difficult to model using bespoke simulators. However, in industry, distributed processes can often be recorded during operation, and large quantities of demonstrative data can be stored. Offline multi-agent reinforcement learning (MARL) provides a promising paradigm for building effective decentralised controllers from such datasets. However, offline MARL is still in its infancy and therefore lacks the standardised benchmark datasets and baselines typically found in more mature subfields of reinforcement learning (RL). These deficiencies make it difficult for the community to sensibly measure progress. In this work, we aim to fill this gap by releasing Off-the-Grid MARL (OG-MARL): a growing repository of high-quality datasets with baselines for cooperative offline MARL research. Our datasets provide settings that are characteristic of real-world systems, including complex environment dynamics, heterogeneous agents, non-stationarity, many agents, partial observability, suboptimality, sparse rewards, and demonstrated coordination. For each setting, we provide a range of different dataset types (e.g. Good, Medium, Poor, and Replay) and profile the composition of experiences in each dataset. We hope that OG-MARL will serve the community as a reliable source of datasets and help drive progress, while also providing an accessible entry point for researchers new to the field.
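
One concrete way to "profile the composition of experiences" in a dataset is to summarise the distribution of episode returns it contains, which is what distinguishes Good, Medium, Poor, and Replay data. A minimal sketch of such profiling follows; the synthetic return distributions are illustrative placeholders, not drawn from the actual OG-MARL datasets or loaded via its API:

```python
import numpy as np

def profile_dataset(episode_returns: np.ndarray, name: str) -> None:
    """Summarise the spread of episode returns in one offline dataset."""
    q25, q50, q75 = np.percentile(episode_returns, [25, 50, 75])
    print(
        f"{name:>7}: n={len(episode_returns):5d} "
        f"min={episode_returns.min():7.2f} q25={q25:7.2f} "
        f"median={q50:7.2f} q75={q75:7.2f} max={episode_returns.max():7.2f}"
    )

# Illustrative stand-ins for the four dataset types described above:
# unimodal returns for Good/Medium/Poor, and a mixture for Replay.
rng = np.random.default_rng(0)
datasets = {
    "Good": rng.normal(18.0, 2.0, 1000),
    "Medium": rng.normal(10.0, 3.0, 1000),
    "Poor": rng.normal(3.0, 2.0, 1000),
    "Replay": np.concatenate([rng.normal(m, 2.5, 400) for m in (3.0, 10.0, 18.0)]),
}
for name, returns in datasets.items():
    profile_dataset(returns, name)
```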


Reduce, Reuse, Recycle: Selective Reincarnation in Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

'Reincarnation' in reinforcement learning has been proposed as a formalisation of reusing prior computation from past experiments when training an agent in an environment. In this paper, we present a brief foray into the paradigm of reincarnation in the multi-agent (MA) context. We consider the case where only some agents are reincarnated, whereas the others are trained from scratch -- selective reincarnation. In the fully-cooperative MA setting with heterogeneous agents, we demonstrate that selective reincarnation can lead to higher returns than training fully from scratch, and faster convergence than training with full reincarnation. However, the choice of which agents to reincarnate in a heterogeneous system is vitally important to the outcome of training -- indeed, a poor choice can lead to considerably worse results than either alternative. We argue that a rich field of work exists here, and we hope that our effort catalyses further work on bringing the topic of reincarnation to the multi-agent realm.
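
Concretely, a selectively reincarnated run initialises a chosen subset of agents from parameters saved in a previous experiment, while the remaining agents start from scratch. A minimal sketch of that setup follows; the `Agent` class, `load_checkpoint` helper, and agent names are hypothetical stand-ins, not the paper's implementation:

```python
def load_checkpoint(agent_id):
    """Stand-in for loading saved parameters from a prior experiment."""
    return {"weights": f"saved-params-for-{agent_id}"}

class Agent:
    """Stand-in for a learner; a real agent would hold network parameters."""
    def __init__(self, agent_id, params=None):
        self.agent_id = agent_id
        # Reincarnated agents reuse prior computation; others start fresh.
        self.params = params if params is not None else {"weights": "random-init"}

agent_ids = ["walker", "hopper", "flyer"]  # heterogeneous agents (hypothetical)
reincarnate = {"walker", "flyer"}          # the selected subset to warm-start

team = {
    aid: Agent(aid, load_checkpoint(aid) if aid in reincarnate else None)
    for aid in agent_ids
}
for aid, agent in team.items():
    print(aid, agent.params["weights"])
```

The paper's central finding maps onto the `reincarnate` set above: which subset is chosen can swing the outcome from faster convergence to results worse than training everyone from scratch.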


Brain Structural Saliency Over The Ages

arXiv.org Artificial Intelligence

Brain Age (BA) estimation via Deep Learning has become a strong and reliable biomarker for brain health, but the black-box nature of Neural Networks does not easily allow insight into the features of brain ageing. We trained a ResNet model as a BA regressor on T1 structural MRI volumes from a small cross-sectional cohort of 524 individuals. Using the Layer-wise Relevance Propagation (LRP) and DeepLIFT saliency mapping techniques, we analysed the trained model to determine which structures the network finds most relevant to brain ageing, and compared these attributions between the two techniques. We show how the attribution of relevance to different brain regions changes over the course of ageing. A tripartite pattern of relevance attribution emerges: some regions increase in relevance with age (e.g. the right Transverse Temporal Gyrus); some decrease in relevance with age (e.g. the right Fourth Ventricle); and others are consistently relevant across ages. We also examine the effect of the Brain Age Gap (BAG) on the distribution of relevance within the brain volume. We hope that these findings will provide clinically relevant region-wise trajectories for normal brain ageing, and a baseline against which to compare other brain-ageing trajectories.
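
Saliency mapping techniques such as DeepLIFT assign each input voxel a relevance score for the model's predicted age, relative to a reference input. Here is a minimal sketch using Captum's `DeepLift` on a toy 3D regressor; the tiny network and random volume are placeholders for the trained ResNet and a real T1 MRI volume:

```python
import torch
import torch.nn as nn
from captum.attr import DeepLift

# Toy stand-in for the trained brain-age ResNet regressor.
model = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool3d(1),
    nn.Flatten(),
    nn.Linear(8, 1),  # single scalar: predicted brain age
)
model.eval()

volume = torch.randn(1, 1, 32, 32, 32)   # placeholder for a T1 MRI volume
baseline = torch.zeros_like(volume)      # DeepLIFT reference input

# Voxel-wise relevance of the input for the regressed age (target=0 selects
# the model's single output).
attributions = DeepLift(model).attribute(volume, baselines=baseline, target=0)
print(attributions.shape)  # torch.Size([1, 1, 32, 32, 32])
```

Aggregating such voxel-wise attributions within anatomical regions, across subjects of different ages, is what yields the region-wise relevance trajectories described above.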


Causal Multi-Agent Reinforcement Learning: Review and Open Problems

arXiv.org Artificial Intelligence

This paper serves to introduce the reader to the field of multi-agent reinforcement learning (MARL) and its intersection with methods from the study of causality. We highlight key challenges in MARL and discuss these in the context of how causal methods may assist in tackling them. We promote moving toward a 'causality first' perspective on MARL. Specifically, we argue that causality can offer improved safety, interpretability, and robustness, while also providing strong theoretical guarantees for emergent behaviour. We discuss potential solutions for common challenges, and use this context to motivate future research directions.