Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL
Yamagata, Taku, Khalil, Ahmed, Santos-Rodriguez, Raul
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional policy produces promising results. The Decision Transformer (DT) combines the conditional policy approach with a transformer architecture, showing competitive performance against several benchmarks. However, DT lacks stitching ability, one of the critical abilities for offline RL to learn the optimal policy from sub-optimal trajectories. This issue becomes particularly significant when the offline dataset contains only sub-optimal trajectories. On the other hand, conventional RL approaches based on Dynamic Programming (such as Q-learning) do not have the same limitation; however, they suffer from unstable learning behaviours, especially when they rely on function approximation in an off-policy learning setting. In this paper, we propose the Q-learning Decision Transformer (QDT) to address the shortcomings of DT by leveraging the benefits of Dynamic Programming (Q-learning). QDT uses the Dynamic Programming results to relabel the return-to-go values in the training data, and then trains the DT with the relabelled data. Our approach combines the strengths of the two methods, each compensating for the other's shortcomings, to achieve better performance. We demonstrate this empirically both in simple toy environments and on the more complex D4RL benchmark, where QDT shows competitive performance gains.
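To make the relabelling idea concrete, below is a minimal sketch of return-to-go relabelling for a single trajectory, assuming a hypothetical pre-trained offline critic supplies per-state value estimates; the exact rule in the paper may differ in details such as discounting and the use of Q- versus state-values.

```python
import numpy as np

def relabel_returns_to_go(rewards, values):
    """Relabel one trajectory's return-to-go targets.

    rewards: per-step rewards r_0 ... r_{T-1} from the offline dataset.
    values:  learned value estimates V(s_0) ... V(s_{T-1}), e.g. from an
             offline Q-learning critic (hypothetical inputs here).

    Walking backwards, each relabelled return-to-go is the larger of the
    Bellman backup from the next step and the critic's own estimate, so a
    sub-optimal trajectory suffix is replaced by what dynamic programming
    estimates is achievable (the stitching effect).
    """
    T = len(rewards)
    rtg = np.zeros(T + 1)
    for t in reversed(range(T)):
        rtg[t] = max(values[t], rewards[t] + rtg[t + 1])
    return rtg[:T]

# A sub-optimal suffix (zero observed rewards) gets lifted by the critic:
print(relabel_returns_to_go([1.0, 0.0, 0.0], [2.5, 1.5, 0.8]))  # [2.5 1.5 0.8]
```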
MIDI-Draw: Sketching to Control Melody Generation
Namgyal, Tashi, Flach, Peter, Santos-Rodriguez, Raul
We describe a proof-of-principle implementation of a system for drawing melodies that abstracts away from a note-level input representation via melodic contours. The aim is to allow users to express their musical intentions without requiring prior knowledge of how notes fit together melodiously. Current approaches to controllable melody generation often require users to choose parameters that are static across a whole sequence, via buttons or sliders. In contrast, our method allows users to quickly specify how parameters should change over time by drawing a contour.
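As an illustrative sketch of contour-based control (not the paper's actual model or interface), the snippet below resamples a hand-drawn curve into one conditioning value per generation step; the point format and pitch range are assumptions made for the example.

```python
import numpy as np

def contour_to_conditioning(contour_points, n_steps, pitch_low=48, pitch_high=84):
    """Resample a hand-drawn contour (a list of (x, y) points in [0, 1]^2)
    into one conditioning value per generation step, so a time-varying
    curve replaces static buttons or sliders."""
    pts = np.asarray(sorted(contour_points))             # sort by x position
    xs = np.linspace(0.0, 1.0, n_steps)
    ys = np.interp(xs, pts[:, 0], pts[:, 1])             # piecewise-linear resample
    return np.round(pitch_low + ys * (pitch_high - pitch_low)).astype(int)

# A rising-then-falling sketch mapped onto 16 generation steps:
print(contour_to_conditioning([(0, 0.2), (0.5, 0.9), (1, 0.3)], 16))
```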
Two-step counterfactual generation for OOD examples
Keshtmand, Nawid, Santos-Rodriguez, Raul, Lawry, Jonathan
Machine learning (ML) models achieve strong performance on data drawn from the distribution they were trained on. However, they still make erroneous predictions when exposed to inputs from an unfamiliar distribution. This poses a significant obstacle to the deployment of ML models in safety-critical applications such as healthcare and autonomous vehicles. Consequently, for applications in these domains, two fundamental requirements for the deployment of ML models are: 1) being able to identify data that come from a different distribution than the data on which the model was trained, referred to as out-of-distribution (OOD) detection, outlier detection, or anomaly detection [30]; and 2) being able to explain the predictions of the model [24]. There has been significant work on improving the accuracy of OOD detectors, but little work on explaining why a data point is OOD [20]. As OOD detection algorithms are increasingly used in safety-critical domains, providing explanations for high-stakes decisions has become an ethical and regulatory requirement [26]. It is therefore important to develop methods that provide both accurate OOD scores and an explanation of why specific data points are detected as OOD. OOD detection can be considered a binary classification problem, where a data point belongs either to the in-distribution (ID) class or to the OOD class [4]. Additionally, there are different versions of the OOD detection problem, referred to as near-OOD and far-OOD detection [23, 29]. OOD data points that share neither non-discriminative (class-irrelevant) nor discriminative (class-relevant) features with the ID data are referred to as far-OOD data and are therefore very dissimilar to the ID data.
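To ground the score-and-threshold framing of OOD detection, here is a minimal sketch using one standard baseline score, the maximum softmax probability; this illustrates the binary ID-vs-OOD formulation only and is not the detector proposed in the paper.

```python
import numpy as np

def max_softmax_ood_score(logits):
    """Negative maximum softmax probability: higher means more likely OOD."""
    z = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -p.max(axis=1)

def detect_ood(logits, threshold):
    """Binary ID-vs-OOD decision: True flags a point as out-of-distribution."""
    return max_softmax_ood_score(logits) > threshold

# A confident prediction is kept as ID; a near-uniform one is flagged as OOD.
logits = np.array([[5.0, 0.1, 0.2],    # confident  -> ID
                   [1.0, 1.1, 0.9]])   # uncertain  -> OOD
print(detect_ood(logits, threshold=-0.6))   # [False  True]
```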
When the Ground Truth is not True: Modelling Human Biases in Temporal Annotations
Yamagata, Taku, Tonkin, Emma L., Sanchez, Benjamin Arana, Craddock, Ian, Nieto, Miquel Perello, Santos-Rodriguez, Raul, Yang, Weisong, Flach, Peter
In supervised learning, low-quality annotations lead to poorly performing classification and detection models, while also rendering evaluation unreliable. This is particularly apparent in temporal data, where annotation quality is affected by multiple factors. For example, in the post-hoc self-reporting of daily activities, cognitive biases are among the most common of these factors. In particular, reporting the start and duration of an activity after it has finished may incorporate biases introduced by personal time perception, as well as imprecision and lack of granularity due to time rounding. Here we propose a method to model human biases in temporal annotations and argue for the use of soft labels. Experimental results on synthetic data show that soft labels provide a better approximation of the ground truth for several metrics. We showcase the method on a real dataset of daily activities.
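As a sketch of the soft-label idea under stated assumptions (a Gaussian spread whose width reflects time rounding and whose centre can be shifted by a systematic bias term; both parameters are illustrative, not values fitted in the paper):

```python
import numpy as np

def soft_temporal_label(reported_minute, n_bins, bin_minutes=5,
                        rounding_minutes=15, bias_minutes=0.0):
    """Spread a self-reported event time over time bins instead of a one-hot label.

    The Gaussian width reflects rounding to the nearest `rounding_minutes`;
    `bias_minutes` shifts the centre to model a systematic time-perception bias.
    """
    centres = np.arange(n_bins) * bin_minutes            # bin start times in minutes
    sigma = rounding_minutes / 2.0
    w = np.exp(-0.5 * ((centres - (reported_minute - bias_minutes)) / sigma) ** 2)
    return w / w.sum()                                   # normalise to a distribution

# An activity reported to start "at minute 60" becomes a distribution over bins.
print(soft_temporal_label(reported_minute=60, n_bins=24).round(3))
```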
Transfer Learning and Class Decomposition for Detecting the Cognitive Decline of Alzheimer Disease
Alwuthaynani, Maha M., Abdallah, Zahraa S., Santos-Rodriguez, Raul
Early diagnosis of Alzheimer's disease (AD) is essential in preventing the disease's progression. Therefore, detecting AD from neuroimaging data such as structural magnetic resonance imaging (sMRI) has been a topic of intense investigation in recent years. Deep learning has gained considerable attention in Alzheimer's detection. However, training a convolutional neural network from scratch is challenging, since it demands substantial computational time and a significant amount of annotated data. By transferring knowledge learned from other image recognition tasks to medical image classification, transfer learning can provide a promising and effective solution. Irregularities in the dataset distribution present another difficulty. Class decomposition can tackle this issue by simplifying the learning of a dataset's class boundaries. Motivated by these approaches, this paper proposes a transfer learning method using class decomposition to detect Alzheimer's disease from sMRI images. We use two ImageNet-trained architectures, VGG19 and ResNet50, together with an entropy-based technique to determine the most informative images. The proposed model achieved state-of-the-art performance in the Alzheimer's disease (AD) vs mild cognitive impairment (MCI) vs cognitively normal (CN) classification task, with a 3% increase in accuracy over what is reported in the literature.
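To illustrate the class decomposition step, here is a minimal sketch, assuming features already extracted by a pretrained backbone (e.g. VGG19 or ResNet50) and integer class labels 0..C-1; the number of sub-classes per class is an illustrative choice, not the paper's setting.

```python
import numpy as np
from sklearn.cluster import KMeans

def decompose_classes(features, labels, k=2, seed=0):
    """Class decomposition: split each original class into k sub-classes by
    clustering its backbone features, simplifying the class boundaries the
    downstream classifier has to learn."""
    sub_labels = np.empty(len(labels), dtype=int)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(features[idx])
        sub_labels[idx] = c * k + km.labels_      # unique id per (class, cluster)
    return sub_labels

def recompose_predictions(sub_preds, k=2):
    """Map sub-class predictions back to the original classes."""
    return np.asarray(sub_preds) // k

# Tiny synthetic check: decomposition then recomposition recovers the labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y = np.repeat([0, 1], 20)
sub = decompose_classes(X, y, k=2)
assert (recompose_predictions(sub) == y).all()
```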
Identification, explanation and clinical evaluation of hospital patient subtypes
Werner, Enrico, Clark, Jeffrey N., Bhamber, Ranjeet S., Ambler, Michael, Bourdeaux, Christopher P., Hepburn, Alexander, McWilliams, Christopher J., Santos-Rodriguez, Raul
Patients admitted to hospital constitute a heterogeneous population with different levels of illness severity, morbidities, responses to treatment and outcomes [9]. Therefore, predicting the right treatment is challenging even when patients are initially diagnosed with the same conditions. For diagnosis and determining treatment options, physicians rely on factors including the patient's medical history [6], their own clinical experience and their professional intuition [9]. Advances in computing technologies and the introduction of electronic health records (EHR) mean that more information is available to physicians than ever before. However, hospitals are still in the process of transitioning from paper records to EHR, which leads to challenges when analyzing the data and inferring high-level information [6]. As intensive care units (ICUs) are the most data-rich hospital departments, machine learning approaches have mostly focused on these environments [27, 9, 3, 19]. Recent progress has also been made for general wards [8, 21, 15, 10]. Outcome prediction and risk scoring are of high clinical importance, and several risk scoring methods have been developed and deployed.
Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach
Schulz, Jonas, Poyiadzi, Rafael, Santos-Rodriguez, Raul
Explainability of black-box machine learning models is crucial, in particular when they are deployed in critical applications such as medicine or autonomous cars. Existing approaches produce explanations for the predictions of models; however, how to assess the quality and reliability of such explanations remains an open question. In this paper we take a step further in order to provide the practitioner with tools to judge the trustworthiness of an explanation. To this end, we produce estimates of the uncertainty of a given explanation by measuring the ordinal consensus amongst a set of diverse bootstrapped surrogate explainers. While we encourage diversity by using ensemble techniques, we propose and analyse metrics to aggregate the information contained within the set of explainers through a rating scheme. We empirically illustrate the properties of this approach through experiments on state-of-the-art Convolutional Neural Network ensembles. Furthermore, through tailored visualisations, we show specific examples of situations where uncertainty estimates offer concrete actionable insights to the user beyond those arising from standard surrogate explainers.
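One standard measure of ordinal consensus is Kendall's coefficient of concordance W; the sketch below computes it over feature-importance vectors from bootstrapped surrogates. This is a generic concordance statistic used here for illustration; the paper proposes and analyses its own rating-scheme metrics.

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(importances):
    """Ordinal consensus (Kendall's W) across surrogate explainers.

    importances: (m, n) array, one row of feature importances per
    bootstrapped surrogate. W is 1 when all explainers rank the features
    identically and near 0 when rankings are unrelated, so a low W flags
    an unreliable explanation. Ties get average ranks; the tie correction
    term is omitted for brevity.
    """
    ranks = np.apply_along_axis(rankdata, 1, np.asarray(importances))
    m, n = ranks.shape
    col_totals = ranks.sum(axis=0)                       # rank sum per feature
    s = ((col_totals - col_totals.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

# Three explainers in perfect rank agreement yield W = 1.
print(kendalls_w([[0.9, 0.5, 0.1, 0.0],
                  [0.8, 0.6, 0.2, 0.1],
                  [0.7, 0.4, 0.3, 0.0]]))
```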
Understanding surrogate explanations: the interplay between complexity, fidelity and coverage
Poyiadzi, Rafael, Renard, Xavier, Laugel, Thibault, Santos-Rodriguez, Raul, Detyniecki, Marcin
This paper analyses the fundamental ingredients behind surrogate explanations to provide a better understanding of their inner workings. We start our exposition by considering global surrogates, describing the trade-off between the complexity of the surrogate and its fidelity to the black-box being modelled. We show that transitioning from global to local (reducing coverage) allows for more favourable conditions on the Pareto frontier of fidelity and complexity of a surrogate. We discuss the interplay between complexity, fidelity and coverage, and consider how different user needs can lead to problem formulations where these are either constraints or penalties. We also present experiments that demonstrate how the local surrogate interpretability procedure can be made interactive and lead to better explanations.
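The coverage-fidelity claim can be demonstrated in a few lines: a fixed-complexity (linear) surrogate fitted to a toy black-box attains higher fidelity as the neighbourhood around the explained point shrinks. The black-box function and radii below are arbitrary choices for the sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def black_box(X):
    """Stand-in non-linear black-box model."""
    return np.sin(3 * X[:, 0]) + X[:, 1] ** 2

rng = np.random.default_rng(0)
x0 = np.zeros(2)                                  # the point to explain
for radius in [2.0, 1.0, 0.5, 0.1]:               # shrinking coverage
    X = x0 + rng.uniform(-radius, radius, size=(2000, 2))
    surrogate = LinearRegression().fit(X, black_box(X))
    print(f"coverage radius {radius:>4}: fidelity R^2 = "
          f"{surrogate.score(X, black_box(X)):.3f}")
```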
On the overlooked issue of defining explanation objectives for local-surrogate explainers
Poyiadzi, Rafael, Renard, Xavier, Laugel, Thibault, Santos-Rodriguez, Raul, Detyniecki, Marcin
Several local-surrogate methods share the same overall goal of explaining a black-box prediction. However, despite their shared overall procedure, they set out different objectives, extract different information from the black-box, and consequently produce diverse explanations that are, in general, incomparable. In this work we review the similarities and differences amongst multiple methods, with a particular focus on what information they extract. Yet, we argue that the current literature on model surrogates to explain a prediction lacks the clarity needed for a practitioner to make an informed choice on which method to use, given their explanation needs. Existing approaches usually lack transparency with regard to the explanation needs they propose to solve, their specifications and, ultimately, their formal objectives. This situation (1) fuels disseminated research with propositions that are difficult to compare and (2) prevents a sound development of the explainability practice.
Self-play Learning Strategies for Resource Assignment in Open-RAN Networks
Wang, Xiaoyang, Thomas, Jonathan D, Piechocki, Robert J, Kapoor, Shipra, Santos-Rodriguez, Raul, Parekh, Arjun
Open Radio Access Network (ORAN) is being developed with the aim of democratising access to and lowering the cost of future mobile data networks, supporting network services with various QoS requirements, such as massive IoT and URLLC. In ORAN, network functionality is disaggregated into remote units (RUs), distributed units (DUs) and central units (CUs), which allows flexible software deployments on Commercial-Off-The-Shelf (COTS) hardware. Furthermore, mapping variable RU requirements to local mobile edge computing centres for future centralised processing would significantly reduce the power consumption in cellular networks. In this paper, we study the RU-DU resource assignment problem in an ORAN system, modelled as a 2D bin packing problem. A deep reinforcement learning-based self-play approach is proposed to achieve efficient RU-DU resource management, with an AlphaGo Zero-inspired neural Monte-Carlo Tree Search (MCTS). Experiments on a representative 2D bin packing environment and real site data show that the self-play learning strategy achieves intelligent RU-DU resource assignment under different network conditions.
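For intuition about the assignment problem, below is a minimal greedy first-fit baseline that treats each DU as a capacity vector and each RU as a demand vector; this simplifies the paper's geometric 2D bin packing formulation, and a heuristic like it would only serve as a point of comparison for the learned MCTS policy. The resource dimensions are illustrative.

```python
def first_fit_assign(rus, dus):
    """Greedy first-fit baseline for RU-to-DU assignment viewed as packing:
    each DU has a capacity vector (e.g. compute, fronthaul bandwidth) and
    each RU a demand vector; place every RU in the first DU that fits."""
    remaining = [list(cap) for cap in dus]     # mutable copies of DU capacities
    assignment = []
    for demand in rus:
        for j, cap in enumerate(remaining):
            if all(c >= d for c, d in zip(cap, demand)):
                remaining[j] = [c - d for c, d in zip(cap, demand)]
                assignment.append(j)
                break
        else:
            assignment.append(None)            # RU could not be placed
    return assignment

# Three RUs with (cpu, bandwidth) demands packed onto two DUs:
print(first_fit_assign([(2, 1), (3, 2), (1, 1)], [(4, 2), (3, 3)]))  # [0, 1, 0]
```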