Goto

Collaborating Authors

 Country


Breast Cancer Diagnosis by Higher-Order Probabilistic Perceptrons

arXiv.org Machine Learning

A two-layer neural network model that systematically includes correlations among input variables to arbitrary order and is designed to implement Bayes inference has been adapted to classify breast cancer tumors as malignant or benign, assigning a probability for either outcome. The inputs to the network represent measured characteristics of cell nuclei imaged in Fine Needle Aspiration biopsies. The present machine-learning approach to diagnosis (known as HOPP, for higher-order probabilistic perceptron) is tested on the much-studied, open-access Breast Cancer Wisconsin (Diagnosis) Data Set of Wolberg et al. This set lists, for each tumor, measured physical parameters of the cell nuclei of each sample. The HOPP model can identify the key factors -- input features and their combinations -- most relevant for reliable diagnosis. HOPP networks were trained on 90\% of the examples in the Wisconsin database, and tested on the remaining 10\%. Referred to ensembles of 300 networks, selected randomly for cross-validation, accuracy of classification for the test sets of up to 97\% was readily achieved, with standard deviation around 2\%, together with average Matthews correlation coefficients reaching 0.94 indicating excellent predictive performance. Demonstrably, the HOPP is capable of matching the predictive power attained by other advanced machine-learning algorithms applied to this much-studied database, over several decades. Analysis shows that in this special problem, which is almost linearly separable, the effects of irreducible correlations among the measured features of the Wisconsin database are of relatively minor importance, as the Naive Bayes approximation can itself yield predictive accuracy approaching 95\%. The advantages of the HOPP algorithm will be more clearly revealed in application to more challenging machine-learning problems.


On the Apparent Conflict Between Individual and Group Fairness

arXiv.org Machine Learning

A distinction has been drawn in fair machine learning research between'group' and'individual' fairness measures. Many tec hnical research papers assume that both are important, but conflict ing, and propose ways to minimise the tradeoffs between these mea - sures. This paper argues that this apparent conflict is based on a misconception. It draws on theoretical discussions from within the fair machine learning research, and from political and legal philosophy, to argue that individual and group fairness are not fun da-mentally in conflict. First, it outlines accounts of egalita rian fairness which encompass plausible motivations for both group a nd individual fairness, thereby suggesting that there need be no conflict in principle. Second, it considers the concept of individual justice, from legal philosophy and jurisprudence which seems similar but actually contradicts the notion of individual fairness as proposed in the fair machine learning literature. The conclusi on is that the apparent conflict between individual and group fair ness is more of an artefact of the blunt application of fairness measures, rather than a matter of conflicting principles. In practice, this conflict may be resolved by a nuanced consideration of the sources of'unfairness' in a particular deployment context, and the ca refully justified application of measures to mitigate it.


Sensor Fusion using Backward Shortcut Connections for Sleep Apnea Detection in Multi-Modal Data

arXiv.org Machine Learning

Sleep apnea is a common respiratory disorder characterized by breathing pauses during the night. Consequences of untreated sleep apnea can be severe. Still, many people remain undiagnosed due to shortages of hospital beds and trained sleep technicians. To assist in the diagnosis process, automated detection methods are being developed. Recent works have demonstrated that deep learning models can extract useful information from raw respiratory data and that such models can be used as a robust sleep apnea detector. However, trained sleep technicians take into account multiple sensor signals when annotating sleep recordings instead of relying on a single respiratory estimate. To improve the predictive performance and reliability of the models, early and late sensor fusion methods are explored in this work. In addition, a novel late sensor fusion method is proposed which uses backward shortcut connections to improve the learning of the first stages of the models. The performance of these fusion methods is analyzed using CNN as well as LSTM deep learning base-models. The results demonstrate a significant and consistent improvement in predictive performance over the single sensor methods and over the other explored sensor fusion methods, by using the proposed sensor fusion method with backward shortcut connections.


Attending Form and Context to Generate Specialized Out-of-VocabularyWords Representations

arXiv.org Machine Learning

We propose a new contextual-compositional neural network layer that handles out-of-vocabulary (OOV) words in natural language processing (NLP) tagging tasks. This layer consists of a model that attends to both the character sequence and the context in which the OOV words appear. We show that our model learns to generate task-specific \textit{and} sentence-dependent OOV word representations without the need for pre-training on an embedding table, unlike previous attempts. We insert our layer in the state-of-the-art tagging model of \citet{plank2016multilingual} and thoroughly evaluate its contribution on 23 different languages on the task of jointly tagging part-of-speech and morphosyntactic attributes. Our OOV handling method successfully improves performances of this model on every language but one to achieve a new state-of-the-art on the Universal Dependencies Dataset 1.4.


Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator

arXiv.org Machine Learning

Multi-agent reinforcement learning has been successfully applied to a number of challenging problems. Despite these empirical successes, theoretical understanding of different algorithms is lacking, primarily due to the curse of dimensionality caused by the exponential growth of the state-action space with the number of agents. We study a fundamental problem of multi-agent linear quadratic regulator in a setting where the agents are partially exchangeable. In this setting, we develop a hierarchical actor-critic algorithm, whose computational complexity is independent of the total number of agents, and prove its global linear convergence to the optimal policy. As linear quadratic regulators are often used to approximate general dynamic systems, this paper provided an important step towards better understanding of general hierarchical mean-field multi-agent reinforcement learning.


Optimal PAC-Bayesian Posteriors for Stochastic Classifiers and their use for Choice of SVM Regularization Parameter

arXiv.org Machine Learning

PAC-Bayesian set up involves a stochastic classifier characterized by a posterior distribution on a classifier set, offers a high probability bound on its averaged true risk and is robust to the training sample used. For a given posterior, this bound captures the trade off between averaged empirical risk and KL-divergence based model complexity term. Our goal is to identify an optimal posterior with the least PAC-Bayesian bound. We consider a finite classifier set and 5 distance functions: KL-divergence, its Pinsker's and a sixth degree polynomial approximations; linear and squared distances. Linear distance based model results in a convex optimization problem. We obtain closed form expression for its optimal posterior. For uniform prior, this posterior has full support with weights negative-exponentially proportional to number of misclassifications. Squared distance and Pinsker's approximation bounds are possibly quasi-convex and are observed to have single local minimum. We derive fixed point equations (FPEs) using partial KKT system with strict positivity constraints. This obviates the combinatorial search for subset support of the optimal posterior. For uniform prior, exponential search on a full-dimensional simplex can be limited to an ordered subset of classifiers with increasing empirical risk values. These FPEs converge rapidly to a stationary point, even for a large classifier set when a solver fails. We apply these approaches to SVMs generated using a finite set of SVM regularization parameter values on 9 UCI datasets. These posteriors yield stochastic SVM classifiers with tight bounds. KL-divergence based bound is the tightest, but is computationally expensive due to non-convexity and multiple calls to a root finding algorithm. Optimal posteriors for all 5 distance functions have lowest 10% test error values on most datasets, with linear distance being the easiest to obtain.


Learning Deep Generative Models with Short Run Inference Dynamics

arXiv.org Machine Learning

This paper studies the fundamental problem of learning deep generative models that consist of one or more layers of latent variables organized in top-down architectures. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. The inference typically requires Markov chain Monte Caro (MCMC) that can be time consuming. In this paper, we propose to use short run inference dynamics guided by the log-posterior, such as finite-step gradient descent algorithm initialized from the prior distribution of the latent variables, as an approximate sampler of the posterior distribution, where the step size of the gradient descent dynamics is optimized by minimizing the Kullback-Leibler divergence between the distribution produced by the short run inference dynamics and the posterior distribution. Our experiments show that the proposed method outperforms variational auto-encoder (VAE) in terms of reconstruction error and synthesis quality. The advantage of the proposed method is that it is natural and automatic, even for models with multiple layers of latent variables.


Adapting Behaviour for Learning Progress

arXiv.org Artificial Intelligence

A BSTRACT Determining what experience to generate to best facilitate learning (i.e. The advent of distributed agents that interact with parallel instances of the environment has enabled larger scales and greater flexibility, but has not removed the need to tune exploration to the task, because the ideal data for the learning algorithm necessarily depends on its process of learning. We propose to dynamically adapt the data generation by using a non-stationary multi-armed bandit to optimize a proxy of the learning progress. The data distribution is controlled by modulating multiple parameters of the policy (such as stochasticity, consistency or optimism) without significant overhead. The adaptation speed of the bandit can be increased by exploiting the factored modulation structure. We demonstrate on a suite of Atari 2600 games how this unified approach produces results comparable to per-task tuning at a fraction of the cost. 1 I NTRODUCTION Reinforcement learning (RL) is a general formalism modelling sequential decision making, which supports making minimal assumptions about the task at hand and reducing the need for prior knowledge. By learning behaviour from scratch, RL agents have the potential to surpass human expertise or tackle complex domains where human intuition is not applicable. In practice, however, generality is often traded for performance and efficiency, with RL practitioners tuning algorithms, architectures and hyper-parameters to the task at hand (Hessel et al., 2019). A side-effect is that the resulting methods can be brittle, or difficult to reliably reproduce (Nagarajan et al., 2018). Exploration is one of the main aspects commonly designed or tuned specifically for the task being solved. Previous work has shown that large sample-efficiency gains are possible, for example, when the exploratory behaviour's level of stochasticity is adjusted to the environment's hazard rate (Garc ıa & Fern andez, 2015), or when an appropriate prior is used in large action spaces (Dulac-Arnold et al., 2015; Czarnecki et al., 2018; Vinyals et al., 2019). Exploration in the presence of function approximation should ideally be agent-centred. It ought to focus more on generating data that supports the agent's learning at its current parameters θ, rather than making progress on objective measurements of information gathering.


Spatial Influence-aware Reinforcement Learning for Intelligent Transportation System

arXiv.org Artificial Intelligence

Intelligent transportation systems (ITSs) are envisioned to be crucial for smart cities, which aims at improving traffic flow to improve the life quality of urban residents and reducing congestion to improve the efficiency of commuting. However, several challenges need to be resolved before such systems can be deployed, for example, conventional solutions for Markov decision process (MDP) and single-agent Reinforcement Learning (RL) algorithms suffer from poor scalability, and multi-agent systems suffer from poor communication and coordination. In this paper, we explore the potential of mutual information sharing, or in other words, spatial influence based communication, to optimize traffic light control policy. First, we mathematically analyze the transportation system. We conclude that the transportation system does not have stationary Nash Equilibrium, thereby reinforcement learning algorithms offer suitable solutions. Secondly, we describe how to build a multi-agent Deep Deterministic Policy Gradient (DDPG) system with spatial influence and social group utility incorporated. Then we utilize the grid topology road network to empirically demonstrate the scalability of the new system. We demonstrate three types of directed communications to show the effect of directions of social influence on the entire network utility and individual utility. Lastly, we define "selfish index" and analyze the effect of it on total group utility.


Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods

arXiv.org Artificial Intelligence

In this article, we report on the efficiency and effectiveness of multiagent reinforcement learning methods (MARL) for the computation of flight delays to resolve congestion problems in the Air Traffic Management (ATM) domain. Specifically, we aim to resolve cases where demand of airspace use exceeds capacity (demand-capacity problems), via imposing ground delays to flights at the pre-tactical stage of operations (i.e. few days to few hours before operation). Casting this into the multiagent domain, agents, representing flights, need to decide on own delays w.r.t. own preferences, having no information about others' payoffs, preferences and constraints, while they plan to execute their trajectories jointly with others, adhering to operational constraints. Specifically, we formalize the problem as a multiagent Markov Decision Process (MA-MDP) and we show that it can be considered as a Markov game in which interacting agents need to reach an equilibrium: What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions in the whole system. We propose collaborative multiagent reinforcement learning methods to resolve demand-capacity imbalances: Extensive experimental study on real-world cases, shows the potential of the proposed approaches in resolving problems, while advanced visualizations provide detailed views towards understanding the quality of solutions provided.