Country
A Bayesian Approach to Modelling Longitudinal Data in Electronic Health Records
Bellot, Alexis, van der Schaar, Mihaela
Analyzing electronic health records (EHR) poses significant challenges because often few samples are available describing a patient's health and, when available, their information content is highly diverse. The problem we consider is how to integrate sparsely sampled longitudinal data, missing measurements informative of the underlying health status and fixed demographic information to produce estimated survival distributions updated through a patient's follow up. We propose a nonparametric probabilistic model that generates survival trajectories from an ensemble of Bayesian trees that learns variable interactions over time without specifying beforehand the longitudinal process. We show performance improvements on Primary Biliary Cirrhosis patient data.
Spiking Networks for Improved Cognitive Abilities of Edge Computing Devices
Akusok, Anton, Bjรถrk, Kaj-Mikael, Leal, Leonardo Espinosa, Miche, Yoan, Hu, Renjie, Lendasse, Amaury
A sudden realization came to our minds while preparing this white paper - mobile phones are the first type of devices that received dedicated math accelerators at a pervasive scale. Such things never got wide adoption before: Intel 8087 co-processor[11], Intel Xeon Phi[2, 5] or Google TPU (Tensor Processing Unit)[6] stayed niche devices that few people use and even fewer develop for. But since the last two years, major mobile phone companies include dedicated co-processors[4] necessary for computational photography enhancement or facial recognition, that are suitable for general machine learning. Currently the dominant analytical approach stores data and runs computations in the Cloud[12]. However Cloud based methods poorly fit to a range of important practical applications including augmented reality, real-time data analysis, real-time user interaction, or processing sensitive data that incur high risks for a company if leaked, stolen or intercepted in transfer. The price of deployed analytical methods is increased by the need to have a permanently working internet connection for users, and cloud hardware rent for service providers.
A Maximum Entropy approach to Massive Graph Spectra
Granziol, Diego, Ru, Robin, Zohren, Stefan, Dong, Xiaowen, Osborne, Michael, Roberts, Stephen
Machine Learning Research Group and Oxford-Man Institute for Quantitative Finance, Department of Engineering Science, University of Oxford Abstract Graph spectral techniques for measuring graph similarity, or for learning the cluster number, require kernel smoothing. The choice of kernel function and bandwidth are typically chosen in an ad-hoc manner and heavily affect the resulting output. We prove that kernel smoothing biases the moments of the spectral density. We propose an information theoretically optimal approach to learn a smooth graph spectral density, which fully respects the moment information. Our method's computational cost is linear in the number of edges, and hence can be applied to large networks, with millions of nodes. We apply our method to the problems to graph similarity and cluster number learning, where we outperform comparable iterative spectral approaches on synthetic and real graphs. Keywords: Networks, Information Theory, Maximum Entropy, Graph Spectral Theory, Random matrix theory, iterative methods, kernel smoothing 1. Introduction: networks, their graph spectra and importance Many systems of interest can be naturally characterised by complex networks; examples include social networks (Mislove et al., 2007b; Flake et al., 2000; Leskovec et al., 2007), biological networks (Palla et al., 2005) and technological networks.
Reducing Selection Bias in Counterfactual Reasoning for Individual Treatment Effects Estimation
Zhang, Zichen, Lan, Qingfeng, Ding, Lei, Wang, Yue, Hassanpour, Negar, Greiner, Russell
Counterfactual reasoning is an important paradigm applicable in many fields, such as healthcare, economics, and education. In this work, we propose a novel method to address the issue of \textit{selection bias}. We learn two groups of latent random variables, where one group corresponds to variables that only cause selection bias, and the other group is relevant for outcome prediction. They are learned by an auto-encoder where an additional regularized loss based on Pearson Correlation Coefficient (PCC) encourages the de-correlation between the two groups of random variables. This allows for explicitly alleviating selection bias by only keeping the latent variables that are relevant for estimating individual treatment effects. Experimental results on a synthetic toy dataset and a benchmark dataset show that our algorithm is able to achieve state-of-the-art performance and improve the result of its counterpart that does not explicitly model the selection bias.
TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning
Yu, Zhongjie, Chen, Lin, Cheng, Zhongwei, Luo, Jiebo
The successful application of deep learning to many visual recognition tasks relies heavily on the availability of a large amount of labeled data which is usually expensive to obtain. The few-shot learning problem has attracted increasing attention from researchers for building a robust model upon only a few labeled samples. Most existing works tackle this problem under the meta-learning framework by mimicking the few-shot learning task with an episodic training strategy. In this paper, we propose a new transfer-learning framework for semi-supervised few-shot learning to fully utilize the auxiliary information from labeled base-class data and unlabeled novel-class data. The framework consists of three components: 1) pre-training a feature extractor on base-class data; 2) using the feature extractor to initialize the classifier weights for the novel classes; and 3) further updating the model with a semi-supervised learning method. Under the proposed framework, we develop a novel method for semi-supervised few-shot learning called TransMatch by instantiating the three components with Imprinting and MixMatch. Extensive experiments on two popular benchmark datasets for few-shot learning, CUB-200-2011 and miniImageNet, demonstrate that our proposed method can effectively utilize the auxiliary information from labeled base-class data and unlabeled novel-class data to significantly improve the accuracy of few-shot learning task.
Bounded Manifold Completion
Gajamannage, Kelum, Paffenroth, Randy
Nonlinear dimensionality reduction or, equivalently, the approximation of high-dimensional data using a low-dimensional nonlinear manifold is an active area of research. In this paper, we will present a thematically different approach to detect the existence of a low-dimensional manifold of a given dimension that lies within a set of bounds derived from a given point cloud. A matrix representing the appropriately defined distances on a low-dimensional manifold is low-rank, and our method is based on current techniques for recovering a partially observed matrix from a small set of fully observed entries that can be implemented as a low-rank Matrix Completion (MC) problem. MC methods are currently used to solve challenging real-world problems, such as image inpainting and recommender systems, and we leverage extent efficient optimization techniques that use a nuclear norm convex relaxation as a surrogate for non-convex and discontinuous rank minimization. Our proposed method provides several advantages over current nonlinear dimensionality reduction techniques, with the two most important being theoretical guarantees on the detection of low-dimensional embeddings and robustness to non-uniformity in the sampling of the manifold. We validate the performance of this approach using both a theoretical analysis as well as synthetic and real-world benchmark datasets.
Deep Radar Waveform Design for Efficient Automotive Radar Sensing
Khobahi, Shahin, Bose, Arindam, Soltanalian, Mojtaba
In radar systems, unimodular (or constant-modulus) waveform design plays an important role in achieving better clutter/interference rejection, as well as a more accurate estimation of the target parameters. The design of such sequences has been studied widely in the last few decades, with most design algorithms requiring sophisticated a priori knowledge of environmental parameters which may be difficult to obtain in real-time scenarios. In this paper, we propose a novel hybrid model-driven and data-driven architecture that adapts to the ever changing environment and allows for adaptive unimodular waveform design. In particular, the approach lays the groundwork for developing extremely low-cost waveform design and processing frameworks for radar systems deployed in autonomous vehicles. The proposed model-based deep architecture imitates a well-known unimodular signal design algorithm in its structure, and can quickly infer statistical information from the environment using the observed data. Our numerical experiments portray the advantages of using the proposed method for efficient radar waveform design in time-varying environments.
Soft Q-network
Liu, Jingbin, Gu, Xinyang, Liu, Shuai, Zhang, Dexiang
When DQN is announced by deepmind in 2013, the whole world is surprised by the simplicity and promising result, but due to the low efficiency and stability of this method, it is hard to solve many problems. After all these years, people purposed more and more complicated ideas for improving, many of them use distributed Deep-RL which needs tons of cores to run the simulators. However, the basic ideas behind all this technique are sometimes just a modified DQN. So we asked a simple question, is there a more elegant way to improve the DQN model? Instead of adding more and more small fixes on it, we redesign the problem setting under a popular entropy regularization framework which leads to better performance and theoretical guarantee. Finally, we purposed SQN, a new off-policy algorithm with better performance and stability.
Uncertainty-sensitive Learning and Planning with Ensembles
Miลoล, Piotr, Kuciลski, ลukasz, Czechowski, Konrad, Kozakowski, Piotr, Klimek, Maciek
We propose a reinforcement learning framework for discrete environments in which an agent makes both strategic and tactical decisions. The former manifests itself through the use of value function, while the latter is powered by a tree search planner. These tools complement each other. The planning module performs a local \textit{what-if} analysis, which allows to avoid tactical pitfalls and boost backups of the value function. The value function, being global in nature, compensates for inherent locality of the planner. In order to further solidify this synergy, we introduce an exploration mechanism with two distinctive components: uncertainty modelling and risk measurement. To model the uncertainty we use value function ensembles, and to reflect risk we use propose several functionals that summarize the implied by the ensemble. We show that our method performs well on hard exploration environments: Deep-sea, toy Montezuma's Revenge, and Sokoban. In all the cases, we obtain speed-up in learning and boost in performance.
Understanding Deep Neural Network Predictions for Medical Imaging Applications
Narayanan, Barath Narayanan, De Silva, Manawaduge Supun, Hardie, Russell C., Kueterman, Nathan K., Ali, Redha
Computer-aided detection has been a research area attracting great interest in the past decade. Machine learning algorithms have been utilized extensively for this application as they provide a valuable second opinion to the doctors. Despite several machine learning models being available for medical imaging applications, not many have been implemented in the real-world due to the uninterpretable nature of the decisions made by the network. In this paper, we investigate the results provided by deep neural networks for the detection of malaria, diabetic retinopathy, brain tumor, and tuberculosis in different imaging modalities. We visualize the class activation mappings for all the applications in order to enhance the understanding of these networks. This type of visualization, along with the corresponding network performance metrics, would aid the data science experts in better understanding of their models as well as assisting doctors in their decision-making process.