Country
RadioUNet: Fast Radio Map Estimation with Convolutional Neural Networks
Levie, Ron, Yapar, Çağkan, Kutyniok, Gitta, Caire, Giuseppe
In this paper we propose a highly efficient and very accurate method for estimating the propagation pathloss from a point x to all points y on the 2D plane. Our method, termed RadioUNet, is a deep neural network. For applications such as user-cell site association and device-to-device (D2D) link scheduling, an accurate knowledge of the pathloss function for all pairs of locations is very important. Commonly used statistical models approximate the pathloss as a decaying function of the distance between the points. However, in realistic propagation environments characterized by the presence of buildings, street canyons, and objects at different heights, such radial-symmetric functions yield very misleading results. In this paper we show that properly designed and trained deep neural networks are able to learn how to estimate the pathloss function, given an urban environment, very accurately and extremely quickly. Our proposed method generates pathloss estimations that are very close to estimations given by physical simulation, but much faster. Moreover, experimental results show that our method significantly outperforms previously proposed methods based on radial basis function interpolation and tensor completion.
Solving machine learning optimization problems using quantum computers
Dasari, Venkat R., Im, Mee Seong, Beshaj, Lubjana
Classical optimization algorithms in machine learning often take a long time to compute when applied to a multi-dimensional problem and require a huge amount of CPU and GPU resource. Quantum parallelism has a potential to speed up machine learning algorithms. We describe a generic mathematical model to leverage quantum parallelism to speed-up machine learning algorithms. We also apply quantum machine learning and quantum parallelism applied to a $3$-dimensional image that vary with time.
Multi-domain Conversation Quality Evaluation via User Satisfaction Estimation
Bodigutla, Praveen Kumar, Polymenakos, Lazaros, Matsoukas, Spyros
An automated metric to evaluate dialogue quality is vital for optimizing data driven dialogue management. The common approach of relying on explicit user feedback during a conversation is intrusive and sparse. Current models to estimate user satisfaction use limited feature sets and employ annotation schemes with limited generalizability to conversations spanning multiple domains. To address these gaps, we created a new Response Quality annotation scheme, introduced five new domain-independent feature sets and experimented with six machine learning models to estimate User Satisfaction at both turn and dialogue level. Response Quality ratings achieved significantly high correlation (0.76) with explicit turn-level user ratings. Using the new feature sets we introduced, Gradient Boosting Regression model achieved best (rating [1-5]) prediction performance on 26 seen (linear correlation ~0.79) and one new multi-turn domain (linear correlation 0.67). We observed a 16% relative improvement (68% -> 79%) in binary ("satisfactory/dissatisfactory") class prediction accuracy of a domain-independent dialogue-level satisfaction estimation model after including predicted turn-level satisfaction ratings as features.
Exploiting Human Social Cognition for the Detection of Fake and Fraudulent Faces via Memory Networks
Fernando, Tharindu, Fookes, Clinton, Denman, Simon, Sridharan, Sridha
Advances in computer vision have brought us to the point where we have the ability to synthesise realistic fake content. Such approaches are seen as a source of disinformation and mistrust, and pose serious concerns to governments around the world. Convolutional Neural Networks (CNNs) demonstrate encouraging results when detecting fake images that arise from the specific type of manipulation they are trained on. However, this success has not transitioned to unseen manipulation types, resulting in a significant gap in the line-of-defense. We propose a Hierarchical Memory Network (HMN) architecture, which is able to successfully detect faked faces by utilising knowledge stored in neural memories as well as visual cues to reason about the perceived face and anticipate its future semantic embeddings. This renders a generalisable face tampering detection framework. Experimental results demonstrate the proposed approach achieves superior performance for fake and fraudulent face detection compared to the state-of-the-art.
RotationOut as a Regularization Method for Neural Network
A BSTRACT In this paper, we propose a novel regularization method, RotationOut, for neural networks. Different from Dropout that handles each neuron/channel independently, RotationOut regards its input layer as an entire vector and introduces regularization by randomly rotating the vector. RotationOut can also be used in convolutional layers and recurrent layers with small modifications. We further use a noise analysis method to interpret the difference between RotationOut and Dropout in co-adaptation reduction. Using this method, we also show how to use RotationOut/Dropout together with Batch Normalization. Extensive experiments in vision and language tasks are conducted to show the effectiveness of the proposed method. Codes are available at https://github.com/KaiHoo/ RotationOut . 1 I NTRODUCTION Dropout (Srivastava et al., 2014) has proven to be effective for preventing overfitting over many deep learning areas, such as image classification (Shrivastava et al., 2017), natural language processing (Hu et al., 2016) and speech recognition (Amodei et al., 2016). In the years since, a wide range of variants have been proposed for wider scenarios, and most related work focus on the improvement of Dropout structures, i.e., how to drop. For example, drop connect (Wan et al., 2013) drops the weights instead of neurons, evolutional dropout (Li et al., 2016) computes the adaptive dropping probabilities on-the-fly, max-pooling dropout (Wu & Gu, 2015) drops neurons in the max-pooling kernel so smaller feature values have some probabilities to to affect the activations. These Dropout-like methods process each neuron/channel in one layer independently and introduce randomness by dropping. These architectures are certainly simple and effective. However, randomly dropping independently is not the only method to introduce randomness. Hinton et al. (2012) argues that overfitting can be reduced by preventing co-adaptation between feature detectors. Thus it is helpful to consider other neurons' information when adding noise to one neuron. For example, lateral inhibition noise could be more effective than independent noise.
Grassmannian Packings in Neural Networks: Learning with Maximal Subspace Packings for Diversity and Anti-Sparsity
Yap, Dian Ang, Roberts, Nicholas, Prabhu, Vinay Uday
Kernel sparsity ("dying ReLUs") and lack of diversity are commonly observed in CNN kernels, which decreases model capacity. Drawing inspiration from information theory and wireless communications, we demonstrate the intersection of coding theory and deep learning through the Grassmannian subspace packing problem in CNNs. We propose Grassmannian packings for initial kernel layers to be initialized maximally far apart based on chordal or Fubini-Study distance. Convolutional kernels initialized with Grassmannian packings exhibit diverse features and obtain diverse representations. We show that Grassmannian packings, especially in the initial layers, address kernel sparsity and encourage diversity, while improving classification accuracy across shallow and deep CNNs with better convergence rates.
Provable Filter Pruning for Efficient Neural Networks
Liebenwein, Lucas, Baykal, Cenk, Lang, Harry, Feldman, Dan, Rus, Daniela
We present a provable, sampling-based approach for generating compact Convolutional Neural Networks (CNNs) by identifying and removing redundant filters from an over-parameterized network. Our algorithm uses a small batch of input data points to assign a saliency score to each filter and constructs an importance sampling distribution where filters that highly affect the output are sampled with correspondingly high probability. In contrast to existing filter pruning approaches, our method is simultaneously data-informed, exhibits provable guarantees on the size and performance of the pruned network, and is widely applicable to varying network architectures and data sets. Our analytical bounds bridge the notions of compressibility and importance of network structures, which gives rise to a fully-automated procedure for identifying and preserving filters in layers that are essential to the network's performance. Our experimental evaluations on popular architectures and data sets show that our algorithm consistently generates sparser and more efficient models than those constructed by existing filter pruning approaches.
Online Learning and Matching for Resource Allocation Problems
Boskovic, Andrea, Chen, Qinyi, Kufel, Dominik, Zhou, Zijie
In order for an e-commerce platform to maximize its revenue, it must recommend customers items they are most likely to purchase. However, the company often has business constraints on these items, such as the number of each item in stock. In this work, our goal is to recommend items to users as they arrive on a webpage sequentially, in an online manner, in order to maximize reward for a company, but also satisfy budget constraints. We first approach the simpler online problem in which the customers arrive as a stationary Poisson process, and present an integrated algorithm that performs online optimization and online learning together. We then make the model more complicated but more realistic, treating the arrival processes as non-stationary Poisson processes. To deal with heterogeneous customer arrivals, we propose a time segmentation algorithm that converts a non-stationary problem into a series of stationary problems. Experiments conducted on large-scale synthetic data demonstrate the effectiveness and efficiency of our proposed approaches on solving constrained resource allocation problems.
Justification-Based Reliability in Machine Learning
Virani, Nurali, Iyer, Naresh, Yang, Zhaoyuan
With the advent of Deep Learning, the field of machine learning (ML) has surpassed human-level performance on diverse classification tasks. At the same time, there is a stark need to characterize and quantify reliability of a model's prediction on individual samples. This is especially true in application of such models in safety-critical domains of industrial control and healthcare. To address this need, we link the question of reliability of a model's individual prediction to the epistemic uncertainty of the model's prediction. More specifically, we extend the theory of Justified True Belief (JTB) in epistemology, created to study the validity and limits of human-acquired knowledge, towards characterizing the validity and limits of knowledge in supervised classifiers. We present an analysis of neural network classifiers linking the reliability of its prediction on an input to characteristics of the support gathered from the input and latent spaces of the network. We hypothesize that the JTB analysis exposes the epistemic uncertainty (or ignorance) of a model with respect to its inference, thereby allowing for the inference to be only as strong as the justification permits. We explore various forms of support (for e.g., k-nearest neighbors (k-NN) and l_p-norm based) generated for an input, using the training data to construct a justification for the prediction with that input. Through experiments conducted on simulated and real datasets, we demonstrate that our approach can provide reliability for individual predictions and characterize regions where such reliability cannot be ascertained.
Predicting colorectal polyp recurrence using time-to-event analysis of medical records
Harrington, Lia X., Wei, Jason W., Suriawinata, Arief A., Mackenzie, Todd A., Hassanpour, Saeed
Identifying patient characteristics that influence the rate of colorectal polyp recurrence can provide important insights into which patients are at higher risk for recurrence. We used natural language processing to extract polyp morphological characteristics from 953 polyp-presenting patients' electronic medical records. We used subsequent colonoscopy reports to examine how the time to polyp recurrence (731 patients experienced recurrence) is influenced by these characteristics as well as anthropometric features using Kaplan-Meier curves, Cox proportional hazards modeling, and random survival forest models. We found that the rate of recurrence differed significantly by polyp size, number, and location and patient smoking status. Additionally, right-sided colon polyps increased recurrence risk by 30% compared to left-sided polyps. History of tobacco use increased polyp recurrence risk by 20% compared to never-users. A random survival forest model showed an AUC of 0.65 and identified several other predictive variables, which can inform development of personalized polyp surveillance plans.