Field Label Prediction for Autofill in Web Browsers

arXiv.org Machine Learning

Automatic form fill is an important productivity related feature present in major web browsers, which predicts the field labels of a web form and automatically fills values in a new form based on the values previously filled for the same field in other forms. This feature increases the convenience and efficiency of users who have to fill similar information in fields in multiple forms. In this paper we describe a machine learning solution for predicting the form field labels, implemented as a web service using Azure ML Studio.


Bayesian Topological Learning for Brain State Classification

arXiv.org Machine Learning

Investigation of human brain states through electroencephalograph (EEG) signals is a crucial step in human-machine communications. However, classifying and analyzing EEG signals is challenging due to their noisy, nonlinear and nonstationary nature. Current methodologies for analyzing these signals often fall short because they have several regularity assumptions baked in. This work provides an effective, flexible and noise-resilient scheme to analyze EEG by extracting pertinent information while abiding by the 3N (noisy, nonlinear and nonstationary) nature of the data. We implement a topological tool, namely persistent homology, that tracks the evolution of topological features over time intervals and incorporates an individual's expectations as prior knowledge by means of a Bayesian framework to compute posterior distributions. Relying on these posterior distributions, we apply Bayes factor classification to noisy EEG measurements. The performance of this Bayesian classification scheme is then compared with other existing methods for EEG signals.


Finite-Time Convergence of Continuous-Time Optimization Algorithms via Differential Inclusions

arXiv.org Machine Learning

In this paper, we propose two discontinuous dynamical systems in continuous time with guaranteed prescribed finite-time local convergence to strict local minima of a given cost function. Our approach consists of exploiting a Lyapunov-based differential inequality for differential inclusions, which leads to finite-time stability and thus finite-time convergence with a provable bound on the settling time. In particular, for exact solutions to the aforementioned differential inequality, the settling-time bound is also exact, thus achieving prescribed finite-time convergence. We thus construct a class of discontinuous dynamical systems, of second order with respect to the cost function, that serve as continuous-time optimization algorithms with finite-time convergence and prescribed convergence time. Finally, we illustrate our results on the Rosenbrock function.
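
As an illustration of how a Lyapunov-type differential inequality can yield a settling-time bound, consider the following standard worked example. The specific inequality, the constants $c$ and $\alpha$, and the function $V$ are assumptions made here for illustration; the abstract does not state the exact form used in the paper. Suppose $V(t) \geq 0$ satisfies

$$\dot V(t) \leq -c\, V(t)^{\alpha}, \qquad c > 0,\quad \alpha \in [0, 1).$$

While $V(t) > 0$, differentiating $V^{1-\alpha}$ gives

$$\frac{d}{dt} V(t)^{1-\alpha} = (1-\alpha)\, V(t)^{-\alpha}\, \dot V(t) \leq -c\,(1-\alpha),$$

so that $V(t)^{1-\alpha} \leq V(0)^{1-\alpha} - c\,(1-\alpha)\, t$, and $V$ reaches zero no later than

$$T \leq \frac{V(0)^{1-\alpha}}{c\,(1-\alpha)}.$$

If the inequality holds with equality, the bound is attained exactly, which is the sense in which an exact solution of the differential inequality gives a prescribed convergence time.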


Kernel-Based Ensemble Learning in Python

arXiv.org Machine Learning

We propose a new supervised learning algorithm, for classification and regression problems where two or more preliminary predictors are available. We introduce KernelCobra, a non-linear learning strategy for combining an arbitrary number of initial predictors. KernelCobra builds on the COBRA algorithm introduced by Biau et al. (2016), which combined estimators based on a notion of proximity of predictions on the training data. While the COBRA algorithm used a binary threshold to declare which training data were close and to be used, we generalize this idea by using a kernel to better encapsulate the proximity information. Such a smoothing kernel provides more representative weights to each of the training points which are used to build the aggregate and final predictor, and KernelCobra systematically outperforms the COBRA algorithm. While COBRA is intended for regression, KernelCobra deals with classification and regression. KernelCobra is included as part of the open source Python package Pycobra (0.2.4 and onward), introduced by Guedj and Srinivasa Desikan (2018). Numerical experiments assess the performance (in terms of pure prediction and computational complexity) of KernelCobra on real-life and synthetic datasets.
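
The contrast between a binary threshold and a smoothing kernel can be sketched in a few lines of Python. This is a minimal illustration written for this listing, not the Pycobra API: the function names, the Gaussian kernel, and the `bandwidth` and `epsilon` parameters are all assumptions, and the preliminary predictors are taken to be plain callables.

```python
import math


def cobra_predict(machines, train_x, train_y, query, epsilon=0.5):
    """Threshold-style aggregation: average the targets of training points
    whose predictions are within epsilon of the query's, for every machine."""
    query_preds = [m(query) for m in machines]
    kept = [
        y
        for x, y in zip(train_x, train_y)
        if all(abs(m(x) - q) <= epsilon for m, q in zip(machines, query_preds))
    ]
    if not kept:  # no training point is close: fall back to the global mean
        return sum(train_y) / len(train_y)
    return sum(kept) / len(kept)


def kernel_cobra_predict(machines, train_x, train_y, query, bandwidth=1.0):
    """Kernel-style aggregation: weight every training target by a Gaussian
    kernel of the distance between prediction vectors (assumed kernel)."""
    query_preds = [m(query) for m in machines]
    weights = []
    for x in train_x:
        sq_dist = sum((m(x) - q) ** 2 for m, q in zip(machines, query_preds))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:  # all weights underflowed: fall back to the global mean
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Two toy preliminary predictors and a tiny training set (hypothetical data).
machines = [lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0]
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.0, 2.0, 4.0, 6.0]
hard = cobra_predict(machines, train_x, train_y, 1.0, epsilon=0.5)
soft = kernel_cobra_predict(machines, train_x, train_y, 1.0, bandwidth=1.0)
```

In the threshold version a training point either counts fully or not at all, whereas the kernel version lets every point contribute with a weight that decays smoothly with its distance in prediction space.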


Lower Memory Oblivious (Tensor) Subspace Embeddings with Fewer Random Bits: Modewise Methods for Least Squares

arXiv.org Machine Learning

In this paper new general modewise Johnson-Lindenstrauss (JL) subspace embeddings are proposed that are both considerably faster to generate and easier to store than traditional JL embeddings when working with extremely large vectors and/or tensors. Corresponding embedding results are then proven for two different types of low-dimensional (tensor) subspaces. The first of these new subspace embedding results produces improved space complexity bounds for embeddings of rank-$r$ tensors whose CP decompositions are contained in the span of a fixed (but unknown) set of $r$ rank-one basis tensors. In the traditional vector setting this first result yields new and very general near-optimal oblivious subspace embedding constructions that require fewer random bits to generate than standard JL embeddings when embedding subspaces of $\mathbb{C}^N$ spanned by basis vectors with special Kronecker structure. The second result proven herein provides new fast JL embeddings of arbitrary $r$-dimensional subspaces $\mathcal{S} \subset \mathbb{C}^N$ which also require fewer random bits (and so are easier to store - i.e., require less space) than standard fast JL embedding methods in order to achieve small $\epsilon$-distortions. These new oblivious subspace embedding results work by $(i)$ effectively folding any given vector in $\mathcal{S}$ into a (not necessarily low-rank) tensor, and then $(ii)$ embedding the resulting tensor into $\mathbb{C}^m$ for $m \leq C r \log^c(N) / \epsilon^2$. Applications related to compression and fast compressed least squares solution methods are also considered, including those used for fitting low-rank CP decompositions, and the proposed JL embedding results are shown to work well numerically in both settings.
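
The appeal of a modewise map on Kronecker-structured vectors can be seen from the identity $(A \otimes B)(u \otimes v) = (Au) \otimes (Bv)$: applying one small map per mode gives the same result as one large map, without ever forming or storing the large matrix. The sketch below checks this identity in plain Python. The random sign matrices, the helper names, and the two-mode setting are assumptions chosen for illustration and are not the constructions analyzed in the paper.

```python
import random


def random_sign_matrix(rows, cols, rng):
    """Random +-1/sqrt(rows) matrix, a simple JL-style map (assumed choice)."""
    scale = 1.0 / rows ** 0.5
    return [[rng.choice((-scale, scale)) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def kron_vec(u, v):
    """Kronecker product of two vectors (a vectorized rank-one 2-mode tensor)."""
    return [a * b for a in u for b in v]


def kron_mat(a, b):
    """Kronecker product of two matrices, formed explicitly for comparison only."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def modewise_embed(mode_maps, factors):
    """Embed factors[0] (x) factors[1] (x) ... by mapping each mode separately."""
    result = [1.0]
    for mode_map, factor in zip(mode_maps, factors):
        result = kron_vec(result, matvec(mode_map, factor))
    return result


rng = random.Random(0)
A = random_sign_matrix(2, 4, rng)  # acts on mode 1 (dimension 4 -> 2)
B = random_sign_matrix(3, 5, rng)  # acts on mode 2 (dimension 5 -> 3)
u = [1.0, 2.0, 3.0, 4.0]
v = [1.0, 0.0, -1.0, 2.0, 0.5]

fast = modewise_embed([A, B], [u, v])             # stores 2*4 + 3*5 entries
slow = matvec(kron_mat(A, B), kron_vec(u, v))     # stores 6*20 entries
```

Here the modewise route needs 23 random entries where the explicit Kronecker matrix has 120, which gives a small-scale picture of why such maps need fewer random bits and less storage.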


On the Bias-Variance Tradeoff: Textbooks Need an Update

arXiv.org Machine Learning

The main goal of this thesis is to point out that the bias-variance tradeoff is not always true (e.g. in neural networks). We advocate for this lack of universality to be acknowledged in textbooks and taught in introductory courses that cover the tradeoff. We first review the history of the bias-variance tradeoff, its prevalence in textbooks, and some of the main claims made about the bias-variance tradeoff. Through extensive experiments and analysis, we show a lack of a bias-variance tradeoff in neural networks when increasing network width. Our findings seem to contradict the claims of the landmark work by Geman et al. (1992). Motivated by this contradiction, we revisit the experimental measurements in Geman et al. (1992). We discuss that there was never strong evidence for a tradeoff in neural networks when varying the number of parameters. We observe a similar phenomenon beyond supervised learning, with a set of deep reinforcement learning experiments. We argue that textbook and lecture revisions are in order to convey this nuanced modern understanding of the bias-variance tradeoff.
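
For reference, the textbook decomposition behind the tradeoff can be written out as follows. The notation is assumed here and is not taken from the thesis: $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\hat f$ is the predictor learned from a random training set, with expectations taken over training sets and noise. Under squared loss,

$$\mathbb{E}\big[(y - \hat f(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}\big[(\hat f(x) - \mathbb{E}[\hat f(x)])^2\big]}_{\text{variance}} + \sigma^2.$$

The decomposition itself always holds; the tradeoff is the additional claim that, as model complexity grows, the bias term falls while the variance term rises. It is this second claim that the thesis argues does not hold universally.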


Transfer learning in hybrid classical-quantum neural networks

arXiv.org Machine Learning

Transfer learning is a typical example of an artificial intelligence technique that was originally inspired by biological intelligence. It originates from the simple observation that the knowledge acquired in a specific context can be transferred to a different area. For example, when we learn a second language we do not start from scratch, but we make use of our previous linguistic knowledge. Sometimes transfer learning is the only way to approach complex cognitive tasks, e.g., before learning quantum mechanics it is advisable to first study linear algebra. This general idea has also been successfully applied to the design of artificial neural networks [1-3]. It has been shown [4, 5] that in many situations, instead of training a full network from scratch, it is more efficient to start from a pre-trained deep network and then optimize only some of the final layers for a particular task and dataset of interest (see Figure 1).


Extrinsic Kernel Ridge Regression Classifier for Planar Kendall Shape Space

arXiv.org Machine Learning

Kernel methods have had great success in the statistics and machine learning community. Despite their growing popularity, however, less effort has been devoted to developing kernel-based classification methods on manifolds, owing to their non-Euclidean geometry. In this paper, motivated by the extrinsic framework of manifold-valued data analysis, we propose two new types of kernels on the planar Kendall shape space $\Sigma_2^k$, called the extrinsic Veronese Whitney Gaussian kernel and the extrinsic complex Gaussian kernel. We show that our approach can be extended to develop Gaussian-like kernels on any embedded manifold. Furthermore, a kernel ridge regression classifier (KRRC) is implemented to address the shape classification problem on $\Sigma_2^k$, and its promising performance with both kernels is illustrated on a real dataset.


Improved Surrogates in Inertial Confinement Fusion with Manifold and Cycle Consistencies

arXiv.org Machine Learning

Neural networks have become very popular in surrogate modeling because of their ability to characterize arbitrary, high dimensional functions in a data driven fashion. This paper advocates for the training of surrogates that are consistent with the physical manifold (i.e., predictions are always physically meaningful) and are cyclically consistent (i.e., the predictions of the surrogate, when passed through an independently trained inverse model, give back the original input parameters). We find that these two consistencies lead to surrogates that are superior in terms of predictive performance, more resilient to sampling artifacts, and tend to be more data efficient. Using Inertial Confinement Fusion (ICF) as a test bed problem, we model a 1D semi-analytic numerical simulator and demonstrate the effectiveness of our approach. Code and data are available at https://github.com/rushilanirudh/macc/
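
The cycle-consistency idea can be sketched as a training objective that adds a round-trip penalty to the usual prediction error. The sketch below is a hypothetical illustration in plain Python and is not taken from the linked repository: the function names, the squared-error terms, and the `cycle_weight` parameter are assumptions, and `forward` and `inverse` stand in for the surrogate and the independently trained inverse model.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cycle_consistent_loss(forward, inverse, inputs, targets, cycle_weight=1.0):
    """Mean over samples of prediction error plus a weighted round-trip error.

    forward: maps input parameters to predicted outputs (the surrogate).
    inverse: maps outputs back to input parameters (assumed fixed here).
    """
    total = 0.0
    for x, y in zip(inputs, targets):
        prediction = forward(x)
        total += squared_error(prediction, y)  # fit the simulator output
        total += cycle_weight * squared_error(inverse(prediction), x)  # round trip
    return total / len(inputs)


# Toy check: a surrogate that doubles its input, paired with two inverse models.
double = lambda x: [2.0 * v for v in x]
halve = lambda y: [v / 2.0 for v in y]
identity = lambda y: list(y)

consistent = cycle_consistent_loss(double, halve, [[1.0]], [[2.0]])
inconsistent = cycle_consistent_loss(double, identity, [[1.0]], [[2.0]], cycle_weight=0.5)
```

With the matching inverse the round trip recovers the input and the loss is zero; with a mismatched inverse the penalty is nonzero even though the forward prediction is exact.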


HCNAF: Hyper-Conditioned Neural Autoregressive Flow and its Application for Probabilistic Occupancy Map Forecasting

arXiv.org Machine Learning

We introduce Hyper-Conditioned Neural Autoregressive Flow (HCNAF), a powerful universal distribution approximator designed to model arbitrarily complex conditional probability density functions. HCNAF consists of a neural-net based conditional autoregressive flow (AF) and a hyper-network that can take large conditions in non-autoregressive fashion and outputs the network parameters of the AF. Like other flow models, HCNAF performs exact likelihood inference. We demonstrate the effectiveness and attributes of HCNAF, including its generalization capability over unseen conditions, and show that HCNAF outperforms recent flow models in a conditional density estimation task for MNIST. We also show that HCNAF scales up to complex high-dimensional prediction problems of the magnitude of self-driving and that HCNAF yields state-of-the-art performance on a public self-driving dataset.