Hierarchical Optimistic Region Selection driven by Curiosity

Neural Information Processing Systems

This paper aims to take a step toward making the term "intrinsic motivation" from reinforcement learning theoretically well founded, focusing on curiosity-driven learning. To that end, we consider the following setting: given a fixed partition P of a continuous space X and an unknown process u defined on X, we must sequentially decide which cell of the partition to select and where to sample u within that cell, so as to minimize a loss function inspired by previous work on curiosity-driven learning. The loss on each cell consists of one term measuring a simple worst-case quadratic sampling error, and a penalty term proportional to the range of the variance in that cell. The corresponding problem formulation extends the setting known as active learning for multi-armed bandits to the case where each arm is a continuous region, and we show how an adaptation of recent algorithms for that problem and of hierarchical optimistic sampling algorithms for optimization can be used to solve it. The resulting procedure, called Hierarchical Optimistic Region SElection driven by Curiosity (HORSE.C), is provided together with a finite-time regret analysis.
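As a rough illustration of the optimistic flavor of this approach (not the HORSE.C algorithm itself), the sketch below allocates samples across a fixed partition of [0, 1] by repeatedly selecting the cell whose estimated variance plus an exploration bonus is largest. The process `noisy_process`, the bonus shape, and the weight `lam` are illustrative assumptions; the paper's actual per-cell loss (worst-case quadratic error plus a variance-range penalty) is only caricatured by the variance score here.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_process(x):
    """Stand-in for the unknown process u: heteroscedastic noise on [0, 1]."""
    return np.sin(6 * x) + rng.normal(0.0, 0.1 + 0.9 * x)

n_cells, horizon, lam = 4, 500, 0.5          # illustrative constants
edges = np.linspace(0.0, 1.0, n_cells + 1)   # fixed partition P of X = [0, 1]
samples = [[] for _ in range(n_cells)]

# Warm-up: two samples per cell so empirical variances are defined.
for c in range(n_cells):
    for _ in range(2):
        samples[c].append(noisy_process(rng.uniform(edges[c], edges[c + 1])))

for t in range(horizon):
    scores = []
    for c in range(n_cells):
        ys = np.asarray(samples[c])
        var_hat = ys.var(ddof=1)                        # empirical variance
        bonus = np.sqrt(2.0 * np.log(t + 2) / len(ys))  # optimism bonus
        scores.append(var_hat + lam * bonus)
    c = int(np.argmax(scores))                          # optimistic cell choice
    samples[c].append(noisy_process(rng.uniform(edges[c], edges[c + 1])))

print([len(s) for s in samples])  # noisier cells end up sampled more often
```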


Monte Carlo Sampling for Regret Minimization in Extensive Games

Neural Information Processing Systems

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent, sample-based CFR algorithms called Monte Carlo counterfactual regret minimization (MCCFR), of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR in expectation.
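To make the "same updates in expectation" claim concrete, here is a minimal Python sketch of a sampled regret update on a one-shot zero-sum matrix game (rock-paper-scissors). It mirrors the external-sampling member of the MCCFR family, where the opponent's action is sampled from their own strategy, so the sampled regret equals the exact regret-matching update in expectation; all extensive-form machinery (information sets, reach probabilities) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
# Rock-paper-scissors payoffs for the row player (one-shot zero-sum toy game).
A = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
n = A.shape[0]
regrets = [np.zeros(n), np.zeros(n)]
strat_sum = [np.zeros(n), np.zeros(n)]

def regret_matching(r):
    """Play positive regrets proportionally; uniform when none are positive."""
    pos = np.maximum(r, 0.0)
    return pos / pos.sum() if pos.sum() > 0 else np.full(len(r), 1.0 / len(r))

for t in range(50_000):
    s = [regret_matching(regrets[0]), regret_matching(regrets[1])]
    for p in range(2):
        u = A if p == 0 else -A.T          # payoffs from player p's viewpoint
        b = rng.choice(n, p=s[1 - p])      # sample only the opponent's action
        cf = u[:, b]                       # counterfactual value of each action
        regrets[p] += cf - s[p] @ cf       # sampled regret update
        strat_sum[p] += s[p]

# Average strategies approach the uniform Nash equilibrium of the game.
print([np.round(ss / ss.sum(), 3) for ss in strat_sum])
```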


Q-MKL: Matrix-induced Regularization in Multi-Kernel Learning with Applications to Neuroimaging

Neural Information Processing Systems

Multiple Kernel Learning (MKL) generalizes SVMs to the setting where one simultaneously trains a linear classifier and chooses an optimal combination of given base kernels. Model complexity is typically controlled using various norm regularizations on the vector of base kernel mixing coefficients. Existing methods, however, neither regularize nor exploit potentially useful information pertaining to how kernels in the input set 'interact'; that is, higher-order kernel-pair relationships that can be easily obtained via unsupervised (similarity, geodesics), supervised (correlation in errors), or domain-knowledge-driven mechanisms (which features were used to construct the kernel?). We show that by substituting the norm penalty with an arbitrary quadratic function Q ⪰ 0, one can impose a desired covariance structure on mixing-coefficient selection, and use this as an inductive bias when learning the concept. This formulation significantly generalizes the widely used 1- and 2-norm MKL objectives.
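The sketch below shows the one-line change at the heart of this idea: the usual squared-norm penalty on the mixing coefficients is replaced by a quadratic form beta' Q beta, where Q ⪰ 0 encodes kernel-pair relationships. The alignment-based construction of Q and all shapes are illustrative assumptions; the full Q-MKL optimization (jointly with the SVM) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three toy base kernels on 20 points: RBFs with different bandwidths.
X = rng.normal(size=(20, 2))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
kernels = [np.exp(-d2 / (2.0 * s ** 2)) for s in (0.5, 1.0, 2.0)]
M = len(kernels)

# One unsupervised way to build Q: pairwise kernel alignment, so that
# redundant (highly correlated) kernels are penalized together.
flat = np.stack([K.ravel() for K in kernels])
Q = flat @ flat.T / flat.shape[1] + 1e-6 * np.eye(M)   # PSD by construction

def q_penalty(beta):
    """The quadratic regularizer beta' Q beta replacing a plain squared norm."""
    return beta @ Q @ beta

beta = np.full(M, 1.0 / M)                    # some mixing coefficients
K_combined = sum(b * K for b, K in zip(beta, kernels))
print(q_penalty(beta), K_combined.shape)      # penalty value, combined kernel
```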


MDPs with Non-Deterministic Policies

Neural Information Processing Systems

Markov Decision Processes (MDPs) have been extensively studied and used in the context of planning and decision-making, and many methods exist to find the optimal policy for problems modelled as MDPs. Although finding the optimal policy is sufficient in many domains, in certain applications, such as decision support systems where the policy is executed by a human (rather than a machine), finding all possible near-optimal policies can be useful, as it gives the person executing the policy more flexibility. In this paper we introduce the concept of non-deterministic MDP policies and address the question of finding near-optimal non-deterministic policies. We propose two solutions to this problem, one based on a mixed-integer program and the other based on a search algorithm. We include experimental results obtained by applying this framework to optimize treatment choices in the context of a medical decision support system.
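As a deliberately simplified illustration of what a near-optimal non-deterministic policy can look like, the sketch below runs value iteration on a toy MDP and then keeps, in each state, every action whose Q-value is within an eps margin of the best. The MDP, the margin rule, and the threshold are assumptions for illustration; the paper's MIP and search formulations address the harder question of guaranteeing near-optimality for every way of executing the action sets.

```python
import numpy as np

# Tiny MDP: 3 states, 2 actions. P[a, s, s'] transitions, R[s, a] rewards.
P = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],   # action 0
    [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.5, 0.0, 0.5]],   # action 1
])
R = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.8]])
gamma, eps = 0.95, 0.05

V = np.zeros(3)
for _ in range(2000):                              # value iteration
    Q = R + gamma * np.einsum('aij,j->ia', P, V)   # Q[s, a]
    V = Q.max(axis=1)

# Non-deterministic policy: keep, in each state, every action whose value is
# within an eps fraction of the best; the human picks among them at run time.
policy = {s: [a for a in range(2) if Q[s, a] >= V[s] - eps * abs(V[s])]
          for s in range(3)}
print(policy)
```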


Scaled Gradients on Grassmann Manifolds for Matrix Completion

Neural Information Processing Systems

This paper describes gradient methods based on a scaled metric on the Grassmann manifold for low-rank matrix completion. The proposed methods significantly improve canonical gradient methods, especially on ill-conditioned matrices, while maintaining established global convergence and exact recovery guarantees. A connection between a form of subspace iteration for matrix completion and the scaled gradient descent procedure is also established. The proposed conjugate gradient method based on the scaled gradient outperforms several existing algorithms for matrix completion and is competitive with recently proposed methods.
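The sketch below illustrates the general idea of scaling gradients by factor Gram-matrix inverses, which is what buys robustness to ill-conditioning. It operates on a plain two-factor parameterization rather than the Grassmann manifold, so it should be read as a related caricature under assumed choices (spectral initialization, step size, sampling rate), not this paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 50, 40, 3
M_star = rng.normal(size=(m, r)) @ rng.normal(size=(n, r)).T  # rank-r target
mask = rng.random((m, n)) < 0.5                  # half the entries observed

# Spectral initialization from the zero-filled, rescaled observations.
U, S, Vt = np.linalg.svd(mask * M_star / mask.mean(), full_matrices=False)
L, Rf = U[:, :r] * np.sqrt(S[:r]), Vt[:r].T * np.sqrt(S[:r])

eta = 0.5
for _ in range(200):
    E = mask * (L @ Rf.T - M_star)               # residual on observed entries
    # Precondition each factor's gradient by the other factor's Gram inverse;
    # this scaling is what makes the steps robust to ill-conditioning.
    Lg = E @ Rf @ np.linalg.inv(Rf.T @ Rf + 1e-8 * np.eye(r))
    Rg = E.T @ L @ np.linalg.inv(L.T @ L + 1e-8 * np.eye(r))
    L, Rf = L - eta * Lg, Rf - eta * Rg

err = np.linalg.norm(L @ Rf.T - M_star) / np.linalg.norm(M_star)
print(f"relative recovery error: {err:.2e}")
```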


Bayesian Probabilistic Co-Subspace Addition

Neural Information Processing Systems

For modeling data matrices, this paper introduces the Probabilistic Co-Subspace Addition (PCSA) model, which simultaneously captures the dependency structures among both rows and columns. Briefly, PCSA assumes that each entry of a matrix is generated by the additive combination of linear mappings of two features, which lie in the row-wise and column-wise latent subspaces. Consequently, it captures the dependencies among entries intricately and is able to model non-Gaussian and heteroscedastic densities. Variational inference is proposed for approximate Bayesian learning in PCSA, where updating the posteriors reduces to solving Sylvester equations. Furthermore, PCSA is extended to handle and fill in missing values, to adapt its sparseness, and to model tensor data.
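Since the variational updates reduce to Sylvester equations, each such step can lean on a standard solver. The sketch below (with illustrative shapes and matrices, not PCSA's actual update coefficients) verifies that SciPy's `solve_sylvester` returns an X satisfying A X + X B = C.

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(0)
# Posterior updates of this flavor take the Sylvester form A X + X B = C;
# the shapes and matrices below are purely illustrative.
A = rng.normal(size=(4, 4)); A = A @ A.T + 4.0 * np.eye(4)
B = rng.normal(size=(6, 6)); B = B @ B.T + 4.0 * np.eye(6)
C = rng.normal(size=(4, 6))

X = solve_sylvester(A, B, C)            # Bartels-Stewart under the hood
print(np.allclose(A @ X + X @ B, C))    # True: X satisfies the equation
```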


Faster AutoAugment: Learning Augmentation Strategies using Backpropagation

#artificialintelligence

Data augmentation methods are indispensable heuristics for boosting the performance of deep neural networks, especially in image recognition tasks. Recently, several studies have shown that augmentation strategies found by search algorithms outperform hand-crafted ones. Such methods employ black-box search algorithms over image transformations with continuous or discrete parameters and require a long time to find good strategies. In this paper, we propose a differentiable policy search pipeline for data augmentation that is much faster than previous methods. We introduce approximate gradients for several transformation operations with discrete parameters, as well as a differentiable mechanism for selecting operations.
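One common way to obtain approximate gradients for a transformation with discrete parameters is the straight-through estimator; the PyTorch sketch below applies it to a posterize-like operation so a continuous magnitude parameter receives a gradient signal. This is a generic illustration of the idea, not Faster AutoAugment's exact relaxation, and all names and constants are assumptions.

```python
import torch

def straight_through_round(x):
    """Round in the forward pass; pass gradients through as the identity."""
    return x + (torch.round(x) - x).detach()

magnitude = torch.tensor(3.7, requires_grad=True)  # continuous, learnable
img = torch.rand(1, 3, 8, 8)                       # dummy image batch

bits = straight_through_round(magnitude).clamp(1.0, 8.0)  # discrete in forward
scale = 2.0 ** (8.0 - bits)                 # quantization step implied by bits
posterized = torch.floor(img * 255.0 / scale) * scale / 255.0

loss = posterized.mean()                    # stand-in for a downstream loss
loss.backward()
print(magnitude.grad)                       # nonzero: the discrete knob learns
```

Here the gradient reaches `magnitude` through `scale`, with the straight-through round bridging the discrete parameter; the floor itself stays non-differentiable.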


3 Ways Machine Learning Is Transforming CRMs for the Better

#artificialintelligence

Making decisions based on your collected data is crucial to providing an excellent customer experience and keeping your customers happy. On that note, you've probably heard that machine learning and AI are the future of analytics. If you've ever been confused about what machine learning is and what it can do for you, we're here to help by explaining three different ways ML is transforming CRMs to better help businesses leverage their data. At its core, machine learning is a category of AI that aims to extract knowledge or patterns from a series of observations. Think of it like a person who looks at collected data and provides you with insight and analytics based on the data, while also learning and adapting with each set of data.