AITopics

We provide provably privacy-preserving versions of belief propagation, Gibbs sampling, and other local algorithms -- distributed multiparty protocols in which each party or vertex learns only its final local value, and absolutely nothing else.

compute, neighbor, protocol

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Hutter, Marcus, Legg, Shane

Temporal Difference Updating without a Learning Rate

We derive an equation for temporal difference learning from statistical principles. Specifically, we start with the variational principle and then bootstrap to produce an updating rule for discounted state value estimates. The resulting equation is similar to the standard equation for temporal difference learning with eligibility traces, so-called TD(λ); however, it lacks the parameter α that specifies the learning rate. In place of this free parameter there is now an equation for the learning rate that is specific to each state transition. We experimentally test this new learning rule against TD(λ) and find that it offers superior performance in various settings. Finally, we make some preliminary investigations into how to extend our new temporal difference algorithm to reinforcement learning. To do this we combine our update equation with both Watkins' Q(λ) and Sarsa(λ) and find that it again offers superior performance without a learning rate parameter.
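For orientation, the baseline that the abstract compares against, standard tabular TD(λ) with accumulating eligibility traces and a fixed learning rate α, can be sketched as follows. This is not the authors' learning-rate-free rule, which the abstract does not spell out. The function name `td_lambda_episode`, the transition format, and the default parameter values are assumptions made for illustration only.

```python
def td_lambda_episode(values, transitions, alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one episode of standard tabular TD(lambda) and return new values.

    values      : list of state-value estimates, indexed by state id
    transitions : list of (state, reward, next_state); next_state is None
                  when the episode terminates (hypothetical format)
    alpha       : the fixed learning rate that the paper's rule removes
    """
    new_values = list(values)
    traces = [0.0] * len(values)  # accumulating eligibility traces
    for state, reward, next_state in transitions:
        next_value = 0.0 if next_state is None else new_values[next_state]
        # One-step temporal difference error.
        td_error = reward + gamma * next_value - new_values[state]
        # Mark the visited state as eligible for credit.
        traces[state] += 1.0
        for s in range(len(new_values)):
            # Every state moves in proportion to its trace.
            new_values[s] += alpha * td_error * traces[s]
            # Traces then decay by gamma * lambda.
            traces[s] *= gamma * lam
    return new_values


# A three-state chain with a reward of 1 on the final transition.
episode = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
full_credit = td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, gamma=1.0, lam=1.0)
one_step = td_lambda_episode([0.0, 0.0, 0.0], episode, alpha=0.5, gamma=1.0, lam=0.0)
```

With λ = 1 the final reward is credited to every state visited in the episode, while with λ = 0 only the last state is updated. In both cases the step size is the hand-chosen α, which is the parameter the paper replaces with a per-transition equation.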

markov process, sarsa, temporal difference

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Switzerland (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Huang, Jonathan, Guestrin, Carlos, Guibas, Leonidas J.

Efficient Inference for Distributions on Permutations

Permutations are ubiquitous in many real-world problems, such as voting, rankings and data association. Representing uncertainty over permutations is challenging, since there are n! possibilities, and typical compact representations such as graphical models cannot efficiently capture the mutual exclusivity constraints associated with permutations. In this paper, we use the "low-frequency" terms of a Fourier decomposition to represent such distributions compactly.

coefficient, fourier transform, representation

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Howard, Andrew, Jebara, Tony

Learning Monotonic Transformations for Classification

A discriminative method is proposed for learning monotonic transformations of the training data while jointly estimating a large-margin classifier. In many domains such as document classification, image histogram classification and gene microarray experiments, fixed monotonic transformations can be useful as a preprocessing step. However, most classifiers only explore these transformations through manual trial and error or via prior domain knowledge. The proposed method learns monotonic transformations automatically while training a large-margin classifier without any prior knowledge of the domain. A monotonic piecewise linear function is learned which transforms data for subsequent processing by a linear hyperplane classifier. Two algorithmic implementations of the method are formalized. The first solves a convergent alternating sequence of quadratic and linear programs until it obtains a locally optimal solution. An improved algorithm is then derived using a convex semidefinite relaxation that overcomes initialization issues in the greedy optimization problem. The effectiveness of these learned transformations on synthetic problems, text data and image data is demonstrated.

convex relaxation, relaxation, transformation

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.33)

Hoffman, Matthew, Doucet, Arnaud, Freitas, Nando D., Jasra, Ajay

Bayesian Policy Learning with Trans-Dimensional MCMC

A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. In this paper, we begin by making the crucial observation that the stochastic control problem can be reinterpreted as one of trans-dimensional inference. With this new interpretation, we are able to propose a novel reversible jump Markov chain Monte Carlo (MCMC) algorithm that is more efficient than its EM counterparts. Moreover, it enables us to implement full Bayesian policy search, without the need for gradients and with one single Markov chain. The new approach involves sampling directly from a distribution that is proportional to the reward and, consequently, performs better than classic simulation methods in situations where the reward is a rare event.

algorithm, mcmc, probability distribution

Country:

North America > Canada > British Columbia (0.05)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

He, Jingrui, Carbonell, Jaime G.

Nearest-Neighbor-Based Active Learning for Rare Category Detection

Rare category detection is an open challenge for active learning, especially in the de-novo case (no labeled examples), but of significant practical importance for data mining - e.g.

local density, majority class, minority class

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.42)

Gretton, Arthur, Fukumizu, Kenji, Teo, Choon H., Song, Le, Schölkopf, Bernhard, Smola, Alex J.

A Kernel Statistical Test of Independence

Although kernel measures of independence have been widely applied in machine learning (notably in kernel ICA), there is as yet no method to determine whether they have detected statistically significant dependence. We provide a novel test of the independence hypothesis for one particular kernel independence measure, the Hilbert-Schmidt independence criterion (HSIC).
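As a rough illustration of the quantity being tested, the biased empirical HSIC estimate is commonly written as trace(KHLH)/n², where K and L are kernel Gram matrices of the two samples and H = I - (1/n)11ᵀ is the centering matrix. The sketch below computes only that statistic. It does not implement the significance test the abstract refers to. The Gaussian kernel, the scalar-valued samples, the bandwidth default, and all function names are assumptions made for illustration.

```python
import math


def gaussian_gram(samples, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel on scalar samples (assumed kernel)."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in samples]
        for a in samples
    ]


def center(gram):
    """Return H G H for a symmetric Gram matrix G, with H = I - (1/n) 11^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # Symmetry of the Gram matrix means column means equal row means.
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic_biased(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC, trace(K H L H) / n^2, for paired scalar samples."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of samples")
    n = len(xs)
    kc = center(gaussian_gram(xs, bandwidth))
    lc = center(gaussian_gram(ys, bandwidth))
    # trace(K H L H) equals the elementwise inner product of the centered matrices.
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / n ** 2


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
dependent = hsic_biased(xs, xs)            # y is a copy of x
constant = hsic_biased(xs, [7.0] * 5)      # y carries no information about x
```

A constant second sample yields a statistic of zero, while an identical second sample yields a strictly positive value. Deciding whether a given positive value is statistically significant is the problem the paper addresses.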

dependence, independence, kernel

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York (0.05)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Ganchev, Kuzman, Taskar, Ben, Gama, João

Expectation Maximization and Posterior Constraints

The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model are not observed. Very often, however, our aim is primarily to find a model that assigns values to the latent variables that have intended meaning for our data, and maximizing expected likelihood only sometimes accomplishes this. Unfortunately, it is typically difficult to add even simple a priori information about latent variables in graphical models without making the models overly complex or intractable. In this paper, we present an efficient, principled way to inject rich constraints on the posteriors of latent variables into the EM algorithm. Our method can be used to learn tractable graphical models that satisfy additional, otherwise intractable constraints. Focusing on clustering and the alignment problem for statistical machine translation, we show that simple, intuitive posterior constraints can greatly improve the performance over standard baselines and be competitive with more complex, intractable models.

alignment, constraint, posterior

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Gerwinn, Sebastian, Bethge, Matthias, Macke, Jakob H., Seeger, Matthias

Bayesian Inference for Spiking Neuron Models with a Sparsity Prior

Generalized linear models are the most commonly used tools to describe the stimulus selectivity of sensory neurons. Here we present a Bayesian treatment of such models. Using the expectation propagation algorithm, we are able to approximate the full posterior distribution over all weights. In addition, we use a Laplacian prior to favor sparse solutions. Therefore, stimulus features that do not critically influence neural activity will be assigned zero weights and thus be effectively excluded by the model.

glm neuron model, neuron, neuron model

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
North America > United States > California (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.96)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Fukumizu, Kenji, Gretton, Arthur, Sun, Xiaohai, Schölkopf, Bernhard

Kernel Measures of Conditional Dependence

We propose a new measure of conditional dependence of random variables, based on normalized cross-covariance operators on reproducing kernel Hilbert spaces. Unlike previous kernel dependence measures, the proposed criterion does not depend on the choice of kernel in the limit of infinite data, for a wide class of kernels. At the same time, it has a straightforward empirical estimate with good convergence behaviour. We discuss the theoretical properties of the measure, and demonstrate its application in experiments.

independence, kernel, operator

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)