Minimum Probability Flow Learning

arXiv.org Machine Learning

Fitting probabilistic models to data is often difficult, due to the general intractability of the partition function and its derivatives. Here we propose a new parameter estimation technique that does not require computing an intractable normalization factor or sampling from the equilibrium distribution of the model. This is achieved by establishing dynamics that would transform the observed data distribution into the model distribution, and then setting as the objective the minimization of the KL divergence between the data distribution and the distribution produced by running the dynamics for an infinitesimal time. Score matching, minimum velocity learning, and certain forms of contrastive divergence are shown to be special cases of this learning technique. We demonstrate parameter estimation in Ising models, deep belief networks, and an independent component analysis model of natural scenes. In the Ising model case, current state-of-the-art techniques are outperformed by at least an order of magnitude in learning time, with lower error in the recovered coupling parameters.
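The core computation is easy to state for the Ising case mentioned above: under single-bit-flip connectivity, the minimum probability flow objective averages, over data states, the quantity exp((E(x) - E(x'))/2) summed over one-flip neighbours x'. The following is a minimal sketch of that objective, assuming an Ising energy E(x) = -0.5 x^T J x with spins in {-1, +1} and no bias terms; the function names are illustrative rather than the authors' code, and the usual refinement of excluding neighbours that are themselves data states is omitted.

```python
import numpy as np

def ising_energy(x, J):
    # E(x) = -0.5 * x^T J x for a spin vector x in {-1, +1}^d (no bias terms)
    return -0.5 * x @ J @ x

def mpf_objective(J, data):
    """Single-bit-flip MPF objective: average over data states x of the sum,
    over all one-flip neighbours x', of exp((E(x) - E(x')) / 2).
    `data` is an iterable of +/-1 NumPy vectors."""
    K = 0.0
    for x in data:
        Ex = ising_energy(x, J)
        for i in range(len(x)):
            x_flip = x.copy()
            x_flip[i] = -x_flip[i]
            K += np.exp(0.5 * (Ex - ising_energy(x_flip, J)))
    return K / len(data)

# Minimizing mpf_objective over the free entries of a symmetric J (e.g. with
# scipy.optimize.minimize) estimates the couplings without ever touching the
# partition function.
```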


Proof System for Plan Verification under 0-Approximation Semantics

arXiv.org Artificial Intelligence

In this paper, a proof system is developed for plan verification problems $\{X\}c\{Y\}$ and $\{X\}c\{KW p\}$ under 0-approximation semantics for ${\mathcal A}_K$. Here, for a plan $c$, two sets $X,Y$ of fluent literals, and a literal $p$, $\{X\}c\{Y\}$ (resp. $\{X\}c\{KW p\}$) means that all literals of $Y$ become true (resp. $p$ becomes known) after executing $c$ in any initial state in which all literals in $X$ are true. Soundness and completeness of the system are then proved. The proof system supports both verifying plans and generating them.
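To make the two judgment forms concrete, here is a purely illustrative pair of triples, assuming a hypothetical action $\mathit{set\_q}$ whose only effect is to make $q$ true and a hypothetical sensing action $\mathit{sense\_p}$ whose only effect is to determine the truth value of $p$ (neither action is from the paper):

```latex
\{\,r\,\}\; [\,\mathit{set\_q}\,]\; \{\,q\,\}
\qquad
\{\,r\,\}\; [\,\mathit{sense\_p}\,]\; \{\,KW\,p\,\}
```

The first asserts that $q$ holds after executing the plan from any initial state satisfying $r$; the second asserts that the truth value of $p$ is known afterwards.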


Continuous Strategy Replicator Dynamics for Multi-Agent Learning

arXiv.org Artificial Intelligence

The problem of multi-agent learning and adaptation has attracted a great deal of attention in recent years. It has been suggested that the dynamics of multi-agent learning can be studied using replicator equations from population biology. Most existing studies have been limited to discrete strategy spaces with a small number of available actions. In many cases, however, the choices available to agents are better characterized by continuous spectra. This paper suggests a generalization of the replicator framework that makes it possible to study the adaptive dynamics of Q-learning agents with continuous strategy spaces. Instead of probability vectors, agents' strategies are now characterized by probability measures over continuous variables. As a result, the ordinary differential equations of the discrete case are replaced by a system of coupled integro-differential replicator equations that describe the mutual evolution of individual agent strategies. We derive a set of functional equations describing the steady state of the replicator dynamics, examine their solutions for several two-player games, and confirm our analytical results using simulations.
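For reference, the discrete object being generalized is the standard two-population replicator equation; the continuous-strategy version replaces the probability vector with a density and the sums with integrals. The schematic below shows only that structure; the exact equations in the paper, which also carry the Q-learning exploration terms, may differ.

```latex
% Discrete strategies: payoff matrix A, own mixed strategy x, opponent's y
\dot{x}_i \;=\; x_i\Big[(Ay)_i - x^\top A y\Big].

% Continuous strategies: density p(u,t) over own strategies, q(v) over the
% opponent's, payoff kernel A(u,v); sums become integrals (schematic form)
\frac{\partial p(u,t)}{\partial t}
  \;=\; p(u,t)\Big[\int A(u,v)\,q(v)\,dv
        \;-\; \iint p(u',t)\,A(u',v)\,q(v)\,du'\,dv\Big].
```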


Sparse Choice Models

arXiv.org Machine Learning

Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed as a distribution over permutations, must be learned from observed data. The observed data, in turn, may frequently be viewed as (partial, noisy) information about marginals of this distribution over permutations. As such, the search for an appropriate choice model boils down to learning a distribution over permutations that is (near-)consistent with observed information about this distribution. In this work, we pursue a non-parametric approach that seeks to learn a choice model (i.e. a distribution over permutations) with the {\em sparsest} possible support that is consistent with the observed data. We assume that the observed data consist of noisy information pertaining to the marginals of the choice model we seek to learn. We establish that {\em any} choice model admits a `very' sparse approximation, in the sense that there exists a choice model whose support is small relative to the dimension of the observed data and whose marginals approximately agree with the observed marginal information. We further show that under what we dub `signature' conditions, such a sparse approximation can be found in a computationally efficient fashion relative to a brute-force approach. An empirical study using the American Psychological Association election data set suggests that the sparse approximation our approach finds unearths useful structural properties of the underlying choice model. Our results further suggest that the signature condition is a potential alternative to the recently popularized Restricted Null Space condition for efficient recovery of sparse models.
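The objects involved are easy to picture: a sparse choice model is a distribution over permutations supported on a handful of rankings, and the observed data are (noisy) marginals of it. A minimal sketch, with made-up numbers purely for illustration, of the first-order marginals M[i, j] = P(item i is ranked at position j):

```python
import numpy as np

def first_order_marginals(model, n):
    """First-order marginals M[i, j] = P(item i is placed at position j) of a
    distribution over permutations given as {permutation tuple: probability}."""
    M = np.zeros((n, n))
    for perm, prob in model.items():
        for pos, item in enumerate(perm):
            M[item, pos] += prob
    return M

# A choice model over 4 items supported on just 2 of the 4! = 24 permutations:
sparse_model = {(0, 1, 2, 3): 0.7, (2, 0, 3, 1): 0.3}
print(first_order_marginals(sparse_model, 4))
```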


Tree Exploration for Bayesian RL Exploration

arXiv.org Machine Learning

Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall into two main types. The first employs a Bayesian framework, where optimality improves with increased computational time. This is because the resulting planning task takes the form of a dynamic programming problem on a belief tree with an infinite number of states. The second type employs relatively simple algorithms that are shown to suffer only small regret within a distribution-free framework. This paper presents a lower bound and a high-probability upper bound on the optimal value function for the nodes in the Bayesian belief tree, which are analogous to similar bounds in POMDPs. The bounds are then used to create more efficient strategies for exploring the tree. The resulting algorithms are compared with the distribution-free algorithm UCB1, as well as a simpler baseline algorithm, on multi-armed bandit problems.
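UCB1, the distribution-free comparison point named above, is standard and short enough to sketch; the bandit below (Bernoulli arms with made-up means) is purely illustrative, and the Bayesian belief-tree algorithms themselves are not shown.

```python
import numpy as np

def ucb1(pull, n_arms, horizon):
    """UCB1: play each arm once, then repeatedly pull the arm maximizing
    empirical mean + sqrt(2 * ln(t) / n_i)."""
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                     # initialization: one pull per arm
        else:
            bonus = np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(sums / counts + bonus))
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return sums.sum()

# Example: three Bernoulli arms with (hypothetical) means 0.2, 0.5, 0.8
means = [0.2, 0.5, 0.8]
total_reward = ucb1(lambda a: float(np.random.rand() < means[a]), 3, 10_000)
```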


Mixtures of conditional Gaussian scale mixtures applied to multiscale image representations

arXiv.org Machine Learning

We present a probabilistic model for natural images which is based on Gaussian scale mixtures and a simple multiscale representation. In contrast to the dominant approach to modeling whole images, which focuses on Markov random fields, we formulate our model as a directed graphical model. We show that it is able to generate images with interesting higher-order correlations when trained on natural images or samples from an occlusion-based model. More importantly, the directed model enables us to perform a principled evaluation. While it is easy to generate visually appealing images, we demonstrate that our model also yields the best performance reported to date when evaluated with respect to the cross-entropy rate, a measure tightly linked to the average log-likelihood.
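The basic building block is a Gaussian scale mixture: a Gaussian vector multiplied by an independent positive scale variable, which produces the heavy tails typical of multiscale image coefficients. A minimal sampling sketch, assuming a log-normal scale variable purely as one common choice (the paper's conditional mixtures and multiscale wiring are not shown):

```python
import numpy as np

def sample_gsm(n, cov, rng=None):
    """Draw n samples from a Gaussian scale mixture x = sqrt(z) * g with
    g ~ N(0, cov) and z an independent positive scale (log-normal here)."""
    rng = rng or np.random.default_rng(0)
    g = rng.multivariate_normal(np.zeros(len(cov)), cov, size=n)
    z = np.exp(rng.normal(0.0, 1.0, size=n))   # positive scale variable
    return np.sqrt(z)[:, None] * g

# Example: heavy-tailed 2-D coefficients with a fixed correlation structure
samples = sample_gsm(1000, np.array([[1.0, 0.6], [0.6, 1.0]]))
```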


Online Robust Subspace Tracking from Partial Information

arXiv.org Machine Learning

This paper presents GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm), an efficient and robust online algorithm for tracking subspaces from highly incomplete information. The algorithm uses a robust $l^1$-norm cost function in order to estimate and track non-stationary subspaces when the streaming data vectors are corrupted with outliers. We apply GRASTA to the problems of robust matrix completion and real-time separation of background from foreground in video. In this second application, we show that GRASTA performs high-quality separation of moving objects from background at exceptional speeds: In one popular benchmark video example, GRASTA achieves a rate of 57 frames per second, even when run in MATLAB on a personal laptop.
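The flavour of the per-vector computation is a robust regression on only the observed coordinates: given the current subspace basis restricted to the observed entries, find weights minimizing an l1 residual. The sketch below uses plain iteratively reweighted least squares for that inner fit; it is not GRASTA's ADMM solver or its Grassmannian subspace update, just the robust cost the abstract refers to, with illustrative names.

```python
import numpy as np

def robust_weights(U_obs, v_obs, iters=50, eps=1e-6):
    """Approximately minimize ||U_obs @ w - v_obs||_1 over w via iteratively
    reweighted least squares, where U_obs holds the rows of the subspace basis
    at the observed entries and v_obs the observed values."""
    w = np.linalg.lstsq(U_obs, v_obs, rcond=None)[0]      # l2 warm start
    for _ in range(iters):
        r = np.abs(U_obs @ w - v_obs) + eps               # residual magnitudes
        sw = 1.0 / np.sqrt(r)                             # l1 reweighting
        w = np.linalg.lstsq(U_obs * sw[:, None], v_obs * sw, rcond=None)[0]
    return w
```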


VC dimension of ellipsoids

arXiv.org Machine Learning

We establish that the VC dimension of the class of d-dimensional ellipsoids is (d^2+3d)/2, and that maximum likelihood estimation with N-component d-dimensional Gaussian mixture models induces a geometric class having VC dimension at least N(d^2+3d)/2. Keywords: VC dimension; finite-dimensional ellipsoid; Gaussian mixture model
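As a plausibility check (not the paper's proof), the stated value matches the number of free parameters of an ellipsoid written with a symmetric matrix and a center:

```latex
\{x \in \mathbb{R}^d : (x-c)^\top A (x-c) \le 1\}, \qquad A = A^\top
\quad\Longrightarrow\quad
\underbrace{\tfrac{d(d+1)}{2}}_{\text{entries of } A}
+ \underbrace{d}_{\text{center } c}
= \frac{d^2 + 3d}{2}.
```

For d = 2, this gives a VC dimension of 5 for planar ellipses.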


Learning Discriminative Metrics via Generative Models and Kernel Learning

arXiv.org Machine Learning

Metrics specifying distances between data points can be learned in a discriminative manner or from generative models. In this paper, we show how to unify generative and discriminative learning of metrics via a kernel learning framework. Specifically, we learn local metrics optimized from parametric generative models. These are then used as base kernels to construct a global kernel that minimizes a discriminative training criterion. We consider both linear and nonlinear combinations of local metric kernels. Our empirical results show that these combinations significantly improve performance on classification tasks. The proposed learning algorithm is also very efficient, achieving an order-of-magnitude speedup in training time compared to previous discriminative baseline methods.
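A minimal sketch of the ingredients, assuming Gaussian base kernels parameterized by (locally estimated) metric matrices and a convex combination of them fed to a precomputed-kernel SVM; the weights here would be picked on a validation split rather than by the paper's training criterion, and all names are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def metric_kernel(X, Z, M, gamma=1.0):
    """Gaussian kernel under a metric M: k(x, z) = exp(-gamma (x-z)^T M (x-z))."""
    d = X[:, None, :] - Z[None, :, :]
    return np.exp(-gamma * np.einsum('ijk,kl,ijl->ij', d, M, d))

def combined_kernel(X, Z, metrics, weights):
    # Linear (convex) combination of the local-metric base kernels
    return sum(w * metric_kernel(X, Z, M) for w, M in zip(weights, metrics))

# Usage sketch: train an SVM on the combined Gram matrix and score candidate
# weight vectors on held-out data.
# clf = SVC(kernel='precomputed').fit(combined_kernel(Xtr, Xtr, Ms, w), ytr)
# acc = clf.score(combined_kernel(Xval, Xtr, Ms, w), yval)
```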


Convex and Network Flow Optimization for Structured Sparsity

arXiv.org Machine Learning

We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of l_2- or l_infinity-norms over groups of variables. Whereas much effort has been put into developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: on the one hand, we show that the proximal operator associated with a sum of l_infinity-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.
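For orientation, the proximal-gradient scaffolding the first strategy plugs into looks as follows in the much simpler case of disjoint l_2 groups, where the proximal operator is just block soft-thresholding; the overlapping and l_infinity cases in the paper replace that prox with the min-cost-flow or splitting machinery. A minimal sketch with illustrative names:

```python
import numpy as np

def prox_group_l2(x, groups, lam):
    """Block soft-thresholding: prox of lam * sum_g ||x_g||_2 for disjoint
    groups (each group is an array of indices into x)."""
    out = x.copy()
    for g in groups:
        norm = np.linalg.norm(x[g])
        out[g] = 0.0 if norm <= lam else (1.0 - lam / norm) * x[g]
    return out

def proximal_gradient(A, b, groups, lam, step, iters=200):
    # Minimize 0.5 * ||A x - b||^2 + lam * sum_g ||x_g||_2 with ISTA-style steps
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = prox_group_l2(x - step * grad, groups, lam)
    return x
```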