artificial intelligence

Proof System for Plan Verification under 0-Approximation Semantics

In this paper a proof system is developed for plan verification problems $\{X\}c\{Y\}$ and $\{X\}c\{KW p\}$ under 0-approximation semantics for ${\mathcal A}_K$. Here, for a plan $c$, two sets $X,Y$ of fluent literals, and a literal $p$, $\{X\}c\{Y\}$ (resp. $\{X\}c\{KW p\}$) means that all literals of $Y$ become true (resp. $p$ becomes known) after executing $c$ in any initial state in which all literals in $X$ are true.Then, soundness and completeness are proved. The proof system allows verifying plans and generating plans as well.

Continuous Strategy Replicator Dynamics for Multi--Agent Learning

The problem of multi-agent learning and adaptation has attracted a great deal of attention in recent years. It has been suggested that the dynamics of multi agent learning can be studied using replicator equations from population biology. Most existing studies so far have been limited to discrete strategy spaces with a small number of available actions. In many cases, however, the choices available to agents are better characterized by continuous spectra. This paper suggests a generalization of the replicator framework that allows to study the adaptive dynamics of Q-learning agents with continuous strategy spaces. Instead of probability vectors, agents strategies are now characterized by probability measures over continuous variables. As a result, the ordinary differential equations for the discrete case are replaced by a system of coupled integral--differential replicator equations that describe the mutual evolution of individual agent strategies. We derive a set of functional equations describing the steady state of the replicator dynamics, examine their solutions for several two-player games, and confirm our analytical results using simulations.

Sparse Choice Models

Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed as a distribution over permutations, must be learned from observed data. The observed data, in turn, may frequently be viewed as (partial, noisy) information about marginals of this distribution over permutations. As such, the search for an appropriate choice model boils down to learning a distribution over permutations that is (near-)consistent with observed information about this distribution. In this work, we pursue a non-parametric approach which seeks to learn a choice model (i.e. a distribution over permutations) with {\em sparsest} possible support, and consistent with observed data. We assume that the data observed consists of noisy information pertaining to the marginals of the choice model we seek to learn. We establish that {\em any} choice model admits a very' sparse approximation in the sense that there exists a choice model whose support is small relative to the dimension of the observed data and whose marginals approximately agree with the observed marginal information. We further show that under, what we dub, signature' conditions, such a sparse approximation can be found in a computationally efficiently fashion relative to a brute force approach. An empirical study using the American Psychological Association election data-set suggests that our approach manages to unearth useful structural properties of the underlying choice model using the sparse approximation found. Our results further suggest that the signature condition is a potential alternative to the recently popularized Restricted Null Space condition for efficient recovery of sparse models.

Tree Exploration for Bayesian RL Exploration

Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The first employs a Bayesian framework, where optimality improves with increased computational time. This is because the resulting planning task takes the form of a dynamic programming problem on a belief tree with an infinite number of states. The second type employs relatively simple algorithm which are shown to suffer small regret within a distribution-free framework. This paper presents a lower bound and a high probability upper bound on the optimal value function for the nodes in the Bayesian belief tree, which are analogous to similar bounds in POMDPs. The bounds are then used to create more efficient strategies for exploring the tree. The resulting algorithms are compared with the distribution-free algorithm UCB1, as well as a simpler baseline algorithm on multi-armed bandit problems.

VC dimension of ellipsoids

We will establish that the VC dimension of the class of d-dimensional ellipsoids is (d^2+3d)/2, and that maximum likelihood estimate with N-component d-dimensional Gaussian mixture models induces a geometric class having VC dimension at least N(d^2+3d)/2. Keywords: VC dimension; finite dimensional ellipsoid; Gaussian mixture model

Online Robust Subspace Tracking from Partial Information

This paper presents GRASTA (Grassmannian Robust Adaptive Subspace Tracking Algorithm), an efficient and robust online algorithm for tracking subspaces from highly incomplete information. The algorithm uses a robust $l^1$-norm cost function in order to estimate and track non-stationary subspaces when the streaming data vectors are corrupted with outliers. We apply GRASTA to the problems of robust matrix completion and real-time separation of background from foreground in video. In this second application, we show that GRASTA performs high-quality separation of moving objects from background at exceptional speeds: In one popular benchmark video example, GRASTA achieves a rate of 57 frames per second, even when run in MATLAB on a personal laptop.

Mixtures of conditional Gaussian scale mixtures applied to multiscale image representations

We present a probabilistic model for natural images which is based on Gaussian scale mixtures and a simple multiscale representation. In contrast to the dominant approach to modeling whole images focusing on Markov random fields, we formulate our model in terms of a directed graphical model. We show that it is able to generate images with interesting higher-order correlations when trained on natural images or samples from an occlusion based model. More importantly, the directed model enables us to perform a principled evaluation. While it is easy to generate visually appealing images, we demonstrate that our model also yields the best performance reported to date when evaluated with respect to the cross-entropy rate, a measure tightly linked to the average log-likelihood.

Learning Discriminative Metrics via Generative Models and Kernel Learning

Metrics specifying distances between data points can be learned in a discriminative manner or from generative models. In this paper, we show how to unify generative and discriminative learning of metrics via a kernel learning framework. Specifically, we learn local metrics optimized from parametric generative models. These are then used as base kernels to construct a global kernel that minimizes a discriminative training criterion. We consider both linear and nonlinear combinations of local metric kernels. Our empirical results show that these combinations significantly improve performance on classification tasks. The proposed learning algorithm is also very efficient, achieving order of magnitude speedup in training time compared to previous discriminative baseline methods.

Differentially Private Online Learning

In this paper, we consider the problem of preserving privacy in the online learning setting. We study the problem in the online convex programming (OCP) framework---a popular online learning setting with several interesting theoretical and practical implications---while using differential privacy as the formal privacy measure. For this problem, we distill two critical attributes that a private OCP algorithm should have in order to provide reasonable privacy as well as utility guarantees: 1) linearly decreasing sensitivity, i.e., as new data points arrive their effect on the learning model decreases, 2) sub-linear regret bound---regret bound is a popular goodness/utility measure of an online learning algorithm. Given an OCP algorithm that satisfies these two conditions, we provide a general framework to convert the given algorithm into a privacy preserving OCP algorithm with good (sub-linear) regret. We then illustrate our approach by converting two popular online learning algorithms into their differentially private variants while guaranteeing sub-linear regret ($O(\sqrt{T})$). Next, we consider the special case of online linear regression problems, a practically important class of online learning problems, for which we generalize an approach by Dwork et al. to provide a differentially private algorithm with just $O(\log^{1.5} T)$ regret. Finally, we show that our online learning framework can be used to provide differentially private algorithms for offline learning as well. For the offline learning problem, our approach obtains better error bounds as well as can handle larger class of problems than the existing state-of-the-art methods Chaudhuri et al.

Convex and Network Flow Optimization for Structured Sparsity

We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of l_2- or l_infinity-norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: On the one hand, we show that the proximal operator associated with a sum of l_infinity-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques, and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.