TD_gamma: Re-evaluating Complex Backups in Temporal Difference Learning
Konidaris, George, Niekum, Scott, Thomas, Philip S.
We show that the lambda-return target used in the TD(lambda) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We introduce the gamma-return estimator, an alternative target based on a more accurate model of variance, which defines the TD_gamma family of complex-backup temporal difference learning algorithms. We derive TD_gamma, the gamma-return equivalent of the original TD(lambda) algorithm, which eliminates the lambda parameter but can only perform updates at the end of an episode and requires time and space proportional to the episode length. We then derive a second algorithm, TD_gamma(C), with a capacity parameter C. TD_gamma(C) requires C times more time and memory than TD(lambda) and is incremental and online. We show that TD_gamma outperforms TD(lambda) for any setting of lambda on 4 out of 5 benchmark domains, and that TD_gamma(C) performs as well as or better than TD_gamma for intermediate settings of C.
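As a concrete illustration of the targets involved, the following is a minimal Python sketch (function names ours) of the n-step return and the lambda-return it is averaged into, assuming a completed episode with per-state value estimates values[t] and zero terminal value:

    def n_step_return(rewards, values, t, n, gamma):
        # Discounted sum of up to n rewards from time t, bootstrapped with
        # the value estimate at s_{t+n} if the episode has not yet ended.
        T = len(rewards)
        g = sum(gamma**k * rewards[t + k] for k in range(min(n, T - t)))
        if t + n < T:
            g += gamma**n * values[t + n]
        return g

    def lambda_return(rewards, values, t, gamma, lam):
        # Exponentially weighted average of n-step returns; the final
        # (Monte Carlo) return absorbs the residual weight lam**(T-t-1),
        # so the weights sum to one.
        T = len(rewards)
        g = 0.0
        for n in range(1, T - t + 1):
            w = (1 - lam) * lam**(n - 1) if t + n < T else lam**(T - t - 1)
            g += w * n_step_return(rewards, values, t, n, gamma)
        return g

The gamma-return of the paper replaces the exponential weights above with weights derived from its variance model; the sketch shows only the standard lambda-return baseline.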
Multiple Instance Filtering
Wnuk, Kamil A., Soatto, Stefano
We propose a robust filtering approach based on semi-supervised and multiple instance learning (MIL). We assume that the posterior density would be unimodal if not for the effect of outliers that we do not wish to explicitly model. Therefore, we seek a point estimate at the outset, rather than a generic approximation of the entire posterior. Our approach can be thought of as a combination of standard finite-dimensional filtering (the Extended Kalman Filter or the Unscented Filter) with multiple instance learning, whereby the initial condition comes with a putative set of inlier measurements. We show how both the state (regression) and the inlier set (classification) can be estimated iteratively and causally by processing only the current measurement. We illustrate our approach on visual tracking problems in which the object of interest (target) moves and evolves as a result of occlusions and deformations, and partial knowledge of the target is given in the form of a bounding box (training set).
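A simplified Python sketch of the flavor of filter described: a linear Kalman predict/update cycle in which each putative measurement is first classified as an inlier before it is allowed to update the state. The Mahalanobis gate below is a crude stand-in for the paper's MIL classifier, and all names and the gate threshold are ours:

    import numpy as np

    def filter_step(x, P, zs, F, H, Q, R, gate=9.0):
        # Predict with linear dynamics.
        x, P = F @ x, F @ P @ F.T + Q
        S_inv = np.linalg.inv(H @ P @ H.T + R)   # innovation covariance
        for z in zs:                             # putative measurement set
            v = z - H @ x                        # innovation
            if v @ S_inv @ v < gate:             # inlier test (classification)
                K = P @ H.T @ S_inv              # Kalman gain
                x = x + K @ v                    # state update (regression)
                P = (np.eye(len(x)) - K @ H) @ P
                S_inv = np.linalg.inv(H @ P @ H.T + R)
        return x, P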
Dimensionality Reduction Using the Sparse Linear Model
Gkioulekas, Ioannis A., Zickler, Todd
We propose an approach for linear unsupervised dimensionality reduction, based on the sparse linear model that has been used to probabilistically interpret sparse coding. We formulate an optimization problem for learning a linear projection from the original signal domain to a lower-dimensional one in a way that approximately preserves, in expectation, pairwise inner products in the sparse domain. We derive solutions to the problem, present nonlinear extensions, and discuss relations to compressed sensing. Our experiments using facial images, texture patches, and images of object categories suggest that the approach can improve our ability to recover meaningful structure in many classes of signals.
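One way to make the stated objective concrete is as a least-squares surrogate: choose a projection P so that the Gram matrix of projected signals matches the Gram matrix of their sparse codes. The sketch below is ours, not the paper's algorithm; it assumes a pre-learned dictionary D and uses scikit-learn's sparse_encode for the sparse coding step:

    import numpy as np
    from sklearn.decomposition import sparse_encode

    def learn_projection(X, D, d, alpha=0.1):
        # X: (n, m) signals as rows; D: (k, m) dictionary atoms as rows.
        A = sparse_encode(X, D, alpha=alpha)   # (n, k) sparse codes
        # M = C C^T minimizes ||X M X^T - A A^T||_F over all M:
        C = np.linalg.pinv(X) @ A              # (m, k)
        U, S, _ = np.linalg.svd(C, full_matrices=False)
        return (U[:, :d] * S[:d]).T            # top-d factor of M = P^T P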
Large-Scale Category Structure Aware Image Categorization
Zhao, Bin, Li, Fei, Xing, Eric P.
Most previous research on image categorization has focused on medium-scale data sets, while large-scale image categorization with millions of images from thousands of categories remains a challenge. With the emergence of structured large-scale datasets such as ImageNet, rich information about the conceptual relationships between images, such as a tree hierarchy among various image categories, becomes available. As human cognition of the complex visual world benefits from underlying semantic relationships between object classes, we believe a machine learning system can and should leverage such information as well for better performance. In this paper, we employ such semantic relatedness among image categories for large-scale image categorization. Specifically, a category hierarchy is utilized to properly define the loss function and select a common set of features for related categories. An efficient optimization method based on proximal approximation and an accelerated parallel gradient method is introduced. Experimental results on a subset of ImageNet containing 1.2 million images from 1000 categories demonstrate the effectiveness and promise of our proposed approach.
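To illustrate how a hierarchy can "properly define" a loss, here is a hedged Python sketch (ours) of one standard construction, a cost-sensitive multiclass hinge loss in which confusing categories that sit far apart in the tree costs more than confusing siblings:

    import numpy as np

    def tree_hinge_loss(scores, y, tree_dist):
        # scores: (K,) classifier scores; y: true label index;
        # tree_dist: (K, K) tree distances with zero diagonal.
        margins = scores - scores[y] + tree_dist[y]
        return margins.max()   # index y contributes 0, so loss >= 0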
Directed Graph Embedding: an Algorithm based on Continuous Limits of Laplacian-type Operators
Perrault-Joncas, Dominique C., Meila, Marina
This paper considers the problem of embedding directed graphs in Euclidean space while retaining directional information. We model the observed graph as a sample from a manifold endowed with a vector field, and we design an algorithm that separates and recovers the features of this process: the geometry of the manifold, the data density and the vector field. The algorithm is motivated by our analysis of Laplacian-type operators and their continuous limit as generators of diffusions on a manifold. We illustrate the recovery algorithm on both artificially constructed and real data.
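A loose spectral sketch in Python of the separation the abstract describes, with the symmetric part of the weights standing in for geometry and the antisymmetric part for the vector field; this is our simplification, not the paper's Laplacian-operator estimator:

    import numpy as np

    def directed_embedding(W, d, eps=1e-12):
        # W: (n, n) nonnegative, possibly asymmetric affinity matrix.
        S = 0.5 * (W + W.T)                     # geometry: symmetric part
        deg = S.sum(axis=1)
        P = S / (deg[:, None] + eps)            # random-walk normalization
        vals, vecs = np.linalg.eig(P)
        idx = np.argsort(-vals.real)[1:d + 1]   # skip the trivial eigenvector
        Y = vecs[:, idx].real * vals[idx].real  # diffusion-map coordinates
        A = 0.5 * (W - W.T)                     # directionality: antisym. part
        flow = (A / (deg[:, None] + eps)) @ Y   # crude per-node vector field
        return Y, flow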
Robust Multi-Class Gaussian Process Classification
Hernández-Lobato, Daniel, Hernández-Lobato, José M., Dupont, Pierre
Multi-class Gaussian Process Classifiers (MGPCs) are often affected by over-fitting problems when labeling errors occur far from the decision boundaries. To prevent this, we investigate a robust MGPC (RMGPC) which considers labeling errors independently of their distance to the decision boundaries. Expectation propagation is used for approximate inference. Experiments with several datasets in which noise is injected in the class labels illustrate the benefits of RMGPC. This method performs better than other Gaussian process alternatives based on considering latent Gaussian noise or heavy-tailed processes. When no noise is injected in the labels, RMGPC still performs as well as or better than the other methods. Finally, we show how RMGPC can be used to successfully identify data instances that are difficult to classify accurately in practice.
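The key modeling idea, labeling errors handled independently of distance to the boundary, can be written as a mixture likelihood in which the observed label is, with some probability, a uniform flip. A short Python sketch (ours; softmax stands in for the paper's likelihood, and the noise rate eps is an assumed parameter):

    import numpy as np

    def robust_likelihood(f, y, eps=0.05):
        # f: (K,) latent function values, one per class; y: observed label.
        p = np.exp(f - f.max())
        p /= p.sum()                     # softmax over the K latent values
        return (1 - eps) * p[y] + eps / len(f)   # eps: uniform label flip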
On U-processes and clustering performance
Many clustering techniques aim at optimizing empirical criteria that are of the form of a U-statistic of degree two. Given a measure of dissimilarity between pairs of observations, the goal is to minimize the within-cluster point scatter over a class of partitions of the feature space. It is the purpose of this paper to define a general statistical framework, relying on the theory of U-processes, for studying the performance of such clustering methods. In this setup, under adequate assumptions on the complexity of the subsets forming the partition candidates, the excess clustering risk is proved to be of the order O(1/\sqrt{n}). Based on recent results related to the tail behavior of degenerate U-processes, it is also shown how to establish tighter rate bounds. Model selection issues, in particular those related to the number of clusters forming the data partition, are also considered.
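For concreteness, the degree-two U-statistic referred to here can be written, for a partition P = {C_1, ..., C_K} and a dissimilarity D (notation ours):

    \widehat{W}_n(\mathcal{P}) = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} D(X_i, X_j) \sum_{k=1}^{K} \mathbb{1}\{(X_i, X_j) \in \mathcal{C}_k \times \mathcal{C}_k\}

Each pair of observations contributes exactly once, so the criterion is an average of a symmetric kernel over all pairs, which is what places it in the scope of U-process theory.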
Policy Gradient Coagent Networks
We present a novel class of actor-critic algorithms for actors consisting of sets of interacting modules. We present, analyze theoretically, and empirically evaluate an update rule for each module, which requires only local information: the module's input, output, and the TD error broadcast by a critic. Such updates are necessary when computation of compatible features becomes prohibitively difficult and are also desirable to increase the biological plausibility of reinforcement learning methods.
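The local rule described admits a very short sketch: each module adjusts its own parameters along its local score function, scaled by the globally broadcast TD error. In Python (names ours; the Bernoulli-logistic module is one illustrative choice, not the paper's architecture):

    import numpy as np

    def coagent_update(theta, x, a, delta, grad_log_pi, alpha=0.01):
        # Local policy-gradient step: only the module's input x, output a,
        # and the critic's TD error delta are needed.
        return theta + alpha * delta * grad_log_pi(theta, x, a)

    def bernoulli_grad_log_pi(theta, x, a):
        # Score function for pi(a=1 | x) = sigmoid(theta . x), a in {0, 1}.
        p = 1.0 / (1.0 + np.exp(-theta @ x))
        return (a - p) * x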
The Manifold Tangent Classifier
Rifai, Salah, Dauphin, Yann N., Vincent, Pascal, Bengio, Yoshua, Muller, Xavier
We combine three important ideas present in previous work for building classifiers: the semi-supervised hypothesis (the input distribution contains information about the classifier), the unsupervised manifold hypothesis (data density concentrates near low-dimensional manifolds), and the manifold hypothesis for classification (different classes correspond to disjoint manifolds separated by low density). We exploit a novel algorithm for capturing manifold structure (high-order contractive auto-encoders) and we show how it builds a topological atlas of charts, each chart being characterized by the principal singular vectors of the Jacobian of a representation mapping. This representation learning algorithm can be stacked to yield a deep architecture, and we combine it with a domain-knowledge-free version of the TangentProp algorithm to encourage the classifier to be insensitive to local direction changes along the manifold. Record-breaking classification results are obtained.
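A hedged Python sketch (names and the finite-difference step are ours) of the two mechanical pieces the abstract describes: reading tangent directions off the singular vectors of an encoder's Jacobian, and a TangentProp-style penalty on the classifier's directional derivatives along them:

    import numpy as np

    def tangents_from_encoder(encode, x, d, eps=1e-4):
        # Estimate the Jacobian of the encoder at x by finite differences;
        # its top right-singular vectors span the local tangent directions.
        h0 = encode(x)
        J = np.stack([(encode(x + eps * e) - h0) / eps
                      for e in np.eye(len(x))], axis=1)  # (dim_h, dim_x)
        _, _, Vt = np.linalg.svd(J, full_matrices=False)
        return Vt[:d]                                    # rows: tangents

    def tangent_penalty(jac_classifier, x, tangents):
        # Sum of squared directional derivatives of the classifier output
        # along the estimated tangents (insensitivity along the manifold).
        Jc = jac_classifier(x)                           # (K, dim_x)
        return float(((Jc @ tangents.T) ** 2).sum())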