AITopics

We describe and analyze an algorithmic framework for online classification where each online trial consists of multiple prediction tasks that are tied together. We tackle the problem of updating the online hypothesis by defining a projection problem in which each prediction task corresponds to a single linear constraint. These constraints are tied together through a single slack parameter. We then introduce a general method for approximately solving the problem by projecting simultaneously and independently on each constraint which corresponds to a prediction sub-problem, and then averaging the individual solutions. We show that this approach constitutes a feasible, albeit not necessarily optimal, solution for the original projection problem. We derive concrete simultaneous projection schemes and analyze them in the mistake bound model. We demonstrate the power of the proposed algorithm in experiments with online multiclass text categorization. Our experiments indicate that a combination of class-dependent features with the simultaneous projection method outperforms previously studied algorithms.

algorithm, constraint, optimization problem, (16 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Ambroladze, Amiran, Parrado-hernández, Emilio, Shawe-taylor, John S.

Tighter PAC-Bayes Bounds

This paper proposes a PAC-Bayes bound to measure the performance of Support Vector Machine (SVM) classifiers. The bound is based on learning a prior over the distribution of classifiers with a part of the training samples. Experimental work shows that this bound is tighter than the original PAC-Bayes, resulting in an enhancement of the predictive capabilities of the PAC-Bayes bound. In addition, it is shown that the use of this bound as a means to estimate the hyperparameters of the classifier compares favourably with cross validation in terms of accuracy of the model, while saving a lot of computational burden.

classifier, model selection, pac-bayes, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)

Davis, Jason V., Dhillon, Inderjit S.

Differential Entropic Clustering of Multivariate Gaussians

Gaussian data is pervasive and many learning algorithms (e.g., k-means) model their inputs as a single sample drawn from a multivariate Gaussian. However, in many real-life settings, each input object is best described by multiple samples drawn from a multivariate Gaussian. Such data can arise, for example, in a movie review database where each movie is rated by several users, or in time-series domains such as sensor networks. Here, each input can be naturally described by both a mean vector and covariance matrix which parameterize the Gaussian distribution. In this paper, we consider the problem of clustering such input objects, each represented as a multivariate Gaussian. We formulate the problem using an information theoretic approach and draw several interesting theoretical connections to Bregman divergences and also Bregman matrix divergences. We evaluate our method across several domains, including synthetic data, sensor network data, and a statistical debugging application.

algorithm, gaussian, multivariate gaussian, (13 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Middle East > Jordan (0.04)

Industry:

Media > Film (0.54)
Telecommunications (0.49)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)

Wainwright, Martin J., Lafferty, John D., Ravikumar, Pradeep K.

High-Dimensional Graphical Model Selection Using $\ell_1$-Regularized Logistic Regression

We focus on the problem of estimating the graph structure associated with a discrete Markov random field.

assumption, graphical model, regression, (14 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.52)
Research Report > Experimental Study (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Efficient Learning of Sparse Representations with an Energy-Based Model

Ranzato, Marc', aurelio, Poultney, Christopher, Chopra, Sumit, Cun, Yann L.

We describe a novel unsupervised method for learning sparse, overcomplete features. Themodel uses a linear encoder, and a linear decoder preceded by a sparsifying non-linearitythat turns a code vector into a quasi-binary sparse code vector. Given an input, the optimal code minimizes the distance between the output of the decoder and the input patch while being as similar as possible to the encoder output.Learning proceeds in a two-phase EMlike fashion: (1) compute the minimum-energy code vector, (2) adjust the parameters of the encoder and decoder soas to decrease the energy. The model produces "stroke detectors" when trained on handwritten numerals, and Gabor-like filters when trained on natural image patches. Inference and learning are very fast, requiring no preprocessing, and no expensive sampling. Using the proposed unsupervised method to initialize the first layer of a convolutional network, we achieved an error rate slightly lower than the best reported result on the MNIST dataset. Finally, an extension of the method is described to learn topographical filter maps.

code vector, decoder, sparsifying logistic, (13 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Learning to parse images of articulated bodies

Ramanan, Deva

We consider the machine vision task of pose estimation from static images, specifically forthe case of articulated objects. This problem is hard because of the large number of degrees of freedom to be estimated. Following a established line of research, pose estimation is framed as inference in a probabilistic model. In our experience however, the success of many approaches often lie in the power of the features. Our primary contribution is a novel casting of visual inference as an iterative parsingprocess, where one sequentially learns better and better features tuned to a particular image. We show quantitative results for human pose estimation ona database of over 300 images that suggest our algorithm is competitive with or surpasses the state-of-the-art. Since our procedure is quite general (it does not rely on face or skin detection), we also use it to estimate the poses of horses in the Weizmann database.

algorithm, deformable model, parse, (15 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.05)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Nadler, Boaz, Galun, Meirav

Fundamental Limitations of Spectral Clustering

Spectral clustering methods are common graph-based approaches to clustering of data. Spectral clustering algorithms typically start from local information encoded in a weighted graph on the data and cluster according to the global eigenvectors of the corresponding (normalized) similarity matrix. One contribution of this paper is to present fundamental limitations of this general local to global approach. We show that based only on local information, the normalized cut functional is not a suitable measure for the quality of clustering. Further, even with a suitable similarity measure,we show that the first few eigenvectors of such adjacency matrices cannot successfully cluster datasets that contain structures at different scales of size and density. Based on these findings, a second contribution of this paper is a novel diffusion based measure to evaluate the coherence of individual clusters. Our measure can be used in conjunction with any bottom-up graph-based clustering method,it is scale-free and can determine coherent clusters at all scales. We present both synthetic examples and real image segmentation problems where various spectralclustering algorithms fail. In contrast, using this coherence measure finds the expected clusters at all scales.

eigenvector, relaxation time, spectral, (16 more...)

Country:

Asia > Middle East > Israel (0.04)
Europe > Poland (0.04)
Europe > Netherlands (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Mcneill, Graham, Vijayakumar, Sethu

Part-based Probabilistic Point Matching using Equivalence Constraints

Correspondence algorithms typically struggle with shapes that display part-based variation. We present a probabilistic approach that matches shapes using independent parttransformations, where the parts themselves are learnt during matching. Ideas from semi-supervised learning are used to bias the algorithm towards finding'perceptuallyvalid' part structures. Shapes are represented by unlabeled point sets of arbitrary size and a background component is used to handle occlusion, local dissimilarity and clutter. Thus, unlike many shape matching techniques, our approach can be applied to shapes extracted from real images. Model parameters areestimated using an EM algorithm that alternates between finding a soft correspondence and computing the optimal part transformations using Procrustes analysis.

algorithm, gaussian, transformation, (11 more...)

Country: Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.34)

Li, Wenye, Lee, Kin-hong, Leung, Kwong-sak

Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space

Kernel-based regularized learning seeks a model in a hypothesis space by minimizing the empirical error and the model's complexity.

algorithm, dataset, kernel, (13 more...)

Country:

Asia > China > Hong Kong (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Haro, Gloria, Randall, Gregory, Sapiro, Guillermo

Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds

The study of point cloud data sampled from a stratification, a collection of manifolds withpossible different dimensions, is pursued in this paper. We present a technique for simultaneously soft clustering and estimating the mixed dimensionality anddensity of such structures. The framework is based on a maximum likelihood estimationof a Poisson mixture model. The presentation of the approach is completed with artificial and real examples demonstrating the importance of extending manifold learning to stratification learning.

dimension, intrinsic dimension, manifold, (13 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
South America > Uruguay (0.04)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)