AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Learning to encode motion using spatio-temporal synchrony

Konda, Kishore Reddy, Memisevic, Roland, Michalski, Vincent

arXiv.org Machine LearningFeb-10-2014

We consider the task of learning to extract motion from videos. To this end, we show that the detection of spatial transformations can be viewed as the detection of synchrony between the image sequence and a sequence of features undergoing the motion we wish to detect. We show that learning about synchrony is possible using very fast, local learning rules, by introducing multiplicative "gating" interactions between hidden units across frames. This makes it possible to achieve competitive performance in a wide variety of motion estimation tasks, using a small fraction of the time required to learn features, and to outperform hand-crafted spatio-temporal features by a large margin. We also show how learning about synchrony can be viewed as performing greedy parameter estimation in the well-known motion energy model.

artificial intelligence, machine learning, synchrony, (18 more...)

arXiv.org Machine Learning

1306.3162

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Feature and Variable Selection in Classification

Karper, Aaron

arXiv.org Machine LearningFeb-10-2014

The amount of information in the form of features and variables avail- able to machine learning algorithms is ever increasing. This can lead to classifiers that are prone to overfitting in high dimensions, high di- mensional models do not lend themselves to interpretable results, and the CPU and memory resources necessary to run on high-dimensional datasets severly limit the applications of the approaches. Variable and feature selection aim to remedy this by finding a subset of features that in some way captures the information provided best. In this paper we present the general methodology and highlight some specific approaches.

artificial intelligence, feature selection, machine learning, (17 more...)

arXiv.org Machine Learning

1402.23

Genre: Research Report (0.84)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Genomic Prediction of Quantitative Traits using Sparse and Locally Epistatic Models

Akdemir, Deniz

arXiv.org Machine LearningFeb-9-2014

In plant and animal breeding studies a distinction is made between the genetic value (additive + epistatic genetic effects) and the breeding value (additive genetic effects) of an individual since it is expected that some of the epistatic genetic effects will be lost due to recombination. In this paper, we argue that the breeder can take advantage of some of the epistatic marker effects in regions of low recombination. The models introduced here aim to estimate local epistatic line heritability by using the genetic map information and combine the local additive and epistatic effects. To this end, we have used semi-parametric mixed models with multiple local genomic relationship matrices with hierarchical designs and lasso post-processing for sparsity in the final model. Our models produce good predictive performance along with good explanatory information.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Machine Learning

1402.2026

Country: North America > United States (0.94)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Excess Risk Bounds for Exponentially Concave Losses

Mahdavi, Mehrdad, Jin, Rong

arXiv.org Machine LearningFeb-8-2014

The overarching goal of this paper is to derive excess risk bounds for learning from exp-concave loss functions in passive and sequential learning settings. Exp-concave loss functions encompass several fundamental problems in machine learning such as squared loss in linear regression, logistic loss in classification, and negative logarithm loss in portfolio management. In batch setting, we obtain sharp bounds on the performance of empirical risk minimization performed in a linear hypothesis space and with respect to the exp-concave loss functions. We also extend the results to the online setting where the learner receives the training examples in a sequential manner. We propose an online learning algorithm that is a properly modified version of online Newton method to obtain sharp risk bounds. Under an additional mild assumption on the loss function, we show that in both settings we are able to achieve an excess risk bound of $O(d\log n/n)$ that holds with a high probability.

artificial intelligence, excess risk, machine learning, (15 more...)

arXiv.org Machine Learning

1401.4566

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Binary Excess Risk for Smooth Convex Surrogates

Mahdavi, Mehrdad, Zhang, Lijun, Jin, Rong

arXiv.org Machine LearningFeb-7-2014

In statistical learning theory, convex surrogates of the 0-1 loss are highly preferred because of the computational and theoretical virtues that convexity brings in. This is of more importance if we consider smooth surrogates as witnessed by the fact that the smoothness is further beneficial both computationally- by attaining an {\it optimal} convergence rate for optimization, and in a statistical sense- by providing an improved {\it optimistic} rate for generalization bound. In this paper we investigate the smoothness property from the viewpoint of statistical consistency and show how it affects the binary excess risk. We show that in contrast to optimization and generalization errors that favor the choice of smooth surrogate loss, the smoothness of loss function may degrade the binary excess risk. Motivated by this negative result, we provide a unified analysis that integrates optimization error, generalization bound, and the error in translating convex excess risk into a binary excess risk when examining the impact of smoothness on the binary excess risk. We show that under favorable conditions appropriate choice of smooth convex loss will result in a binary excess risk that is better than $O(1/\sqrt{n})$.

artificial intelligence, excess risk, machine learning, (14 more...)

arXiv.org Machine Learning

1402.1792

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Alternating Minimization for Mixed Linear Regression

Yi, Xinyang, Caramanis, Constantine, Sanghavi, Sujay

arXiv.org Machine LearningFeb-7-2014

Mixed linear regression involves the recovery of two (or more) unknown vectors from unlabeled linear measurements; that is, where each sample comes from exactly one of the vectors, but we do not know which one. It is a classic problem, and the natural and empirically most popular approach to its solution has been the EM algorithm. As in other settings, this is prone to bad local minima; however, each iteration is very fast (alternating between guessing labels, and solving with those labels). In this paper we provide a new initialization procedure for EM, based on finding the leading two eigenvectors of an appropriate matrix. We then show that with this, a re-sampled version of the EM algorithm provably converges to the correct vectors, under natural assumptions on the sampling distribution, and with nearly optimal (unimprovable) sample complexity. This provides not only the first characterization of EM's performance, but also much lower sample complexity as compared to both standard (randomly initialized) EM, and other methods for this problem.

artificial intelligence, initialization, machine learning, (15 more...)

arXiv.org Machine Learning

1310.3745

Country: North America > United States > Texas (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

Add feedback

A Statistical Learning Theory Framework for Supervised Pattern Discovery

Huggins, Jonathan H., Rudin, Cynthia

arXiv.org Machine LearningFeb-7-2014

This paper formalizes a latent variable inference problem we call {\em supervised pattern discovery}, the goal of which is to find sets of observations that belong to a single ``pattern.'' We discuss two versions of the problem and prove uniform risk bounds for both. In the first version, collections of patterns can be generated in an arbitrary manner and the data consist of multiple labeled collections. In the second version, the patterns are assumed to be generated independently by identically distributed processes. These processes are allowed to take an arbitrary form, so observations within a pattern are not in general independent of each other. The bounds for the second version of the problem are stated in terms of a new complexity measure, the quasi-Rademacher complexity.

crime, machine learning, pattern recognition, (16 more...)

arXiv.org Machine Learning

1307.0802

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Add feedback

Learning Transformations for Classification Forests

Qiu, Qiang, Sapiro, Guillermo

arXiv.org Machine LearningFeb-6-2014

This work introduces a transformation-based learner model for classification forests. The weak learner at each split node plays a crucial role in a classification tree. We propose to optimize the splitting objective by learning a linear transformation on subspaces using nuclear norm as the optimization criteria. The learned linear transformation restores a low-rank structure for data from the same class, and, at the same time, maximizes the separation between different classes, thereby improving the performance of the split function. Theoretical and experimental results support the proposed framework.

artificial intelligence, learner, machine learning, (14 more...)

arXiv.org Machine Learning

1312.5604

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Beating the Minimax Rate of Active Learning with Prior Knowledge

Zhang, Lijun, Mahdavi, Mehrdad, Jin, Rong

arXiv.org Machine LearningFeb-6-2014

Active learning refers to the learning protocol where the learner is allowed to choose a subset of instances for labeling. Previous studies have shown that, compared with passive learning, active learning is able to reduce the label complexity exponentially if the data are linearly separable or satisfy the Tsybakov noise condition with parameter $\kappa=1$. In this paper, we propose a novel active learning algorithm using a convex surrogate loss, with the goal to broaden the cases for which active learning achieves an exponential improvement. We make use of a convex loss not only because it reduces the computational cost, but more importantly because it leads to a tight bound for the empirical process (i.e., the difference between the empirical estimation and the expectation) when the current solution is close to the optimal one. Under the assumption that the norm of the optimal classifier that minimizes the convex risk is available, our analysis shows that the introduction of the convex surrogate loss yields an exponential reduction in the label complexity even when the parameter $\kappa$ of the Tsybakov noise is larger than $1$. To the best of our knowledge, this is the first work that improves the minimax rate of active learning by utilizing certain priori knowledge.

active learning, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1311.4803

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Near-Optimal Joint Object Matching via Convex Relaxation

Chen, Yuxin, Guibas, Leonidas J., Huang, Qi-Xing

arXiv.org Machine LearningFeb-6-2014

Joint matching over a collection of objects aims at aggregating information from a large collection of similar instances (e.g. images, graphs, shapes) to improve maps between pairs of them. Given multiple matches computed between a few object pairs in isolation, the goal is to recover an entire collection of maps that are (1) globally consistent, and (2) close to the provided maps --- and under certain conditions provably the ground-truth maps. Despite recent advances on this problem, the best-known recovery guarantees are limited to a small constant barrier --- none of the existing methods find theoretical support when more than $50\%$ of input correspondences are corrupted. Moreover, prior approaches focus mostly on fully similar objects, while it is practically more demanding to match instances that are only partially similar to each other. In this paper, we develop an algorithm to jointly match multiple objects that exhibit only partial similarities, given a few pairwise matches that are densely corrupted. Specifically, we propose to recover the ground-truth maps via a parameter-free convex program called MatchLift, following a spectral method that pre-estimates the total number of distinct elements to be matched. Encouragingly, MatchLift exhibits near-optimal error-correction ability, i.e. in the asymptotic regime it is guaranteed to work even when a dominant fraction $1-\Theta\left(\frac{\log^{2}n}{\sqrt{n}}\right)$ of the input maps behave like random outliers. Furthermore, MatchLift succeeds with minimal input complexity, namely, perfect matching can be achieved as soon as the provided maps form a connected map graph. We evaluate the proposed algorithm on various benchmark data sets including synthetic examples and real-world examples, all of which confirm the practical applicability of MatchLift.

artificial intelligence, correspondence, machine learning, (19 more...)

arXiv.org Machine Learning

1402.1473

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback