AITopics | Statistical Learning

Statistical Learning

The Statistics of Streaming Sparse Regression

Steinhardt, Jacob, Wager, Stefan, Liang, Percy

arXiv.org Machine Learning, Dec-12-2014

We present a sparse analogue to stochastic gradient descent that is guaranteed to perform well under similar conditions to the lasso. In the linear regression setup with irrepresentable noise features, our algorithm recovers the support set of the optimal parameter vector with high probability, and achieves a statistically quasi-optimal rate of convergence of $O_p(k \log(d)/T)$, where $k$ is the sparsity of the solution, $d$ is the number of features, and $T$ is the number of training examples. Meanwhile, our algorithm does not require any more computational resources than stochastic gradient descent. In our experiments, we find that our method substantially outperforms existing streaming algorithms on both real and simulated data.
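As a rough illustration of what a sparse, one-pass variant of stochastic gradient descent can look like, the sketch below runs generic l1-regularized (proximal) SGD on a synthetic stream. It is not the authors' algorithm: the soft-thresholding update, the step size, the penalty `lam`, and the helper names `proximal_sgd` and `synthetic_stream` are all assumptions chosen for illustration.

```python
import random


def soft_threshold(v, tau):
    """Shrink a scalar toward zero by tau (proximal operator of tau * |.|)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def proximal_sgd(stream, d, step=0.05, lam=0.1):
    """One pass of l1-regularized SGD for squared loss over (x, y) pairs."""
    w = [0.0] * d
    for x, y in stream:
        err = sum(wj * xj for wj, xj in zip(w, x)) - y
        # Gradient step on 0.5 * (w.x - y)^2, then soft-threshold each weight.
        w = [soft_threshold(wj - step * err * xj, step * lam)
             for wj, xj in zip(w, x)]
    return w


def synthetic_stream(w_true, T, noise=0.1, seed=0):
    """Yield T made-up examples y = w_true.x + Gaussian noise (hypothetical data)."""
    rng = random.Random(seed)
    for _ in range(T):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        y = sum(wj * xj for wj, xj in zip(w_true, x)) + rng.gauss(0.0, noise)
        yield x, y


w_true = [2.0, 0.0, 0.0, -3.0, 0.0]  # k = 2 nonzero weights out of d = 5
w_hat = proximal_sgd(synthetic_stream(w_true, T=5000), d=len(w_true))
```

Each example is touched once and the memory footprint is a single weight vector, which is the streaming property the abstract emphasizes; the l1 penalty keeps the weights on noise features at or near zero.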

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv: 1412.4182

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)

Machine Learning for Neuroimaging with Scikit-Learn

Abraham, Alexandre, Pedregosa, Fabian, Eickenberg, Michael, Gervais, Philippe, Muller, Andreas, Kossaifi, Jean, Gramfort, Alexandre, Thirion, Bertrand, Varoquaux, Gaël

arXiv.org Machine Learning, Dec-12-2014

Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g. resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.

artificial intelligence, machine learning, voxel, (16 more...)

arXiv: 1412.3919

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Score Function Features for Discriminative Learning: Matrix and Tensor Framework

Janzamin, Majid, Sedghi, Hanie, Anandkumar, Anima

arXiv.org Machine Learning, Dec-11-2014

Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we consider a novel class of matrix and tensor-valued features, which can be pre-trained using unlabeled samples. We present efficient algorithms for extracting discriminative information, given these pre-trained features and labeled samples for any related task. Our class of features is based on higher-order score functions, which capture local variations in the probability density function of the input. We establish a theoretical framework to characterize the nature of discriminative information that can be extracted from score-function features, when used in conjunction with labeled samples. We employ efficient spectral decomposition algorithms (on matrices and tensors) for extracting discriminative components. The advantage of employing tensor-valued features is that we can extract richer discriminative information in the form of overcomplete representations. Thus, we present a novel framework for employing generative models of the input for discriminative learning.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv: 1412.2863

Country:

North America > United States > California > Orange County > Irvine (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
(2 more...)

Deep Multi-Instance Transfer Learning

Kotzias, Dimitrios, Denil, Misha, Blunsom, Phil, de Freitas, Nando

arXiv.org Machine Learning, Dec-10-2014

We present a new approach for transferring knowledge from groups to individuals that comprise them. We evaluate our method in text, by inferring the ratings of individual sentences using full-review ratings. This approach, which combines ideas from transfer learning, deep learning and multi-instance learning, reduces the need for laborious human labelling of fine-grained data when abundant labels are available at the group level.

artificial intelligence, machine learning, sentiment, (15 more...)

arXiv: 1411.3128

Country:

North America > United States > California (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (0.95)
Media > Film (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Inexact Coordinate Descent: Complexity and Preconditioning

Tappenden, Rachael, Richtárik, Peter, Gondzio, Jacek

arXiv.org Artificial Intelligence, Dec-10-2014

In this paper we consider the problem of minimizing a convex function using a randomized block coordinate descent method. One of the key steps at each iteration of the algorithm is determining the update to a block of variables. Existing algorithms assume that in order to compute the update, a particular subproblem is solved exactly. In this work we relax this requirement, and allow for the subproblem to be solved inexactly, leading to an inexact block coordinate descent method. Our approach incorporates the best known results for exact updates as a special case. Moreover, these theoretical guarantees are complemented by practical considerations: the use of iterative techniques to determine the update as well as the use of preconditioning for further acceleration.
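The idea of solving each block subproblem only approximately can be sketched on a small quadratic. The code below is a minimal illustration under stated assumptions, not the paper's method: the objective is 0.5 * x^T M x - b^T x with a made-up symmetric positive definite `M`, blocks are chosen uniformly at random, and "inexact" means a fixed number of steepest-descent steps (`inner_steps`) on the chosen block rather than an exact solve. The function name and parameters are hypothetical.

```python
import random


def inexact_block_cd(M, b, blocks, n_iter=200, inner_steps=2, seed=0):
    """Minimize 0.5 * x^T M x - b^T x (M symmetric positive definite).

    Each iteration picks a random block and only approximately minimizes the
    objective over that block, using a few steepest-descent steps.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(n_iter):
        block = rng.choice(blocks)
        for _ in range(inner_steps):
            # Gradient restricted to the block: g = (M x - b)[block].
            g = [sum(M[i][j] * x[j] for j in range(n)) - b[i] for i in block]
            # Exact line search along -g for the block quadratic:
            # alpha = (g^T g) / (g^T M_bb g), with M_bb the block submatrix.
            Mg = [sum(M[i][j] * g[k] for k, j in enumerate(block))
                  for i in block]
            denom = sum(gi * mgi for gi, mgi in zip(g, Mg))
            if denom <= 0.0:
                break  # block gradient is zero, nothing left to improve
            alpha = sum(gi * gi for gi in g) / denom
            for k, i in enumerate(block):
                x[i] -= alpha * g[k]
    return x


# Made-up, diagonally dominant test problem with two blocks of two variables.
M = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 0.0, 1.0],
     [0.0, 0.0, 2.0, 0.5],
     [0.0, 1.0, 0.5, 3.0]]
b = [1.0, 2.0, 3.0, 4.0]
x_hat = inexact_block_cd(M, b, blocks=[[0, 1], [2, 3]])
```

Raising `inner_steps` moves the update toward the exact block minimizer, so the exact method corresponds to the limit of many inner steps; every inner step still decreases the objective, which is why a cheap approximate solve remains usable.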

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv: 1304.553

Country:

North America > United States > Wisconsin (0.28)
Europe > United Kingdom (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Bayesian Fisher's Discriminant for Functional Data

Yang, Yao-Hsiang, Chen, Lu-Hung, Wang, Chieh-Chih, Chen, Chu-Song

arXiv.org Machine Learning, Dec-9-2014

We propose a Bayesian framework based on Gaussian processes in order to extend Fisher's discriminant to classify functional data such as spectra and images. The probability structure for our extended Fisher's discriminant is explicitly formulated, and we utilize the smoothness assumptions of functional data as prior probabilities. Existing methods which directly employ the smoothness assumption of functional data can be shown to be special cases within this framework given corresponding priors, while their estimates of the unknowns are one-step approximations to the proposed MAP estimates. Empirical results on various simulation studies and different real applications show that the proposed method significantly outperforms the other Fisher's discriminant methods for functional data.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv: 1412.2929

Country: Europe > Austria (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hierarchical Mixture-of-Experts Model for Large-Scale Gaussian Process Regression

Ng, Jun Wei, Deisenroth, Marc Peter

arXiv.org Machine Learning, Dec-9-2014

We propose a practical and scalable Gaussian process model for large-scale nonlinear probabilistic regression. Our mixture-of-experts model is conceptually simple and hierarchically recombines computations for an overall approximation of a full Gaussian process. Closed-form and distributed computations allow for efficient and massive parallelisation while keeping the memory consumption small. Given sufficient computing resources, our model can handle arbitrarily large data sets, without explicit sparse approximations. We provide strong experimental evidence that our model can be applied to large data sets of sizes far beyond millions. Hence, our model has the potential to lay the foundation for general large-scale Gaussian process research.

artificial intelligence, machine learning, modeling & simulation, (17 more...)

arXiv: 1412.3078

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

ROP: Matrix recovery via rank-one projections

Cai, T. Tony, Zhang, Anru

arXiv.org Machine Learning, Dec-9-2014

Estimation of low-rank matrices is of significant interest in a range of contemporary applications. In this paper, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained. The proposed estimator is shown to be rate-optimal under certain conditions. The estimator is easy to implement via convex programming and performs well numerically. The techniques and main results developed in the paper also have implications for other related statistical problems. An application to estimation of spiked covariance matrices from one-dimensional random projections is considered. The results demonstrate that it is still possible to accurately estimate the covariance matrix of a high-dimensional distribution based only on one-dimensional projections.

artificial intelligence, machine learning, matrix, (16 more...)

doi: 10.1214/14-AOS1267

arXiv: 1310.5791

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework

Titov, Ivan, Khoddam, Ehsan

arXiv.org Machine Learning, Dec-8-2014

We introduce a new approach to unsupervised estimation of feature-rich semantic role labeling models. Our model consists of two components: (1) an encoding component: a semantic role labeling model which predicts roles given a rich set of syntactic and lexical features; (2) a reconstruction component: a tensor factorization model which relies on roles to predict argument fillers. When the components are estimated jointly to minimize errors in argument reconstruction, the induced roles largely correspond to roles defined in annotated resources. Our method performs on par with the most accurate role induction methods on English and German, even though, unlike these previous approaches, we do not incorporate any prior linguistic knowledge about the languages.

argument, machine learning, natural language, (17 more...)

arXiv: 1412.2812

Country: North America > United States (0.46)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)

Low Complexity Regularization of Linear Inverse Problems

Vaiter, Samuel, Peyré, Gabriel, Fadili, Jalal M.

arXiv.org Machine Learning, Dec-8-2014

Inverse problems and regularization theory are a central theme in contemporary signal processing, where the goal is to reconstruct an unknown signal from partial, indirect, and possibly noisy measurements of it. A now standard method for recovering the unknown signal is to solve a convex optimization problem that enforces some prior knowledge about its structure. This has proved efficient in many problems routinely encountered in imaging sciences, statistics and machine learning. This chapter delivers a review of recent advances in the field where the regularization prior promotes solutions conforming to some notion of simplicity/low-complexity. These priors encompass as popular examples sparsity and group sparsity (to capture the compressibility of natural signals and images), total variation and analysis sparsity (to promote piecewise regularity), and low-rank (as a natural extension of sparsity to matrix-valued data). Our aim is to provide a unified treatment of all these regularizations under a single umbrella, namely the theory of partial smoothness. This framework is very general and accommodates all low-complexity regularizers just mentioned, as well as many others. Partial smoothness turns out to be the canonical way to encode low-dimensional models that can be linear spaces or more general smooth manifolds. This review is intended to serve as a one-stop shop toward the understanding of the theoretical properties of the so-regularized solutions. It covers a large spectrum including: (i) recovery guarantees and stability to noise, both in terms of $\ell^2$-stability and model (manifold) identification; (ii) sensitivity analysis to perturbations of the parameters involved (in particular the observations), with applications to unbiased risk estimation; (iii) convergence properties of the forward-backward proximal splitting scheme, which is particularly well suited to solving the corresponding large-scale regularized optimization problem.
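For readers unfamiliar with the forward-backward proximal splitting scheme named in item (iii), the sketch below applies it to the simplest low-complexity prior, the l1 norm (the lasso). It is a minimal illustration under assumptions, not code from the chapter: the objective 0.5 * ||A x - y||^2 + lam * ||x||_1, the toy data, and the names `forward_backward_lasso` and `soft_threshold` are chosen here for illustration.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |.|, applied to a scalar."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def forward_backward_lasso(A, y, lam, step, n_iter=500):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by forward-backward splitting.

    A is a list of rows and y a list of observations. The step size should be
    at most 1 / L, where L is the largest eigenvalue of A^T A.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(n_iter):
        # Forward (gradient) step on the smooth term: grad = A^T (A x - y).
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - y[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Backward (proximal) step on the l1 term: soft-thresholding.
        x = [soft_threshold(x[j] - step * grad[j], step * lam)
             for j in range(n_cols)]
    return x


# Toy problem (made-up data): identity measurements, one strong coefficient.
A = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
y = [3.0, 0.05, -0.02]
x_hat = forward_backward_lasso(A, y, lam=0.1, step=1.0)
# x_hat is approximately [2.9, 0.0, 0.0]
```

With identity measurements the lasso solution is just soft-thresholding of `y`, so the two small entries are set exactly to zero and the large one is shrunk by `lam`. Swapping `soft_threshold` for another proximal operator (for example, block shrinkage for group sparsity) gives the same scheme for other regularizers.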

artificial intelligence, data quality, machine learning, (16 more...)

arXiv: 1407.1598

Country:

Europe (0.28)
North America > United States (0.27)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.48)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Data Science > Data Quality (0.92)
