AITopics | Statistical Learning

Statistical Learning

Random Spanning Trees and the Prediction of Weighted Graphs

Cesa-Bianchi, Nicolò, Gentile, Claudio, Vitale, Fabio, Zappella, Giovanni

arXiv.org Machine Learning, Dec-21-2012

We investigate the problem of sequentially predicting the binary labels on the nodes of an arbitrary weighted graph. We show that, under a suitable parametrization of the problem, the optimal number of prediction mistakes can be characterized (up to logarithmic factors) by the cutsize of a random spanning tree of the graph. The cutsize is induced by the unknown adversarial labeling of the graph nodes. In deriving our characterization, we obtain a simple randomized algorithm achieving in expectation the optimal mistake bound on any polynomially connected weighted graph. Our algorithm draws a random spanning tree of the original graph and then predicts the nodes of this tree in constant expected amortized time and linear space. Experiments on real-world datasets show that our method compares well to both global (Perceptron) and local (label propagation) methods, while being generally faster in practice.
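The two quantities the abstract relies on, a random spanning tree and the cutsize a labeling induces on it, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which sampler the authors use, so the random-walk (Aldous-Broder) sampler below is a swapped-in standard technique, and the graph, the labels, and the function names are hypothetical.

```python
import random


def random_spanning_tree(adj, rng):
    """Sample a spanning tree with the Aldous-Broder random walk.

    adj maps each node to a dict {neighbor: positive weight}; the graph is
    assumed undirected and connected. The walk moves to a neighbor with
    probability proportional to the edge weight, and the edge used to enter
    each node for the first time is kept.
    """
    nodes = list(adj)
    current = rng.choice(nodes)
    visited = {current}
    tree = []
    while len(visited) < len(nodes):
        neighbors = list(adj[current])
        weights = [adj[current][v] for v in neighbors]
        nxt = rng.choices(neighbors, weights=weights, k=1)[0]
        if nxt not in visited:
            visited.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree


def cutsize(tree, labels):
    """Count tree edges whose endpoints carry different binary labels."""
    return sum(1 for u, v in tree if labels[u] != labels[v])


# Hypothetical toy graph and labeling, for illustration only.
graph = {
    "a": {"b": 1.0, "c": 2.0},
    "b": {"a": 1.0, "c": 1.0, "d": 0.5},
    "c": {"a": 2.0, "b": 1.0, "d": 1.0},
    "d": {"b": 0.5, "c": 1.0},
}
labels = {"a": +1, "b": +1, "c": -1, "d": -1}

tree = random_spanning_tree(graph, random.Random(0))
print(tree, cutsize(tree, labels))
```

A tree with a small cutsize has few label changes along its edges, which is the quantity the abstract ties to the number of prediction mistakes.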

artificial intelligence, graph, machine learning, (17 more...)

arXiv.org Machine Learning

1212.5637

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Communications > Networks (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.35)

Mixtures of Shifted Asymmetric Laplace Distributions

Franczak, Brian C., Browne, Ryan P., McNicholas, Paul D.

arXiv.org Machine Learning, Dec-21-2012

A mixture of shifted asymmetric Laplace distributions is introduced and used for clustering and classification. A variant of the EM algorithm is developed for parameter estimation by exploiting the relationship with the generalized inverse Gaussian distribution. This approach is mathematically elegant and relatively computationally straightforward. Our novel mixture modelling approach is demonstrated on both simulated and real data to illustrate clustering and classification applications. In these analyses, our mixture of shifted asymmetric Laplace distributions performs favourably when compared to the popular Gaussian approach. This work, which marks an important step in the non-Gaussian model-based clustering and classification direction, concludes with discussion as well as suggestions for future work.

artificial intelligence, machine learning, mixture model, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2013.216

1207.1727

Country:

North America > United States (0.68)
Europe (0.68)
North America > Canada > Ontario (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Variational Optimization

Staines, Joe, Barber, David

arXiv.org Machine Learning, Dec-20-2012

We discuss a general technique that can be used to form a differentiable bound on the optima of non-differentiable or discrete objective functions. We form a unified description of these methods and consider under which circumstances the bound is concave. In particular we consider two concrete applications of the method, namely sparse learning and support vector classification.
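For orientation, a bound of the kind the abstract describes is commonly written as below. The notation is an assumption and is not taken from the abstract: f is the objective, p(x | θ) is a family of distributions over x with parameters θ, and U(θ) names the resulting bound. The gradient identity on the right assumes that differentiation and expectation can be exchanged.

```latex
\min_{x} f(x) \;\le\; \mathbb{E}_{p(x\mid\theta)}\bigl[f(x)\bigr] \;=:\; U(\theta),
\qquad
\nabla_{\theta} U(\theta)
  = \mathbb{E}_{p(x\mid\theta)}\bigl[f(x)\,\nabla_{\theta}\log p(x\mid\theta)\bigr].
```

The inequality holds because an average of f can never fall below its minimum, and U(θ) can be differentiable in θ even when f is discrete or non-differentiable in x.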

artificial intelligence, machine learning, objective, (17 more...)

arXiv.org Machine Learning

1212.4507

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.36)

A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method

Lacoste-Julien, Simon, Schmidt, Mark, Bach, Francis

arXiv.org Machine Learning, Dec-20-2012

In this note, we present a new averaging technique for the projected stochastic subgradient method. By using a weighted average with a weight of t+1 for each iterate w_t at iteration t, we obtain the convergence rate of O(1/t) with both an easy proof and an easy implementation. The new scheme is compared empirically to existing techniques, with similar performance behavior.
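The averaging scheme described above can be sketched in a few lines. The weight of t+1 on iterate w_t comes from the abstract; everything else is an assumption made for illustration: the step size 2/(mu*(t+1)) for a mu-strongly convex objective, the projection onto an L2 ball, the toy objective, and all function names.

```python
import math
import random


def project_l2_ball(w, radius):
    """Euclidean projection of w onto the ball of the given radius."""
    norm = math.sqrt(sum(x * x for x in w))
    if norm <= radius:
        return list(w)
    scale = radius / norm
    return [scale * x for x in w]


def averaged_projected_sgd(subgrad, w0, mu, radius, steps):
    """Projected stochastic subgradient method with (t+1)-weighted averaging.

    Assumed step size: gamma_t = 2 / (mu * (t + 1)).
    The running average uses rho_t = 2 / (t + 2), which is equivalent to
    weighting iterate w_t by t + 1 and normalizing.
    """
    w = project_l2_ball(w0, radius)
    w_bar = list(w)
    for t in range(steps):
        rho = 2.0 / (t + 2)  # rho_0 = 1, so w_bar starts at w_0
        w_bar = [(1.0 - rho) * a + rho * b for a, b in zip(w_bar, w)]
        g = subgrad(w)
        gamma = 2.0 / (mu * (t + 1))
        w = project_l2_ball([x - gamma * gi for x, gi in zip(w, g)], radius)
    return w_bar


# Hypothetical toy problem: minimize E[0.5 * ||w - z||^2] over the unit ball,
# where z = target + Gaussian noise. The objective is 1-strongly convex and
# its minimizer is `target`, which lies inside the ball.
rng = random.Random(0)
target = [0.5, -0.25]


def noisy_gradient(w):
    return [wi - (ti + rng.gauss(0.0, 0.5)) for wi, ti in zip(w, target)]


estimate = averaged_projected_sgd(noisy_gradient, [0.0, 0.0], mu=1.0,
                                  radius=1.0, steps=20000)
print(estimate)
```

Keeping the average as a running update avoids storing past iterates, which is one way to read the "easy implementation" claim in the abstract.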

artificial intelligence, iterate, machine learning, (16 more...)

arXiv.org Machine Learning

1212.2002

Country: Europe > France (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Automatic post-picking using MAPPOS improves particle image detection from Cryo-EM micrographs

Norousi, Ramin, Wickles, Stephan, Leidig, Christoph, Becker, Thomas, Schmid, Volker J., Beckmann, Roland, Tresch, Achim

arXiv.org Machine Learning, Dec-19-2012

Cryo-electron microscopy (cryo-EM) studies using single particle reconstruction are extensively used to reveal structural information on macromolecular complexes. Aiming at the highest achievable resolution, state-of-the-art electron microscopes automatically acquire thousands of high-quality micrographs. Particles are detected on and boxed out from each micrograph using fully or semi-automated approaches. However, the obtained particles still require laborious manual post-picking classification, which is a major bottleneck for single particle analysis of large datasets. We introduce MAPPOS, a supervised post-picking strategy for the classification of boxed particle images, as an additional strategy that complements the already efficient automated particle picking routines. MAPPOS employs machine learning techniques to train a robust classifier from a small number of characteristic image features. To accurately quantify the performance of MAPPOS, we used simulated particle and non-particle images. In addition, we verified our method by applying it to an experimental cryo-EM dataset and comparing the results to the manual classification of the same dataset. Comparisons between MAPPOS and manual post-picking classification by several human experts demonstrated that merely a few hundred sample images are sufficient for MAPPOS to classify an entire dataset with human-like performance. MAPPOS was shown to greatly accelerate the throughput of large datasets by reducing the manual workload by orders of magnitude while maintaining a reliable identification of non-particle images.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Machine Learning

1212.4871

Country: Europe > Germany > North Rhine-Westphalia (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.31)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

A Practical Algorithm for Topic Modeling with Provable Guarantees

Arora, Sanjeev, Ge, Rong, Halpern, Yoni, Mimno, David, Moitra, Ankur, Sontag, David, Wu, Yichen, Zhu, Michael

arXiv.org Machine Learning, Dec-19-2012

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on a maximum likelihood objective. Efficient algorithms exist that approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for topic model inference that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

1212.4777

Country:

North America > United States (1.00)
Asia (1.00)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Asia Government (1.00)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.76)

A complexity analysis of statistical learning algorithms

Kon, Mark A.

arXiv.org Machine Learning, Dec-18-2012

We apply information-based complexity analysis to support vector machine (SVM) algorithms, with the goal of a comprehensive continuous algorithmic analysis of such algorithms. This involves complexity measures in which some higher order operations (e.g., certain optimizations) are considered primitive for the purposes of measuring complexity. We consider classes of information operators and algorithms made up of scaled families, and investigate the utility of scaling the complexities to minimize error. We look at the division of statistical learning into information and algorithmic components, at the complexities of each, and at applications to support vector machine (SVM) and more general machine learning algorithms. We give applications to SVM algorithms graded into linear and higher order components, and give an example in biomedical informatics.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1212.4562

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Bayesian Group Nonnegative Matrix Factorization for EEG Analysis

Shin, Bonggun, Oh, Alice

arXiv.org Machine Learning, Dec-18-2012

We propose a generative model for group EEG analysis, based on appropriate kernel assumptions on EEG data. We derive the variational inference update rule using various approximation techniques. The proposed model outperforms the current state-of-the-art algorithms in terms of common pattern extraction. The validity of the proposed model is tested on the BCI competition dataset.

artificial intelligence, machine learning, pattern recognition, (12 more...)

arXiv.org Machine Learning

1212.4347

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.94)
Health & Medicine > Health Care Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.37)

Fast nonparametric classification based on data depth

Lange, Tatjana, Mosler, Karl, Mozharovskyi, Pavlo

arXiv.org Machine Learning, Dec-17-2012

A new procedure, called the DDα-procedure, is developed to solve the problem of classifying d-dimensional objects into q >= 2 classes. The procedure is completely nonparametric; it uses q-dimensional depth plots and a very efficient algorithm for discrimination analysis in the depth space [0,1]^q. Specifically, the depth is the zonoid depth, and the algorithm is the α-procedure. When there are more than two classes, several binary classifications are performed and a majority rule is applied. Special treatments are discussed for 'outsiders', that is, data having a zero depth vector. The DDα-classifier is applied to simulated as well as real data, and the results are compared with those of similar procedures that have been recently proposed. In most cases the new procedure has comparable error rates, but is much faster than other classification approaches, including the SVM.

classification, ddα-classifier, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

1207.4992

Genre:

Workflow (0.46)
Research Report (0.40)

Industry:

Health & Medicine (0.47)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Alternating Maximization: Unifying Framework for 8 Sparse PCA Formulations and Efficient Parallel Codes

Richtárik, Peter, Takáč, Martin, Ahipaşaoğlu, Selin Damla

arXiv.org Machine Learning, Dec-17-2012

Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector; these are obtained by combining the following factors: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two different ways (constraint, penalty). Three of our formulations, notably the one with L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via a natural alternating maximization (AM) method. We show that the AM method is nontrivially equivalent to GPower (Journée et al., JMLR 11:517–553, 2010) for all our formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at i) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), ii) obtaining solutions explaining more variance and iii) dealing with big data problems (our cluster code is able to solve a 357 GB problem in about a minute).
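To make the alternating maximization idea concrete, here is a minimal serial sketch for one of the eight cases, L2 variance with an L0 constraint. The abstract does not spell out the update rules, so the steps below are an assumption: the problem is taken as maximizing y^T A x over unit vectors y and s-sparse unit vectors x, and the two variables are updated in turn. All names and the toy matrix are hypothetical, and this is not the authors' parallel code.

```python
import math


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Compute A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def normalize(v):
    """Scale v to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def keep_top_s(v, s):
    """Zero out all but the s entries of largest magnitude."""
    order = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)
    keep = set(order[:s])
    return [v[j] if j in keep else 0.0 for j in range(len(v))]


def sparse_pca_am(A, s, x0, iters=50):
    """Alternate between y = Ax/||Ax|| and an s-sparse, unit-norm x."""
    x = normalize(keep_top_s(x0, s))
    for _ in range(iters):
        y = normalize(matvec(A, x))                  # best y for fixed x
        x = normalize(keep_top_s(rmatvec(A, y), s))  # best s-sparse x for fixed y
    return x


# Hypothetical data: the first two variables carry almost all the variance.
A = [
    [2.0, 2.1, 0.1, 0.0],
    [-1.9, -2.0, 0.0, 0.1],
    [2.1, 1.9, -0.1, 0.1],
    [-2.0, -2.0, 0.1, -0.1],
]
loading = sparse_pca_am(A, s=2, x0=[1.0, 1.0, 1.0, 1.0])
print(loading)
```

Each half-step solves its subproblem exactly, so the objective y^T A x never decreases from one iteration to the next.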

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1212.4137

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)
Information Technology > Data Science > Data Mining > Big Data (0.34)
