AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Two Projection Pursuit Algorithms for Machine Learning under Non-Stationarity

arXiv.org Artificial IntelligenceOct-4-2011

This thesis derives, tests and applies two linear projection algorithms for machine learning under non-stationarity. The first finds a direction in a linear space upon which a data set is maximally non-stationary. The second aims to robustify two-way classification against non-stationarity. The algorithm is tested on a key application scenario, namely Brain Computer Interfacing.

artificial intelligence, machine learning, non-stationary source, (17 more...)

arXiv.org Artificial Intelligence

1110.0593

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Education (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

Group Lasso with Overlaps: the Latent Group Lasso approach

Obozinski, Guillaume, Jacob, Laurent, Vert, Jean-Philippe

arXiv.org Machine LearningOct-3-2011

We study a norm for structured sparsity which leads to sparse linear predictors whose supports are unions of prede ned overlapping groups of variables. We call the obtained formulation latent group Lasso, since it is based on applying the usual group Lasso penalty on a set of latent variables. A detailed analysis of the norm and its properties is presented and we characterize conditions under which the set of groups associated with latent variables are correctly identi ed. We motivate and discuss the delicate choice of weights associated to each group, and illustrate this approach on simulated data and on the problem of breast cancer prognosis from gene expression data.

artificial intelligence, bioinformatics, machine learning, (18 more...)

arXiv.org Machine Learning

1110.0413

Country: North America > United States > California (0.27)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science (0.92)
Information Technology > Biomedical Informatics > Translational Bioinformatics (0.67)

Add feedback

A Nonparametric Frequency Domain EM Algorithm for Time Series Classification with Applications to Spike Sorting and Macro-Economics

Goerg, Georg M.

arXiv.org Machine LearningOct-2-2011

I propose a frequency domain adaptation of the Expectation Maximization (EM) algorithm to group a family of time series in classes of similar dynamic structure. It does this by viewing the magnitude of the discrete Fourier transform (DFT) of each signal (or power spectrum) as a probability density/mass function (pdf/pmf) on the unit circle: signals with similar dynamics have similar pdfs; distinct patterns have distinct pdfs. An advantage of this approach is that it does not rely on any parametric form of the dynamic structure, but can be used for non-parametric, robust and model-free classification. This new method works for non-stationary signals of similar shape as well as stationary signals with similar auto-correlation structure. Applications to neural spike sorting (non-stationary) and pattern-recognition in socio-economic time series (stationary) demonstrate the usefulness and wide applicability of the proposed method.

artificial intelligence, machine learning, spike, (18 more...)

arXiv.org Machine Learning

1103.33

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Economy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Add feedback

Kernel Bayes' rule

Fukumizu, Kenji, Song, Le, Gretton, Arthur

arXiv.org Machine LearningSep-28-2011

Kernel methods have long provided powerful tools for generalizing linear statistical approaches to nonlinear settings, through an embedding of the sample to a high dimensional feature space, namely a reproducing kernel Hilbert space (RKHS) [18, 28]. Examples include support vector machines, kernel PCA, and kernel CCA, among others. In these cases, data are mapped via a canonical feature map to a reproducing kernel Hilbert space (of high or even infinite dimension), in which the linear operations that define the algorithms are implemented. The inner product between feature mappings need never be computed explicitly, but is given by a positive definite kernel function unique to the RKHS: this permits efficient computation without the need to deal explicitly with the feature representation. The mappings of individual points to a feature space may be generalized to mappings of probability measures[e.g. 3, Chapter 4]. We call such mappings the kernel means of the underlying random variables.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1009.5736

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Learning Item Trees for Probabilistic Modelling of Implicit Feedback

Mnih, Andriy, Teh, Yee Whye

arXiv.org Machine LearningSep-27-2011

User preferences for items can be inferred from either explicit feedback, such as item ratings, or implicit feedback, such as rental histories. Research in collaborative filtering has concentrated on explicit feedback, resulting in the development of accurate and scalable models. However, since explicit feedback is often difficult to collect it is important to develop effective models that take advantage of the more widely available implicit feedback. We introduce a probabilistic approach to collaborative filtering with implicit feedback based on modelling the user's item selection process. In the interests of scalability, we restrict our attention to tree-structured distributions over items and develop a principled and efficient algorithm for learning item trees from data. We also identify a problem with a widely used protocol for evaluating implicit feedback models and propose a way of addressing it using a small quantity of explicit feedback data.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1109.5894

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Unsupervised K-Nearest Neighbor Regression

Kramer, Oliver

arXiv.org Machine LearningSep-26-2011

In many scientific disciplines structures in high-dimensional data have to be found, e.g., in stellar spectra, in genome data, or in face recognition tasks. In this work we present a novel approach to non-linear dimensionality reduction. It is based on fitting K-nearest neighbor regression to the unsupervised regression framework for learning of low-dimensional manifolds. Similar to related approaches that are mostly based on kernel methods, unsupervised K-nearest neighbor (UNN) regression optimizes latent variables w.r.t. the data space reconstruction error employing the K-nearest neighbor heuristic. The problem of optimizing latent neighborhoods is difficult to solve, but the UNN formulation allows the design of efficient strategies that iteratively embed latent points to fixed neighborhood topologies. UNN is well appropriate for sorting of high-dimensional data. The iterative variants are analyzed experimentally.

artificial intelligence, machine learning, regression, (15 more...)

arXiv.org Machine Learning

1107.36

Country:

North America > United States (0.15)
Europe > Germany (0.14)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Add feedback

Higher-Order Markov Tag-Topic Models for Tagged Documents and Images

Zeng, Jia, Feng, Wei, Cheung, William K., Li, Chun-Hung

arXiv.org Artificial IntelligenceSep-25-2011

This paper studies the topic modeling problem of tagged documents and images. Higher-order relations among tagged documents and images are major and ubiquitous characteristics, and play positive roles in extracting reliable and interpretable topics. In this paper, we propose the tag-topic models (TTM) to depict such higher-order topic structural dependencies within the Markov random field (MRF) framework. First, we use the novel factor graph representation of latent Dirichlet allocation (LDA)-based topic models from the MRF perspective, and present an efficient loopy belief propagation (BP) algorithm for approximate inference and parameter estimation. Second, we propose the factor hypergraph representation of TTM, and focus on both pairwise and higher-order relation modeling among tagged documents and images. Efficient loopy BP algorithm is developed to learn TTM, which encourages the topic labeling smoothness among tagged documents and images. Extensive experimental results confirm the incorporation of higher-order relations to be effective in enhancing the overall topic modeling performance, when compared with current state-of-the-art topic models, in many text and image mining tasks of broad interests such as word and link prediction, document classification, and tag recommendation.

bayesian inference, relation, text processing, (20 more...)

arXiv.org Artificial Intelligence

1109.537

Country:

Asia > China > Hong Kong (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(3 more...)

Add feedback

Bias Plus Variance Decomposition for Survival Analysis Problems

Sapir, Marina

arXiv.org Machine LearningSep-24-2011

For classification problems, it is well known that bias and variance components of the estimation prediction error combine to influence classification in a very different way, and have different importance depending on the sample size. For small and for high-dimensional datasets, variance of the prediction caused by variations in the training samples makes largest contribution into the expected prediction error. For large datasets, bias becomes more important component of the error [1]. Thus, the decomposition of expected error into bias and variance parts is an important tool to understand differences between the algorithms, to find areas of the optimal application. To the best of author's knowledge, such decomposition was not proposed for survival analysis problem. Here we describe an approach to define this decomposition for this class of problem. On two real life datasets we study how bias and variance of we show how regularization and size of the training sample affect bias, variance and overall errors of the methods. 1 2 Bias - Variance Decomposition

algorithm, dataset, decomposition, (11 more...)

arXiv.org Machine Learning

1109.5311

Country: North America > United States > New York (0.05)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Minimum Probability Flow Learning

Sohl-Dickstein, Jascha, Battaglino, Peter, DeWeese, Michael R.

arXiv.org Machine LearningSep-24-2011

Fitting probabilistic models to data is often difficult, due to the general intractability of the partition function and its derivatives. Here we propose a new parameter estimation technique that does not require computing an intractable normalization factor or sampling from the equilibrium distribution of the model. This is achieved by establishing dynamics that would transform the observed data distribution into the model distribution, and then setting as the objective the minimization of the KL divergence between the data distribution and the distribution produced by running the dynamics for an infinitesimal time. Score matching, minimum velocity learning, and certain forms of contrastive divergence are shown to be special cases of this learning technique. We demonstrate parameter estimation in Ising models, deep belief networks and an independent component analysis model of natural scenes. In the Ising model case, current state of the art techniques are outperformed by at least an order of magnitude in learning time, with lower error in recovered coupling parameters.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Machine Learning

0906.4779

Country: North America > United States (0.46)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Sparse Choice Models

Farias, Vivek F., Jagabathula, Srikanth, Shah, Devavrat

arXiv.org Machine LearningSep-21-2011

Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed as a distribution over permutations, must be learned from observed data. The observed data, in turn, may frequently be viewed as (partial, noisy) information about marginals of this distribution over permutations. As such, the search for an appropriate choice model boils down to learning a distribution over permutations that is (near-)consistent with observed information about this distribution. In this work, we pursue a non-parametric approach which seeks to learn a choice model (i.e. a distribution over permutations) with {\em sparsest} possible support, and consistent with observed data. We assume that the data observed consists of noisy information pertaining to the marginals of the choice model we seek to learn. We establish that {\em any} choice model admits a `very' sparse approximation in the sense that there exists a choice model whose support is small relative to the dimension of the observed data and whose marginals approximately agree with the observed marginal information. We further show that under, what we dub, `signature' conditions, such a sparse approximation can be found in a computationally efficiently fashion relative to a brute force approach. An empirical study using the American Psychological Association election data-set suggests that our approach manages to unearth useful structural properties of the underlying choice model using the sparse approximation found. Our results further suggest that the signature condition is a potential alternative to the recently popularized Restricted Null Space condition for efficient recovery of sparse models.

artificial intelligence, choice model, machine learning, (19 more...)

arXiv.org Machine Learning

1011.4339

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.48)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback