Sparsification and feature selection by compressive linear regression

arXiv.org Machine Learning

The Minimum Description Length (MDL) principle states that the optimal model for a given data set is the one that compresses it best. Due to practical limitations the model can be restricted to a class such as linear regression models, which we address in this study. As in other formulations such as the LASSO and forward step-wise regression, we are interested in sparsifying the feature set while preserving generalization ability. We derive a well-principled set of codes for both parameters and error residuals, along with smooth approximations to the lengths of these codes, so as to allow gradient-descent optimization of the description length. We go on to show that sparsification and feature selection using our approach is faster than the LASSO on several datasets from the UCI and StatLib repositories, with favorable generalization accuracy, while being fully automatic: it requires neither cross-validation nor tuning of regularization hyper-parameters, and even allows for a nonlinear expansion of the feature set followed by sparsification.
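
As a point of reference, here is a minimal, hypothetical sketch of the general idea in Python (the actual codes derived in the paper differ): residuals are charged roughly $\frac{n}{2}\log \mathrm{RSS}$ bits under a Gaussian code, each coefficient a smoothed zero/nonzero cost, and the total is minimized by gradient-based optimization. The specific penalty $1 - e^{-\gamma\beta_j^2}$ and all constants are illustrative assumptions, not the paper's codes.

    import numpy as np
    from scipy.optimize import minimize

    def description_length(beta, X, y, gamma=50.0):
        """Smoothed two-part code length: Gaussian residual code + parameter code."""
        n = X.shape[0]
        rss = np.sum((y - X @ beta) ** 2)
        resid_bits = 0.5 * n * np.log(rss / n)                 # residual code length
        param_bits = np.sum(1.0 - np.exp(-gamma * beta ** 2))  # smooth L0 proxy
        return resid_bits + param_bits

    def grad(beta, X, y, gamma=50.0):
        n = X.shape[0]
        r = y - X @ beta
        rss = np.sum(r ** 2)
        return -(n / rss) * (X.T @ r) + 2 * gamma * beta * np.exp(-gamma * beta ** 2)

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 20))
    beta_true = np.zeros(20); beta_true[:3] = [2.0, -1.5, 1.0]
    y = X @ beta_true + 0.1 * rng.standard_normal(200)

    # the objective is non-convex, so this finds a local optimum;
    # near-zero entries would then be thresholded away
    res = minimize(description_length, np.zeros(20), args=(X, y),
                   jac=grad, method="L-BFGS-B")
    print(np.round(res.x, 2))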


Efficient Bayesian analysis of multiple changepoint models with dependence across segments

arXiv.org Machine Learning

We consider Bayesian analysis of a class of multiple changepoint models. While there are a variety of efficient ways to analyse these models if the parameters associated with each segment are independent, there are few general approaches for models where the parameters are dependent. Under the assumption that the dependence is Markov, we propose an efficient online algorithm for sampling from an approximation to the posterior distribution of the number and position of the changepoints. In a simulation study, we show that the error introduced by the approximation is negligible. We illustrate the power of our approach by fitting piecewise polynomial models to data, under a model which allows for either continuity or discontinuity of the underlying curve at each changepoint. This method is competitive with, or outperforms, other methods for inferring curves from noisy data; and, uniquely, it allows for inference of the locations of discontinuities in the underlying curve.
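
For orientation, the sketch below implements the classical independent-segment building block: an exact online run-length filtering recursion for a piecewise-constant Gaussian mean, in the spirit of standard online changepoint work. The paper's contribution, Markov dependence between segment parameters, is precisely what this simple recursion lacks; the hazard rate and priors here are illustrative assumptions.

    import numpy as np

    def runlength_filter(y, hazard=0.02, mu0=0.0, kappa0=1.0, sigma2=1.0):
        logp = np.array([0.0])                    # log P(current run length | data)
        mu, kappa = np.array([mu0]), np.array([kappa0])
        cp_prob = np.zeros(len(y))
        for t, yt in enumerate(y):
            var = sigma2 * (1.0 + 1.0 / kappa)    # predictive variance per run length
            loglik = -0.5 * np.log(2 * np.pi * var) - 0.5 * (yt - mu) ** 2 / var
            grow = logp + loglik + np.log(1 - hazard)         # segment continues
            change = np.logaddexp.reduce(logp + loglik) + np.log(hazard)
            logp = np.concatenate(([change], grow))
            logp -= np.logaddexp.reduce(logp)                 # normalise
            cp_prob[t] = np.exp(logp[0])
            # conjugate update of the Gaussian mean for every surviving run length
            mu = np.concatenate(([mu0], (kappa * mu + yt) / (kappa + 1)))
            kappa = np.concatenate(([kappa0], kappa + 1))
        return cp_prob

    rng = np.random.default_rng(0)
    y = np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 1, 100)])
    print(runlength_filter(y)[98:103].round(3))   # spikes just after t = 100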


A Stochastic Model for Collaborative Recommendation

arXiv.org Machine Learning

Collaborative recommendation is an information-filtering technique that attempts to present information items (movies, music, books, news, images, Web pages, etc.) that are likely to be of interest to the Internet user. Traditionally, collaborative systems deal with situations involving two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite a wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists that would allow us to precisely describe the mathematical forces driving collaborative filtering. As a first contribution in this direction, we set out a general sequential stochastic model for collaborative recommendation and analyze its asymptotic performance as the number of users grows. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering. We establish consistency of the procedure under mild assumptions on the model. Rates of convergence and examples are also provided.
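
The cosine-type nearest neighbor rule under study is simple to state concretely. The following hypothetical sketch (the array shapes, the random ratings, and the choice $k=5$ are all illustrative) predicts the target user's rating of a new item as the average rating among the $k$ users whose past ratings are most cosine-similar to hers.

    import numpy as np

    def predict_rating(R_past, r_user, r_new_item, k=5):
        """R_past: (n_users, n_items) past ratings of the other users;
        r_user: (n_items,) the target user's past ratings;
        r_new_item: (n_users,) the other users' ratings of the new item."""
        norms = np.linalg.norm(R_past, axis=1) * np.linalg.norm(r_user)
        cos = (R_past @ r_user) / np.where(norms > 0, norms, 1.0)
        nearest = np.argsort(cos)[-k:]        # indices of the k most similar users
        return r_new_item[nearest].mean()

    rng = np.random.default_rng(0)
    R_past = rng.integers(1, 6, size=(100, 30)).astype(float)
    r_user = rng.integers(1, 6, size=30).astype(float)
    r_new_item = rng.integers(1, 6, size=100).astype(float)
    print(predict_rating(R_past, r_user, r_new_item))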


$L_0$ regularized estimation for nonlinear models that have sparse underlying linear structures

arXiv.org Machine Learning

We study the estimation of $\beta$ for the nonlinear model $y = f(X^{\top}\beta) + \epsilon$ when $f$ is a nonlinear transformation that is known, $\beta$ has sparse nonzero coordinates, and the number of observations can be much smaller than that of parameters ($n\ll p$). We show that in order to bound the $L_2$ error of the $L_0$ regularized estimator $\hat\beta$, i.e., $\|\hat\beta - \beta\|_2$, it is sufficient to establish two conditions. Based on this, we obtain bounds of the $L_2$ error for (1) $L_0$ regularized maximum likelihood estimation (MLE) for exponential linear models and (2) $L_0$ regularized least square (LS) regression for the more general case where $f$ is analytic. For the analytic case, we rely on power series expansion of $f$, which requires taking into account the singularities of $f$.
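
The paper analyzes the $L_0$ regularized estimator itself rather than an algorithm for computing it; as a hedged illustration, iterative hard thresholding is one standard heuristic for approximating such an estimator, sketched here for the least-squares case with a known $f$ (here $f = \exp$; the step size and problem sizes are arbitrary choices for the example).

    import numpy as np

    def iht(X, y, f, f_prime, s, steps=200, lr=0.5):
        beta = np.zeros(X.shape[1])
        for _ in range(steps):
            z = X @ beta
            beta -= lr * (X.T @ ((f(z) - y) * f_prime(z)))  # chain rule through f
            beta[np.argsort(np.abs(beta))[:-s]] = 0.0       # keep the s largest
        return beta

    rng = np.random.default_rng(1)
    n, p, s = 100, 400, 3
    X = rng.standard_normal((n, p)) / np.sqrt(n)            # n << p
    beta_true = np.zeros(p); beta_true[:s] = [1.5, -2.0, 1.0]
    y = np.exp(X @ beta_true) + 0.01 * rng.standard_normal(n)
    print(np.round(iht(X, y, np.exp, np.exp, s)[:5], 2))    # f = exp, f' = exp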


Mean-Field Theory of Meta-Learning

arXiv.org Machine Learning

We discuss here the mean-field theory for a cellular automata model of meta-learning. Meta-learning is the process of combining the outcomes of individual learning procedures in order to reach a final decision with higher accuracy than any single learning method. Our method is constructed from an ensemble of interacting learning agents that acquire and process incoming information using various types, or different versions, of machine learning algorithms. The abstract learning space, in which all agents are located, is constructed here using a fully connected model that couples all agents with random strength values. The cellular automata network simulates the higher-level integration of information acquired from the independent learning trials. The final classification of incoming input data is therefore defined as the stationary state of the meta-learning system under a simple majority rule, yet minority clusters that share the opposite classification outcome can be observed in the system. The probability of selecting the proper class for a given input can therefore be estimated even without prior knowledge of its affiliation. Fuzzy logic can easily be introduced into the system, even if the learning agents are built from simple binary classification algorithms, by calculating the percentage of agreeing agents.
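
The majority rule and the agreement-based confidence estimate can be illustrated in a few lines. The sketch below shows only the final read-out, not the cellular-automata dynamics on the random-coupling network; the odd number of agents (which avoids ties) is an assumption of the example.

    import numpy as np

    def meta_classify(votes):
        """votes: (+1/-1) predictions of the agents; odd count assumed (no ties)."""
        decision = int(np.sign(votes.sum()))       # simple majority rule
        agreement = np.mean(votes == decision)     # percentage of agreeing agents
        return decision, agreement                 # class and fuzzy confidence

    votes = np.array([1, 1, -1, 1, 1, -1, 1])      # 5 of 7 agents vote +1
    print(meta_classify(votes))                    # -> (1, 0.714...)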


BRAINSTORMING: Consensus Learning in Practice

arXiv.org Machine Learning

We present here an introduction to the Brainstorming approach, which was recently proposed as a consensus meta-learning technique and has been used in several practical applications in bioinformatics and chemoinformatics. Consensus learning denotes a heterogeneous classification method in which one trains an ensemble of machine learning algorithms using different types of input training data representations. In the second step all solutions are gathered and a consensus is built between them. Therefore no early solution, even one given by a generally low-performing algorithm, is discarded until the late phase of prediction, when the final conclusion is drawn by comparing different machine learning models. This final phase, i.e. consensus learning, tries to balance the generality of the solution against the overall performance of the trained model.
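
A hypothetical two-step sketch of the scheme described above; the dataset, the input representations, and the base learners are arbitrary stand-ins, not those used in the cited bioinformatics and chemoinformatics applications.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=300, n_features=10, random_state=0)
    reps = [X, X ** 2, np.abs(X)]              # different input representations
    models = [LogisticRegression(max_iter=500),
              KNeighborsClassifier(),
              DecisionTreeClassifier(random_state=0)]

    # step 1: every (representation, algorithm) pair yields a solution; none discarded
    preds = np.array([m.fit(r[:200], y[:200]).predict(r[200:])
                      for m, r in zip(models, reps)])
    # step 2: the consensus is built between the gathered solutions
    consensus = (preds.mean(axis=0) > 0.5).astype(int)
    print("consensus accuracy:", (consensus == y[200:]).mean())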


Functional learning through kernels

arXiv.org Machine Learning

This paper reviews the functional aspects of statistical learning theory. The main point under consideration is the nature of the hypothesis set when no prior information is available but data. Within this framework we first discuss the hypothesis set: it is a vector space, it is a set of pointwise defined functions, and the evaluation functional on this set is a continuous mapping. Based on these principles an original theory is developed, generalizing the notion of reproducing kernel Hilbert space to non-Hilbertian sets. It is then shown that the hypothesis set of any learning machine has to be such a generalized reproducing set. Therefore, thanks to a general "representer theorem", the solution of the learning problem is still a linear combination of kernel evaluations. Furthermore, a way to design these kernels is given. To illustrate this framework, some examples of such reproducing sets and kernels are presented.
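
Kernel ridge regression is a concrete, standard instance of such a representer theorem in the Hilbertian case: the minimizer is a linear combination $f(x) = \sum_i \alpha_i k(x_i, x)$ of kernel evaluations at the training points. A minimal sketch, with an RBF kernel and an arbitrary regularization constant:

    import numpy as np

    def rbf(A, B, gamma=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(50, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)

    K = rbf(X, X)
    alpha = np.linalg.solve(K + 1e-2 * np.eye(50), y)  # representer coefficients

    X_test = np.linspace(-3, 3, 5)[:, None]
    print(rbf(X_test, X) @ alpha)                      # f(x) = sum_i alpha_i k(x, x_i)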


On the conditions used to prove oracle results for the Lasso

arXiv.org Machine Learning

Oracle inequalities and variable selection properties for the Lasso in linear models have been established under a variety of different assumptions on the design matrix. We show in this paper how the different conditions and concepts relate to each other. The restricted eigenvalue condition (Bickel et al., 2009) or the slightly weaker compatibility condition (van de Geer, 2007) are sufficient for oracle results. We argue that both these conditions allow for a fairly general class of design matrices. Hence, optimality of the Lasso for prediction and estimation holds in more general situations than would appear from the coherence (Bunea et al., 2007b,c) or restricted isometry (Candes and Tao, 2005) assumptions.
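
For reference, one standard formulation of the compatibility condition for a design matrix $X$ with $n$ rows and an active set $S$ with $|S| = s$ (the cone constant 3 below is one common choice and varies with the tuning parameters):

$$\phi^2(S) = \min\left\{ \frac{s\,\|X\beta\|_2^2}{n\,\|\beta_S\|_1^2} : \|\beta_{S^c}\|_1 \le 3\|\beta_S\|_1,\ \beta_S \neq 0 \right\} > 0.$$

The restricted eigenvalue condition uses $\|\beta_S\|_2^2$ in place of $\|\beta_S\|_1^2/s$ in the denominator; since $\|\beta_S\|_1^2 \le s\|\beta_S\|_2^2$, the restricted eigenvalue condition implies the compatibility condition, which is why the latter is the slightly weaker of the two.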


SparseCodePicking: feature extraction in mass spectrometry using sparse coding algorithms

arXiv.org Machine Learning

Mass spectrometry (MS) is an important technique for chemical profiling that produces, for each sample, a high-dimensional histogram-like spectrum. A crucial step of MS data processing is peak picking, which selects the peaks carrying information about molecules present at high concentrations, these being of interest in an MS investigation. We present a new peak-picking procedure based on a sparse coding algorithm. Given a set of spectra of different classes, i.e. with different positions and heights of the peaks, this procedure can extract peaks by means of unsupervised learning. Instead of the $l_1$-regularization penalty term used in the original sparse coding algorithm, we propose using an elastic-net penalty term for better regularization. The evaluation is carried out by means of simulation. We show that, over a large region of parameters, the proposed peak-picking method based on sparse coding features outperforms a mean-spectrum-based method. Moreover, we demonstrate the procedure by applying it to two real-life datasets.
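
The coding step can be sketched against a fixed dictionary. In the hypothetical example below, the random dictionary, the elastic-net weights, and the nonnegativity constraint are illustrative assumptions; scikit-learn's ElasticNet stands in for the elastic-net coder, replacing the plain $l_1$ coder of classical sparse coding.

    import numpy as np
    from sklearn.linear_model import ElasticNet

    rng = np.random.default_rng(0)
    D = np.abs(rng.standard_normal((500, 20)))    # 20 atoms over 500 m/z bins
    x = D[:, [2, 7]] @ np.array([3.0, 1.5])       # spectrum built from two atoms
    x = x + 0.05 * rng.standard_normal(500)

    # minimise ||x - D w||^2 with an elastic-net penalty on the code w
    coder = ElasticNet(alpha=0.1, l1_ratio=0.7, positive=True,
                       fit_intercept=False, max_iter=5000)
    coder.fit(D, x)
    print(np.nonzero(coder.coef_)[0])             # active atoms ~ picked peaks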


Statistical Decision Making for Authentication and Intrusion Detection

arXiv.org Machine Learning

User authentication and intrusion detection differ from standard classification problems in that, while we have data generated by legitimate users, impostor or intrusion data are scarce or non-existent. We review existing techniques for dealing with this problem and propose a novel alternative based on a principled statistical decision-making viewpoint. We examine the technique on a toy problem and validate it on complex real-world data from an RFID-based access control system. The results indicate that it can significantly outperform the classical world-model approach. The method could be more generally useful in other decision-making scenarios where there is a lack of adversary data.
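
One way to make the decision-theoretic viewpoint concrete when only legitimate-user data exist is to fit a density to that data, posit a broad reference density for the unseen impostor, and accept whenever the expected loss of accepting is smaller than that of rejecting. The sketch below does exactly this with a Gaussian model; the losses, prior, and flat impostor density are made-up assumptions, not the paper's exact construction.

    import numpy as np
    from scipy.stats import norm

    # enrolment: data from the legitimate user only
    legit = np.random.default_rng(0).normal(5.0, 0.5, size=200)
    mu, sd = legit.mean(), legit.std()

    def accept(x, p_impostor=0.5, loss_fa=10.0, loss_fr=1.0, impostor_density=1/20):
        """Accept iff the expected loss of a false accept (against the flat
        impostor density) is below the expected loss of a false reject."""
        joint_legit = (1 - p_impostor) * norm.pdf(x, mu, sd)
        joint_imp = p_impostor * impostor_density   # flat over a 20-unit range
        return loss_fa * joint_imp < loss_fr * joint_legit

    print(accept(5.1), accept(8.0))   # genuine-looking vs. anomalous measurement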