AITopics | Statistical Learning

Statistical Learning

Constrained Optimization for a Subset of the Gaussian Parsimonious Clustering Models

Browne, Ryan P., Subedi, Sanjeena, McNicholas, Paul

arXiv.org Machine Learning, Jun-24-2013

The expectation-maximization (EM) algorithm is an iterative method for finding maximum likelihood estimates when data are incomplete or are treated as being incomplete. The EM algorithm and its variants are commonly used for parameter estimation in applications of mixture models for clustering and classification. This is despite the fact that even the Gaussian mixture model likelihood surface contains many local maxima and is riddled with singularities. Previous work has focused on circumventing this problem by constraining the smallest eigenvalue of the component covariance matrices. In this paper, we consider constraining the smallest eigenvalue, the largest eigenvalue, and both the smallest and largest within the family setting. Specifically, a subset of the GPCM family is considered for model-based clustering, where we use a re-parameterized version of the famous eigenvalue decomposition of the component covariance matrices. Our approach is illustrated using various experiments with simulated and real data.
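
To make the idea of bounding covariance eigenvalues concrete, here is a minimal sketch that clamps the eigenvalues of a symmetric 2x2 covariance matrix to an interval. It is plain eigenvalue clipping, a simpler stand-in and not the constrained optimization developed in the paper; the function name `clamp_eigenvalues_2x2` and the bounds `lo` and `hi` are assumptions made for illustration.

```python
import math


def clamp_eigenvalues_2x2(cov, lo, hi):
    """Clamp the eigenvalues of a symmetric 2x2 matrix to [lo, hi].

    Illustrative only: simple clipping, not the paper's constrained EM update.
    """
    (a, b), (_, d) = cov
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lam_big, lam_small = mean + radius, mean - radius

    # Angle of the eigenvector that belongs to the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)

    # Clip both eigenvalues into the allowed interval.
    lam_big = min(max(lam_big, lo), hi)
    lam_small = min(max(lam_small, lo), hi)

    # Rebuild R * diag(lam_big, lam_small) * R^T with the original eigenvectors.
    s11 = lam_big * c * c + lam_small * s * s
    s12 = (lam_big - lam_small) * c * s
    s22 = lam_big * s * s + lam_small * c * c
    return [[s11, s12], [s12, s22]]


# A nearly singular component covariance: one eigenvalue is close to zero.
fixed = clamp_eigenvalues_2x2([[1.0, 0.999], [0.999, 1.0]], lo=0.1, hi=10.0)
```

Keeping the eigenvectors and changing only the eigenvalues preserves the orientation of the component while moving it away from the singular region of the likelihood surface.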

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1306.5824

Country:

Europe > Austria (0.28)
North America > Canada > Ontario (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

The AdaBoost Flow

Lykov, A., Muzychka, S., Vaninsky, K.

arXiv.org Machine Learning, Jun-24-2013

We introduce a dynamical system which we call the AdaBoost flow. The flow is defined by a system of ODEs with control. We show that three algorithms of the AdaBoost family, (i) the AdaBoost algorithm of Schapire and Freund, (ii) the arc-gv algorithm of Breiman, and (iii) the confidence-rated prediction of Schapire and Singer, can be embedded in the AdaBoost flow. The nontrivial part of the AdaBoost flow equations coincides with the equations of dynamics of the nonperiodic Toda system written in terms of spectral variables. We provide a novel invariant geometrical description of the AdaBoost algorithm as a gradient flow on a foliation defined by level sets of the potential function. We propose a new approach for constructing boosting algorithms as a continuous-time gradient flow on measures defined by various metrics and potential functions. Finally, we explain the similarity of the AdaBoost algorithm to Perelman's construction for the Ricci flow.
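
For orientation, the discrete step that such a flow would generalize can be sketched as follows. This is the standard discrete AdaBoost reweighting of Freund and Schapire, not the continuous-time flow introduced in the paper; the function name `adaboost_round` and the toy data are assumptions made for illustration.

```python
import math


def adaboost_round(weights, correct):
    """One discrete AdaBoost reweighting step.

    weights: current distribution over training examples (sums to 1).
    correct: booleans, True where the weak hypothesis classifies correctly.
    Returns the hypothesis coefficient alpha and the updated distribution.
    """
    # Weighted error of the weak hypothesis under the current distribution.
    eps = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < eps < 0.5:
        raise ValueError("weak hypothesis must have weighted error in (0, 0.5)")

    alpha = 0.5 * math.log((1.0 - eps) / eps)

    # Shrink the weight of correctly classified examples, grow the others.
    unnormalized = [
        w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)
    ]
    z = sum(unnormalized)
    return alpha, [u / z for u in unnormalized]


# Four equally weighted examples; the weak hypothesis misses the last one.
alpha, new_weights = adaboost_round([0.25] * 4, [True, True, True, False])
```

After the update, the same weak hypothesis has weighted error exactly 1/2 under the new distribution, which is what forces the next round to pick a different hypothesis.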

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1110.6228

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.76)

Loss-Proportional Subsampling for Subsequent ERM

Mineiro, Paul, Karampatziakis, Nikos

arXiv.org Machine Learning, Jun-23-2013

We propose a sampling scheme suitable for reducing a data set prior to selecting a hypothesis with minimum empirical risk. The sampling only considers a subset of the ultimate (unknown) hypothesis set, but can nonetheless guarantee that the final excess risk will compare favorably with utilizing the entire original data set. We demonstrate the practical benefits of our approach on a large dataset which we subsample and subsequently fit with boosted trees.
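
The general shape of loss-based subsampling with importance weights can be sketched as below. The keep-probability rule, the parameters `scale` and `p_min`, and the function name `loss_proportional_subsample` are assumptions made for illustration; the paper's exact sampling probabilities may differ.

```python
import random


def loss_proportional_subsample(losses, scale, p_min, rng):
    """Keep example i with probability p_i = min(1, max(p_min, scale * loss_i)).

    Returns (index, importance_weight) pairs, where the weight 1 / p_i makes
    importance-weighted sums over the subsample unbiased for the full-data sums.
    """
    if not 0.0 < p_min <= 1.0:
        raise ValueError("p_min must lie in (0, 1]")
    kept = []
    for i, loss in enumerate(losses):
        p = min(1.0, max(p_min, scale * loss))
        if rng.random() < p:
            kept.append((i, 1.0 / p))
    return kept


# Losses would come from a cheap pilot hypothesis; these values are made up.
rng = random.Random(0)
sample = loss_proportional_subsample([0.1, 0.5, 1.0], scale=1.0, p_min=0.2, rng=rng)
```

Examples with high pilot loss are almost always retained, while easy examples are mostly dropped and the survivors are up-weighted, so a later empirical risk minimizer sees a smaller but reweighted data set.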

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1306.1840

Country: North America > United States (0.46)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

A Statistical Perspective on Algorithmic Leveraging

Ma, Ping, Mahoney, Michael W., Yu, Bin

arXiv.org Machine Learning, Jun-22-2013

One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales rows/columns of data matrices to reduce the data size before performing computations on the subproblem. This method has been successful in improving computational efficiency of algorithms for matrix problems such as least-squares approximation, least absolute deviations approximation, and low-rank matrix approximation. Existing work has focused on algorithmic issues such as worst-case running times and numerical issues associated with providing high-quality implementations, but none of it addresses statistical aspects of this method. In this paper, we provide a simple yet effective framework to evaluate the statistical properties of algorithmic leveraging in the context of estimating parameters in a linear regression model with a fixed number of predictors. We show that from the statistical perspective of bias and variance, neither leverage-based sampling nor uniform sampling dominates the other. This result is particularly striking, given the well-known result that, from the algorithmic perspective of worst-case analysis, leverage-based sampling provides uniformly superior worst-case algorithmic results, when compared with uniform sampling. Based on these theoretical results, we propose and analyze two new leveraging algorithms. A detailed empirical evaluation of existing leverage-based methods as well as these two new methods is carried out on both synthetic and real data sets. The empirical results indicate that our theory is a good predictor of practical performance of existing and new leverage-based algorithms and that the new algorithms achieve improved performance.
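
As a small worked case, leverage scores have a closed form in simple linear regression with an intercept: h_i = 1/n + (x_i - mean)^2 / Sxx. The sketch below computes them and normalizes them into an importance sampling distribution. The function names and the toy data are assumptions made for illustration, and the one-predictor setting is far simpler than the matrix problems the abstract refers to.

```python
def leverage_scores(xs):
    """Leverage scores for the model y = b0 + b1 * x (intercept plus slope)."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("xs must contain at least two distinct values")
    return [1.0 / n + (x - mean) ** 2 / sxx for x in xs]


def leverage_sampling_distribution(xs):
    """Normalize leverage scores into sampling probabilities.

    The scores sum to the number of fitted parameters (2 here), so dividing
    by that sum yields a probability distribution over the rows.
    """
    scores = leverage_scores(xs)
    total = sum(scores)
    return [h / total for h in scores]


# The outlying design point x = 10 receives most of the sampling mass.
probs = leverage_sampling_distribution([0.0, 1.0, 2.0, 3.0, 10.0])
```

Uniform sampling would give every row probability 1/n instead; the abstract's point is that neither choice dominates the other in terms of bias and variance.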

artificial intelligence, leverage score, machine learning, (17 more...)

arXiv.org Machine Learning

1306.5362

Country: North America > United States > California (0.45)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Locally adaptive factor processes for multivariate time series

Durante, Daniele, Scarpa, Bruno, Dunson, David B.

arXiv.org Machine Learning, Jun-21-2013

In modeling multivariate time series, it is important to allow time-varying smoothness in the mean and covariance process. In particular, there may be certain time intervals exhibiting rapid changes and others in which changes are slow. If such time-varying smoothness is not accounted for, one can obtain misleading inferences and predictions, with over-smoothing across erratic time intervals and under-smoothing across times exhibiting slow variation. This can lead to mis-calibration of predictive intervals, which can be substantially too narrow or wide depending on the time. We propose a locally adaptive factor process for characterizing multivariate mean-covariance changes in continuous time, allowing locally varying smoothness in both the mean and covariance matrix. This process is constructed utilizing latent dictionary functions evolving in time through nested Gaussian processes and linearly related to the observed data with a sparse mapping. Using a differential equation representation, we bypass usual computational bottlenecks in obtaining MCMC and online algorithms for approximate Bayesian inference. The performance is assessed in simulations and illustrated in a financial application.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

arXiv.org Machine Learning

1210.2022

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.68)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Hacking Smart Machines with Smarter Ones: How to Extract Meaningful Data from Machine Learning Classifiers

Ateniese, Giuseppe, Felici, Giovanni, Mancini, Luigi V., Spognardi, Angelo, Villani, Antonio, Vitali, Domenico

arXiv.org Machine Learning, Jun-19-2013

Machine Learning (ML) algorithms are used to train computers to perform a variety of complex tasks and improve with experience. Computers learn how to recognize patterns, make unintended decisions, or react to a dynamic environment. Certain trained machines may be more effective than others because they are based on more suitable ML algorithms or because they were trained through superior training sets. Although ML algorithms are known and publicly released, training sets may not be reasonably ascertainable and, indeed, may be guarded as trade secrets. While much research has been performed about the privacy of the elements of training sets, in this paper we focus our attention on ML classifiers and on the statistical information that can be unconsciously or maliciously revealed from them. We show that it is possible to infer unexpected but useful information from ML classifiers. In particular, we build a novel meta-classifier and train it to hack other classifiers, obtaining meaningful information about their training sets. This kind of information leakage can be exploited, for example, by a vendor to build more effective classifiers or to simply acquire trade secrets from a competitor's apparatus, potentially violating its intellectual property rights.

artificial intelligence, classifier, machine learning, (17 more...)

arXiv.org Machine Learning

1306.4447

Country: North America > United States > California (0.28)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.66)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(3 more...)

A lasso for hierarchical interactions

Bien, Jacob, Taylor, Jonathan, Tibshirani, Robert

arXiv.org Machine Learning, Jun-19-2013

We add a set of convex constraints to the lasso to produce sparse interaction models that honor the hierarchy restriction that an interaction only be included in a model if one or both variables are marginally important. We give a precise characterization of the effect of this hierarchy constraint, prove that hierarchy holds with probability one and derive an unbiased estimate for the degrees of freedom of our estimator. A bound on this estimate reveals the amount of fitting "saved" by the hierarchy constraint. We distinguish between parameter sparsity - the number of nonzero coefficients - and practical sparsity - the number of raw variables one must measure to make a new prediction. Hierarchy focuses on the latter, which is more closely tied to important data collection concerns such as cost, time and effort. We develop an algorithm, available in the R package hierNet, and perform an empirical study of our method.
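
The distinction between parameter sparsity and practical sparsity can be illustrated with a short sketch. It only counts coefficients in an already fitted model; it does not call or reproduce the hierNet package, and the function names and coefficient values are assumptions made for illustration.

```python
def parameter_sparsity(main, interactions):
    """Number of nonzero coefficients (main effects plus interactions)."""
    nonzero_main = sum(1 for b in main.values() if b != 0)
    nonzero_inter = sum(1 for t in interactions.values() if t != 0)
    return nonzero_main + nonzero_inter


def practical_sparsity(main, interactions):
    """Number of raw variables that must be measured to make a prediction."""
    used = {name for name, b in main.items() if b != 0}
    for (j, k), t in interactions.items():
        if t != 0:
            used.update((j, k))
    return len(used)


def satisfies_weak_hierarchy(main, interactions):
    """True if every nonzero interaction has at least one nonzero main effect."""
    return all(
        t == 0 or main.get(j, 0) != 0 or main.get(k, 0) != 0
        for (j, k), t in interactions.items()
    )


# Hierarchical model: the interaction reuses variables that are already in use.
hier_main = {"x1": 1.2, "x2": 0.0, "x3": -0.4, "x4": 0.0}
hier_inter = {("x1", "x3"): 0.7}

# Non-hierarchical model: same coefficient count, but more variables to measure.
flat_main = {"x1": 1.2, "x2": 0.0, "x3": 0.0, "x4": 0.0}
flat_inter = {("x2", "x3"): 0.5, ("x3", "x4"): 0.3}
```

Both toy models have three nonzero coefficients, but the hierarchical one needs only two measured variables while the other needs four, which is the data collection cost the abstract ties to practical sparsity.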

artificial intelligence, interaction, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1214/13-AOS1096

1205.5050

Country: North America > United States > California > Santa Clara County (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Machine Learning with Operational Costs

Tulabandhula, Theja, Rudin, Cynthia

arXiv.org Machine Learning, Jun-18-2013

This work proposes a way to align statistical modeling with decision making. We provide a method that propagates the uncertainty in predictive modeling to the uncertainty in operational cost, where operational cost is the amount spent by the practitioner in solving the problem. The method allows us to explore the range of operational costs associated with the set of reasonable statistical models, so as to provide a useful way for practitioners to understand uncertainty. To do this, the operational cost is cast as a regularization term in a learning algorithm's objective function, allowing either an optimistic or pessimistic view of possible costs, depending on the regularization parameter. From another perspective, if we have prior knowledge about the operational cost, for instance that it should be low, this knowledge can help to restrict the hypothesis space, and can help with generalization. We provide a theoretical generalization bound for this scenario. We also show that learning with operational costs is related to robust optimization.

artificial intelligence, constraint, machine learning, (21 more...)

arXiv.org Machine Learning

1112.0698

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.82)

Industry:

Energy > Power Industry (1.00)
Health & Medicine (0.92)
Transportation (0.67)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.93)
(3 more...)

Bioclimating Modelling: A Machine Learning Perspective

Bhattacharya, Maumita

arXiv.org Machine Learning, Jun-18-2013

Many machine learning (ML) approaches are widely used to generate bioclimatic models for predicting the geographic range of an organism as a function of climate. Applications such as predicting range shifts in organisms and the range of invasive species influenced by climate change are important in understanding the impact of climate change. However, the success of machine learning-based approaches depends on a number of factors. While it can be safely said that no particular ML technique can be effective in all applications, and that the success of a technique depends predominantly on the application or the type of problem, it is useful to understand the behaviour of these techniques to ensure an informed choice among them. This paper presents a comprehensive review of machine learning-based bioclimatic model generation and analyses the factors influencing the success of such models. Considering the wide use of statistical techniques, in our discussion we also include conventional statistical techniques used in bioclimatic modelling.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Machine Learning

1306.4152

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.48)
(2 more...)

Joint estimation of sparse multivariate regression and conditional graphical models

Wang, Junhui

arXiv.org Machine Learning, Jun-18-2013

The multivariate regression model is a natural generalization of the classical univariate regression model for fitting multiple responses. In this paper, we propose a high-dimensional multivariate conditional regression model for constructing sparse estimates of the multivariate regression coefficient matrix that accounts for the dependency structure among the multiple responses. The proposed method decomposes the multivariate regression problem into a series of penalized conditional log-likelihoods of each response conditioned on the covariates and other responses. It allows simultaneous estimation of the sparse regression coefficient matrix and the sparse inverse covariance matrix. The asymptotic selection consistency and normality are established for the diverging dimension of the covariates and number of responses. The effectiveness of the proposed method is also demonstrated in a variety of simulated examples as well as an application to the Glioblastoma multiforme cancer data.

artificial intelligence, machine learning, regression model, (16 more...)

arXiv.org Machine Learning

1306.4410

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
