AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits

Dudik, Miroslav, Erhan, Dumitru, Langford, John, Li, Lihong

arXiv.org Machine LearningOct-16-2012

We present and prove properties of a new offline policy evaluator for an exploration learning setting which is superior to previous evaluators. In particular, it simultaneously and correctly incorporates techniques from importance weighting, doubly robust evaluation, and nonstationary policy evaluation approaches. In addition, our approach allows generating longer histories by careful control of a bias-variance tradeoff, and further decreases variance by incorporating information about randomness of the target policy. Empirical evidence from synthetic and realworld exploration learning problems shows the new evaluator successfully unifies previous approaches and uses information an order of magnitude more efficiently.

artificial intelligence, evaluator, machine learning, (15 more...)

arXiv.org Machine Learning

1210.4862

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Local optima networks and the performance of iterated local search

Daolio, Fabio, Verel, Sébastien, Ochoa, Gabriela, Tomassini, Marco

arXiv.org Artificial IntelligenceOct-15-2012

Local Optima Networks (LONs) have been recently proposed as an alternative model of combinatorial fitness landscapes. The model compresses the information given by the whole search space into a smaller mathematical object that is the graph having as vertices the local optima and as edges the possible weighted transitions between them. A new set of metrics can be derived from this model that capture the distribution and connectivity of the local optima in the underlying configuration space. This paper departs from the descriptive analysis of local optima networks, and actively studies the correlation between network features and the performance of a local search heuristic. The NK family of landscapes and the Iterated Local Search metaheuristic are considered. With a statistically-sound approach based on multiple linear regression, it is shown that some LONs' features strongly influence and can even partly predict the performance of a heuristic search algorithm. This study validates the expressive power of LONs as a model of combinatorial fitness landscapes.

artificial intelligence, landscape, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/2330163.2330217

1210.3946

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Ventura County > Thousand Oaks (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Telecommunications > Networks (0.34)
Information Technology > Networks (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Add feedback

Group Model Selection Using Marginal Correlations: The Good, the Bad and the Ugly

Bajwa, Waheed U., Mixon, Dustin G.

arXiv.org Machine LearningOct-8-2012

Group model selection is the problem of determining a small subset of groups of predictors (e.g., the expression data of genes) that are responsible for majority of the variation in a response variable (e.g., the malignancy of a tumor). This paper focuses on group model selection in high-dimensional linear models, in which the number of predictors far exceeds the number of samples of the response variable. Existing works on high-dimensional group model selection either require the number of samples of the response variable to be significantly larger than the total number of predictors contributing to the response or impose restrictive statistical priors on the predictors and/or nonzero regression coefficients. This paper provides comprehensive understanding of a low-complexity approach to group model selection that avoids some of these limitations. The proposed approach, termed Group Thresholding (GroTh), is based on thresholding of marginal correlations of groups of predictors with the response variable and is reminiscent of existing thresholding-based approaches in the literature. The most important contribution of the paper in this regard is relating the performance of GroTh to a polynomial-time verifiable property of the predictors for the general case of arbitrary (random or deterministic) predictors and arbitrary nonzero regression coefficients.

artificial intelligence, machine learning, predictor, (13 more...)

arXiv.org Machine Learning

doi: 10.1109/Allerton.2012.6483259

1210.244

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Add feedback

Feature Subset Selection for Software Cost Modelling and Estimation

Papatheocharous, Efi, Papadopoulos, Harris, Andreou, Andreas S.

arXiv.org Artificial IntelligenceOct-3-2012

Feature selection has been recently used in the area of software engineering for improving the accuracy and robustness of software cost models. The idea behind selecting the most informative subset of features from a pool of available cost drivers stems from the hypothesis that reducing the dimensionality of datasets will significantly minimise the complexity and time required to reach to an estimation using a particular modelling technique. This work investigates the appropriateness of attributes, obtained from empirical project databases and aims to reduce the cost drivers used while preserving performance. Finding suitable subset selections that may cater improved predictions may be considered as a pre-processing step of a particular technique employed for cost estimation (filter or wrapper) or an internal (embedded) step to minimise the fitting error. This paper compares nine relatively popular feature selection methods and uses the empirical values of selected attributes recorded in the ISBSG and Desharnais datasets to estimate software development effort.

algorithm, dataset, estimation, (14 more...)

arXiv.org Artificial Intelligence

1210.1161

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
Europe > Middle East > Cyprus > Larnaka > Larnaca (0.04)
(19 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Smooth Sparse Coding via Marginal Regression for Learning Sparse Representations

Balasubramanian, Krishnakumar, Yu, Kai, Lebanon, Guy

arXiv.org Machine LearningOct-3-2012

We propose and analyze a novel framework for learning sparse representations, based on two statistical techniques: kernel smoothing and marginal regression. The proposed approach provides a flexible framework for incorporating feature similarity or temporal information present in data sets, via non-parametric kernel smoothing. We provide generalization bounds for dictionary learning using smooth sparse coding and show how the sample complexity depends on the L1 norm of kernel function used. Furthermore, we propose using marginal regression for obtaining sparse codes, which significantly improves the speed and allows one to scale to large dictionary sizes easily. We demonstrate the advantages of the proposed approach, both in terms of accuracy and speed by extensive experimentation on several real data sets. In addition, we demonstrate how the proposed approach could be used for improving semi-supervised sparse coding.

artificial intelligence, machine learning, sparse, (18 more...)

arXiv.org Machine Learning

1210.1121

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Robust Parametric Classification and Variable Selection by a Minimum Distance Criterion

Chi, Eric C., Scott, David W.

arXiv.org Machine LearningSep-29-2012

We investigate a robust penalized logistic regression algorithm based on a minimum distance criterion. Influential outliers are often associated with the explosion of parameter vector estimates, but in the context of standard logistic regression, the bias due to outliers always causes the parameter vector to implode, that is shrink towards the zero vector. Thus, using LASSO-like penalties to perform variable selection in the presence of outliers can result in missed detections of relevant covariates. We show that by choosing a minimum distance criterion together with an Elastic Net penalty, we can simultaneously find a parsimonious model and avoid estimation implosion even in the presence of many outliers in the important small $n$ large $p$ situation. Implementation using an MM algorithm is described and performance evaluated.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1080/10618600.2012.737296

1109.609

Country:

North America > United States > Texas (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.70)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Partial Gaussian Graphical Model Estimation

Yuan, Xiao-Tong, Zhang, Tong

arXiv.org Machine LearningSep-28-2012

For such Gaussian graphical models (GGMs), it is usually assumed that a given variable can bepredicted by a small numberof other variables. This assumption implies that the precision matrix is sparse. Therefore estimating Gaussian graphical model can be reduced to the problem of estimating a sparse precision matrix. One approach to sparse precision matrix estimation is covariance selection or neighborhood selection (Dempster, 1972; Meinshausen & Bühlmann, 2006), which tries to estimate each row (or column) of the precision matrix by predicting the corresponding variable using a sparse linear combination of other variables. An alternative formulation is maximum-likelihood estimation method that directly estimate the full precision matrix.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1209.6419

Country:

Europe (1.00)
North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping

Kim, Seyoung, Xing, Eric P.

arXiv.org Machine LearningSep-28-2012

We consider the problem of estimating a sparse multi-response regression function, with an application to expression quantitative trait locus (eQTL) mapping, where the goal is to discover genetic variations that influence gene-expression levels. In particular, we investigate a shrinkage technique capable of capturing a given hierarchical structure over the responses, such as a hierarchical clustering tree with leaf nodes for responses and internal nodes for clusters of related responses at multiple granularity, and we seek to leverage this structure to recover covariates relevant to each hierarchically-defined cluster of responses. We propose a tree-guided group lasso, or tree lasso, for estimating such structured sparsity under multi-response regression by employing a novel penalty function constructed from the tree. We describe a systematic weighting scheme for the overlapping groups in the tree-penalty such that each regression coefficient is penalized in a balanced manner despite the inhomogeneous multiplicity of group memberships of the regression coefficients due to overlaps among groups. For efficient optimization, we employ a smoothing proximal gradient method that was originally developed for a general class of structured-sparsity-inducing penalties. Using simulated and yeast data sets, we demonstrate that our method shows a superior performance in terms of both prediction errors and recovery of true sparsity patterns, compared to other methods for learning a multivariate-response regression.

artificial intelligence, machine learning, regression coefficient, (15 more...)

arXiv.org Machine Learning

doi: 10.1214/12-AOAS549

0909.1373

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Add feedback

High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity

Loh, Po-Ling, Wainwright, Martin J.

arXiv.org Machine LearningSep-25-2012

Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently nonconvex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing nonconvex programs, we are able to both analyze the statistical error associated with any global optimum, and more surprisingly, to prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers. On the statistical side, we provide nonasymptotic bounds that hold with high probability for the cases of noisy, missing and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm is guaranteed to converge at a geometric rate to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing close agreement with the predicted scalings.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1214/12-AOS1018

1109.3714

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Scaling Multidimensional Inference for Structured Gaussian Processes

Gilboa, Elad, Saatçi, Yunus, Cunningham, John P.

arXiv.org Machine LearningSep-21-2012

Exact Gaussian Process (GP) regression has O(N^3) runtime for data size N, making it intractable for large N. Many algorithms for improving GP scaling approximate the covariance with lower rank matrices. Other work has exploited structure inherent in particular covariance functions, including GPs with implied Markov structure, and equispaced inputs (both enable O(N) runtime). However, these GP advances have not been extended to the multidimensional input setting, despite the preponderance of multidimensional applications. This paper introduces and tests novel extensions of structured GPs to multidimensional inputs. We present new methods for additive GPs, showing a novel connection between the classic backfitting method and the Bayesian framework. To achieve optimal accuracy-complexity tradeoff, we extend this model with a novel variant of projection pursuit regression. Our primary result -- projection pursuit Gaussian Process Regression -- shows orders of magnitude speedup while preserving high accuracy. The natural second and third steps include non-Gaussian observations and higher dimensional equispaced grid methods. We introduce novel techniques to address both of these necessary directions. We thoroughly illustrate the power of these three advances on several datasets, achieving close performance to the naive Full GP at orders of magnitude less cost.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

1209.412

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback