A Permutation Approach for Selecting the Penalty Parameter in Penalized Model Selection

Sabourin, Jeremy, Valdar, William, Nobel, Andrew

arXiv.org Machine Learning 

The analysis of high dimensional data, in which the number of measured predictors is large and can exceed the number of samples, is an important and common problem in statistical applications. When samples are accompanied by a real or categorical response, data analysis typically includes model fitting with the aim of prediction, variable selection, or both. The goal of prediction is to derive a rule capable of accurately predicting the response of a new, unlabeled sample. The goal of variable selection is to select a (small) subset of the measured predictors whose individual or coordinated activity is significantly related to the response. In both cases, it is common to assume that the observed data arise from an underlying model that is sparse, in the sense that only a small subset of the predictors is related to the response. Whether sparsity is assumed or simply viewed as a desirable feature of a model, analysis of high dimensional data is often carried out by penalized methods that produce models in which a relatively small subset of the available predictors is included. Popular penalized methods include the LASSO (Tibshirani, 1996), its numerous variations, and SCAD (Fan and Li, 2001). In what follows, we focus our attention on the LASSO. The LASSO and its variants require specification of a penalty/tuning parameter that controls the tradeoff between model fit and model size.
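The abstract does not spell out the procedure, but one natural way a permutation approach can calibrate the LASSO penalty is sketched below as an assumption, not as the paper's exact method. Permuting the response destroys any association with the predictors; for the objective (1/2n)||y - Xb||^2 + lambda*||b||_1, the smallest lambda that forces every coefficient to zero has the closed form max|X^T y|/n, so a reference distribution for lambda can be built without fitting any models. The helper name `permutation_lambda` and the quantile rule are illustrative choices.

```python
import numpy as np

def permutation_lambda(X, y, n_perm=100, quantile=0.5, rng=None):
    """Illustrative permutation-based choice of the LASSO penalty.

    For each permutation of y, compute the smallest lambda at which the
    LASSO solution is identically zero (max |X^T y_perm| / n for the
    objective (1/2n)||y - Xb||^2 + lambda*||b||_1), then return a
    quantile of that null distribution. This is a sketch, not the
    procedure defined in the paper.
    """
    rng = np.random.default_rng(rng)
    n = len(y)
    lams = np.empty(n_perm)
    for i in range(n_perm):
        y_perm = rng.permutation(y)          # break the X-y association
        lams[i] = np.max(np.abs(X.T @ y_perm)) / n
    return np.quantile(lams, quantile)

# Hypothetical usage on synthetic data:
gen = np.random.default_rng(0)
X = gen.standard_normal((100, 500))          # p = 500 predictors, n = 100
beta = np.zeros(500); beta[:5] = 2.0          # sparse truth: 5 active
y = X @ beta + gen.standard_normal(100)
lam = permutation_lambda(X, y, n_perm=200, rng=1)
```

Any lambda at or above the chosen quantile suppresses (that fraction of) the spurious selections seen under the permutation null, which is one way to trade model fit against model size without cross-validation.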
