Accuracy and stability of solar variable selection comparison under complicated dependence structures

Xu, Ning

arXiv.org Machine Learning 

In this paper we focus on the variable-selection peformance of solar on the empirical data with complicated dependence structures and, hence, severe multicollinearity and grouping effect issues. We choose the prostate cancer data and the Sydney house price data and apply two lasso solvers, elastic net and solar on them (code can be found at \url{https://github.com/isaac2math/}). The results shows that (i) lasso is affected by the grouping effect and randomly drop variables with high correlations, resulting unreliable and uninterpretable results; (ii) elastic net is more robust to grouping effect; however, it completely lose variable-selection sparsity when the dependence structure of the data is complicated; (iii) solar demonstrates its superior robustness to complicated dependence structures and grouping effect, returning variable-selection results with better stability and sparsity. Also, such stability and sparsity make solar a reliable variable pre-estimation filter of a linear dependence structure esimation (linear probablistic graph learning). The linear probablistic graph estimated on the variable selected by solar returns an intuitive, sparse and stable dependence structure.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found