Deep supervised feature selection using Stochastic Gates

Yamada, Yutaro, Lindenbaum, Ofir, Negahban, Sahand, Kluger, Yuval

arXiv.org Machine Learning 

In this study, we propose a novel non-parametric embedded feature selection method based on minimizing the $\ell_0$ norm of a vector of indicator variables, whose point-wise product with the input selects a subset of features. Our approach relies on a continuous relaxation of the Bernoulli distribution, which allows our model to learn the parameters of the approximate Bernoulli distributions via tractable methods. Using these tools, we present a general neural network that minimizes a loss function while simultaneously selecting relevant features. We also provide an information-theoretic justification for incorporating the Bernoulli distribution into our approach. Finally, we demonstrate the potential of the approach on synthetic and real-life applications.
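
As a rough illustration of the idea described above, the following Python sketch gates each input feature with a relaxed Bernoulli variable and penalizes the expected number of open gates. The abstract does not specify the relaxation, so the clipped-Gaussian gate, the linear model, the squared-error loss, and all names (`sample_gates`, `expected_open_gates`, `gated_objective`, `mu`, `sigma`, `lam`) are assumptions made for this example rather than the authors' stated implementation.

```python
import math
import numpy as np


def sample_gates(mu, sigma, rng):
    """Draw relaxed Bernoulli gates in [0, 1] (assumed form: clipped Gaussian)."""
    eps = rng.normal(0.0, sigma, size=mu.shape)
    return np.clip(mu + eps, 0.0, 1.0)


def expected_open_gates(mu, sigma):
    """Sum over features of P(mu_d + eps_d > 0), a smooth stand-in for the l0 norm."""
    return sum(0.5 * (1.0 + math.erf(m / (sigma * math.sqrt(2.0)))) for m in mu)


def gated_objective(x, y, w, mu, sigma, lam, rng):
    """Squared-error loss of a linear model on gated inputs, plus the gate penalty."""
    z = sample_gates(mu, sigma, rng)      # one gate per feature
    pred = (x * z) @ w                    # point-wise product selects features
    loss = float(np.mean((pred - y) ** 2))
    return loss + lam * expected_open_gates(mu, sigma)
```

In a full model, `mu` would be trained jointly with the network weights by gradient descent, and features whose gates settle at zero would be treated as unselected. Larger values of the hypothetical weight `lam` would push more gates toward zero.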
