deeppink
DeepPINK: reproducible feature selection in deep neural networks
Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
DeepPINK: reproducible feature selection in deep neural networks
Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada > Quebec > Montreal (0.04)
Reviews: DeepPINK: reproducible feature selection in deep neural networks
The paper proposes a method for feature selection in neural networks using a method for a controlled error rate, quantified through the False Discovery Rate. To control FDR the paper is using the model-X knockoffs framework [2, 3, 10]: construct random features that obey distributional properties with respect to the true features, and extract statistics (filters) of pairwise importance measures between true-knockoff dimensions. The choice of the importance function and the knockoff filters is flexible. The novelty of this paper lies in using a neural network (MLP) to get the importance measure through a linear layer that couples the true and knockoff features pairwise. The final statistic depends both on the trainable linear layer weights and the rest of the network weights.
Neural Gaussian Mirror for Controlled Feature Selection in Neural Networks
Xing, Xin, Gui, Yu, Dai, Chenguang, Liu, Jun S.
Deep neural networks (DNNs) have become increasingly popular and achieved outstanding performance in predictive tasks. However, the DNN framework itself cannot inform the user which features are more or less relevant for making the prediction, which limits its applicability in many scientific fields. We introduce neural Gaussian mirrors (NGMs), in which mirrored features are created, via a structured perturbation based on a kernel-based conditional dependence measure, to help evaluate feature importance. We design two modifications of the DNN architecture for incorporating mirrored features and providing mirror statistics to measure feature importance. As shown in simulated and real data examples, the proposed method controls the feature selection error rate at a predefined level and maintains a high selection power even with the presence of highly correlated features.
- North America > United States > Virginia (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
DeepPINK: reproducible feature selection in deep neural networks
Lu, Yang, Fan, Yingying, Lv, Jinchi, Noble, William Stafford
Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Washington > King County > Seattle (0.14)
- North America > Canada > Quebec > Montreal (0.04)
DeepPINK: reproducible feature selection in deep neural networks
Lu, Yang, Fan, Yingying, Lv, Jinchi, Noble, William Stafford
Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Washington > King County > Seattle (0.14)
- North America > Canada > Quebec > Montreal (0.04)
DeepPINK: reproducible feature selection in deep neural networks
Lu, Yang Young, Fan, Yingying, Lv, Jinchi, Noble, William Stafford
Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack of robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Washington > King County > Seattle (0.14)
- North America > Canada > Quebec > Montreal (0.04)
How a Kalman filter works, in pictures
I have to tell you about the Kalman filter, because what it does is pretty damn amazing. Surprisingly few software engineers and scientists seem to know about it, and that makes me sad because it is such a general and powerful tool for combining information in the presence of uncertainty. At times its ability to extract accurate information seems almost magical-- and if it sounds like I'm talking this up too much, then take a look at this previously posted video where I demonstrate a Kalman filter figuring out the orientation of a free-floating body by looking at its velocity. You can use a Kalman filter in any place where you have uncertain information about some dynamic system, and you can make an educated guess about what the system is going to do next. Even if messy reality comes along and interferes with the clean motion you guessed about, the Kalman filter will often do a very good job of figuring out what actually happened.