reproducibility
DeepPINK: reproducible feature selection in deep neural networks
Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black-box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs), existing methods are susceptible to noise and lack robustness. Therefore, scientists are justifiably cautious about the reproducibility of the discoveries, which is often related to the interpretability of the underlying statistical models. In this paper, we describe a method to increase the interpretability and reproducibility of DNNs by incorporating the idea of feature selection with controlled error rate. By designing a new DNN architecture and integrating it with the recently proposed knockoffs framework, we perform feature selection with a controlled error rate, while maintaining high power. This new method, DeepPINK (Deep feature selection using Paired-Input Nonlinear Knockoffs), is applied to both simulated and real data sets to demonstrate its empirical utility.
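For readers unfamiliar with the knockoffs machinery, the selection step itself is compact. The sketch below is a generic knockoff+ filter in Python, assuming that importance scores Z for the original features and Z_tilde for their knockoff copies have already been obtained from a fitted model (in DeepPINK they would come from the paired-input DNN); it illustrates the error-rate-controlled selection rule, not the DeepPINK architecture itself, and the toy scores at the end are hypothetical.

```python
import numpy as np

def knockoff_select(Z, Z_tilde, q=0.1):
    """Knockoff+ filter: select features whose statistic W_j = |Z_j| - |Z_tilde_j|
    exceeds the data-dependent threshold that controls the FDR at level q."""
    W = np.abs(Z) - np.abs(Z_tilde)
    # Candidate thresholds: the nonzero magnitudes of the W statistics.
    for t in np.sort(np.abs(W[W != 0])):
        # Estimated false discovery proportion at threshold t.
        fdp = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp <= q:
            return np.where(W >= t)[0]   # indices of selected features
    return np.array([], dtype=int)       # no threshold achieves the target FDR

# Toy usage with synthetic (hypothetical) importance scores: the first 10
# features are "true" signals, the remaining 90 are nulls.
rng = np.random.default_rng(0)
Z = np.concatenate([rng.normal(3.0, 1.0, 10), rng.normal(0.0, 1.0, 90)])
Z_tilde = rng.normal(0.0, 1.0, 100)
print(knockoff_select(Z, Z_tilde, q=0.1))
```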
- North America > Canada > Ontario (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine (0.94)
- Information Technology (0.67)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Leisure & Entertainment > Games > Chess (1.00)
- Education (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.64)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > California > Orange County > Irvine (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Information Technology (1.00)
- Health & Medicine (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Causes and Effects of Unanticipated Numerical Deviations in Neural Network Inference Frameworks
Hardware-specific optimizations in machine learning (ML) frameworks can cause numerical deviations of inference results. Quite surprisingly, despite using a fixed trained model and fixed input data, inference results are not consistent across platforms, and sometimes not even deterministic on the same platform. We study the causes of these numerical deviations for convolutional neural networks (CNN) on realistic end-to-end inference pipelines and in isolated experiments. Results from 75 distinct platforms suggest that the main causes of deviations are differences in SIMD use on CPUs and the runtime selection of convolution algorithms on GPUs. We link the causes and propagation effects to properties of the ML model and evaluate potential mitigations. We make our research code publicly available.
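The root cause highlighted above, changes in floating-point accumulation order introduced by SIMD vectorization or by switching convolution algorithms, can be reproduced outside any ML framework. Below is a minimal, self-contained sketch (not the paper's published research code) that sums the same float32 array in three different orders; because floating-point addition is not associative, the three results typically agree only to a few significant digits.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal(100_000).astype(np.float32)

# (1) Scalar, strictly left-to-right accumulation.
sequential = np.float32(0.0)
for v in x:
    sequential += v

# (2) NumPy's built-in reduction (uses pairwise summation internally).
builtin = x.sum(dtype=np.float32)

# (3) An 8-lane, SIMD-like accumulation: sum each lane, then combine the lanes.
lanes = x.reshape(-1, 8).sum(axis=0, dtype=np.float32)
blocked = lanes.sum(dtype=np.float32)

print(sequential, builtin, blocked)  # usually three slightly different values
```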
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Austria > Tyrol > Innsbruck (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Zürich > Zürich (0.15)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Maryland > Baltimore County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)