- North America > United States > Texas > Brazos County > College Station (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
A Appendix
This is simple to see, as the ranks in the uneven depthwise case are computed per input while the merging is done per output. The proposed RED method is summarized in Algorithm 1.

Strategy            % removed parameters
linear descending   77.90
constant            78.69
linear ascending    80.35
block               84.52

The constant strategy provides the best results. Following the study from Section 5.2, we empirically validate that RED is robust to dropout.
RED: Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks
Deep Neural Networks (DNNs) are ubiquitous in today's computer vision landscape, despite involving considerable computational costs. The mainstream approaches for runtime acceleration consist in pruning connections (unstructured pruning) or, better, entire filters (structured pruning), both of which often require data to retrain the model.
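The unstructured/structured distinction can be sketched in a few lines of NumPy. This is a toy illustration using magnitude-based criteria; RED itself relies on a data-free redundancy analysis rather than weight magnitudes, and all names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy conv weight tensor: (out_filters, in_channels, kh, kw).
w = rng.standard_normal((8, 4, 3, 3))

# Unstructured pruning: zero out the smallest-magnitude individual weights.
# The tensor keeps its shape, so dense hardware sees no speedup.
threshold = np.quantile(np.abs(w), 0.5)
w_unstructured = np.where(np.abs(w) >= threshold, w, 0.0)

# Structured pruning: drop entire filters (here, the two with smallest L1 norm).
# The tensor actually shrinks, which is what buys runtime acceleration.
filter_norms = np.abs(w).reshape(8, -1).sum(axis=1)
keep = np.sort(np.argsort(filter_norms)[2:])
w_structured = w[keep]  # shape (6, 4, 3, 3): two filters gone for good
```

Note that only the structured variant changes the tensor's shape; the unstructured one merely sparsifies it, which is why the abstract calls structured pruning "better" for runtime acceleration.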
Optimizing Canaries for Privacy Auditing with Metagradient Descent
Boglioni, Matteo, Liu, Terrance, Ilyas, Andrew, Wu, Zhiwei Steven
In this work we study black-box privacy auditing, where the goal is to lower bound the privacy parameter of a differentially private learning algorithm using only the algorithm's outputs (i.e., the final trained model). For DP-SGD (the most successful method for training differentially private deep learning models), the canonical approach to auditing uses membership inference: an auditor comes up with a small set of special "canary" examples, inserts a random subset of them into the training set, and then tries to discern which of the canaries were included in the training set (typically via a membership inference attack). The auditor's success rate then provides a lower bound on the privacy parameters of the learning algorithm. Our main contribution is a method for optimizing the auditor's canary set to improve privacy auditing, leveraging recent work on metagradient optimization. Our empirical evaluation demonstrates that by using such optimized canaries, we can improve empirical lower bounds for differentially private image classification models by over 2x in certain instances. Furthermore, we demonstrate that our method is transferable and efficient: canaries optimized for non-private SGD with a small model architecture remain effective when auditing larger models trained with DP-SGD.
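The last step of the auditing recipe — turning the auditor's success rate into a lower bound on the privacy parameter — can be sketched as follows. This is the generic conversion implied by the (ε, δ)-DP definition (any distinguisher's rates must satisfy TPR ≤ e^ε · FPR + δ), not the paper's metagradient method, and the function name is hypothetical:

```python
import math

def epsilon_lower_bound(tpr, fpr, delta=0.0):
    """Convert a membership-inference auditor's true/false positive rates
    into an empirical lower bound on epsilon.

    (epsilon, delta)-DP implies TPR <= exp(epsilon) * FPR + delta for any
    attacker, so epsilon >= ln((TPR - delta) / FPR) whenever that is positive.
    """
    if fpr <= 0 or tpr <= delta:
        return 0.0  # the attack gives no nontrivial bound
    return max(0.0, math.log((tpr - delta) / fpr))

# Example: the auditor flags 80% of inserted canaries while falsely
# flagging 10% of held-out ones, with delta = 1e-5.
eps_hat = epsilon_lower_bound(tpr=0.8, fpr=0.1, delta=1e-5)
```

A better canary set raises the attack's TPR at a given FPR, which is exactly how optimized canaries tighten the resulting lower bound.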
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han
Overfitting is one of the most critical challenges in deep neural networks, and various regularization methods have been proposed to improve generalization performance. Injecting noise into hidden units during training, e.g., dropout, is known to be a successful regularizer, but it is still not entirely clear why such training techniques work well in practice, or how to maximize their benefit in the presence of two conflicting objectives: fitting the true data distribution and preventing overfitting through regularization. This paper addresses these issues by 1) interpreting conventional training with regularization by noise injection as optimizing a lower bound of the true objective, and 2) proposing a technique to achieve a tighter lower bound using multiple noise samples per training example in each stochastic gradient descent iteration. We demonstrate the effectiveness of our idea in several computer vision applications.
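The multi-sample idea can be sketched with a toy softmax model under dropout. The key point is Jensen's inequality: taking the negative log of the *average* likelihood over k noise samples gives a tighter (smaller) upper bound on the negative log marginal likelihood than averaging the per-sample negative logs. The model and names below are illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_likelihoods(weights, x, y, k=5, p_drop=0.5):
    """Class-y likelihoods of a toy linear-softmax model under k dropout masks."""
    liks = []
    for _ in range(k):
        mask = (rng.random(x.shape) > p_drop) / (1.0 - p_drop)  # inverted dropout
        logits = weights @ (x * mask)
        probs = np.exp(logits - logits.max())  # stabilized softmax numerator
        liks.append(probs[y] / probs.sum())
    return np.array(liks)

# Toy example: 3 classes, 4 input features.
weights = rng.standard_normal((3, 4))
x = rng.standard_normal(4)
y = 1

liks = dropout_likelihoods(weights, x, y, k=10)
loose = np.mean(-np.log(liks))   # conventional objective: average per-sample loss
tight = -np.log(np.mean(liks))   # multi-sample objective: log of the average
# By Jensen's inequality, tight <= loose.
```

With k = 1 the two objectives coincide; as k grows, the "tight" objective approaches the true negative log marginal likelihood over the noise distribution, which is the bound-tightening effect the abstract describes.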
EXACT: How to Train Your Accuracy
Karpukhin, Ivan, Dereka, Stanislav, Kolesnikov, Sergey
Classification tasks are usually evaluated in terms of accuracy. However, accuracy is discontinuous and cannot be directly optimized using gradient ascent. Popular methods minimize cross-entropy, hinge loss, or other surrogate losses, which can lead to suboptimal results. In this paper, we propose a new optimization framework that introduces stochasticity into a model's output and optimizes expected accuracy, i.e., the accuracy of the stochastic model. Extensive experiments on linear models and deep image classification show that the proposed optimization method is a powerful alternative to widely used classification losses.
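The core idea — that a stochastic output turns discontinuous accuracy into a smooth expectation — can be illustrated with a Monte-Carlo estimate. This sketch only estimates the value of the objective (the paper derives gradients for optimization); the noise model and names are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_accuracy(scores, labels, sigma=1.0, n_samples=2000):
    """Monte-Carlo estimate of the accuracy of a stochastic model whose
    output is scores + Gaussian noise.

    Plain accuracy jumps discontinuously as scores cross each other; the
    expectation over noise varies smoothly with the scores, making it
    amenable to gradient-based maximization.
    """
    n, c = scores.shape
    hits = 0
    for _ in range(n_samples):
        noisy = scores + sigma * rng.standard_normal((n, c))
        hits += np.sum(noisy.argmax(axis=1) == labels)
    return hits / (n * n_samples)

# Two well-separated examples: the stochastic model is almost always right.
scores = np.array([[5.0, 0.0], [0.0, 5.0]])
labels = np.array([0, 1])
acc = expected_accuracy(scores, labels, sigma=0.5)
```

Shrinking sigma recovers ordinary (hard) accuracy in the limit, while larger sigma smooths the objective more at the cost of blurring decisions — the usual smoothing trade-off.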
- North America > United States > Wisconsin (0.05)
- North America > United States > New York (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)