Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation
Zhou, Chuteng, Kadambi, Prad, Mattina, Matthew, Whatmough, Paul N.
arXiv.org Artificial Intelligence
ABSTRACT

The success of deep learning has brought forth a wave of interest in computer hardware design to better meet the high demands of neural network inference. In particular, analog computing hardware based on electronic, optical, or photonic devices has been strongly motivated as a way to accelerate neural networks, and it may well achieve lower power consumption than conventional digital electronics. However, these proposed analog accelerators suffer from the intrinsic noise generated by their physical components, which makes it challenging to achieve high accuracy on deep neural networks. Hence, for successful deployment on analog accelerators, it is essential to be able to train deep neural networks to be robust to random continuous noise in the network weights, which is a somewhat new challenge in machine learning. In this paper, we advance the understanding of noisy neural networks. We outline how a noisy neural network has reduced learning capacity as a result of the loss of mutual information between its input and output. To combat this, we propose using knowledge distillation combined with noise injection during training to achieve more noise-robust networks, which is demonstrated experimentally across different networks and datasets, including ImageNet. Our method achieves models with as much as 2× greater noise tolerance compared with the previous best attempts, which is a significant step towards making analog hardware practical for deep learning.

However, DNN inference is typically very demanding in terms of compute and memory resources (Li et al., 2019). Consequently, larger models are often not well suited for large-scale deployment on edge devices, which typically have meagre performance and power budgets, especially battery-powered mobile and IoT devices. To address these issues, the design of specialized hardware for DNN inference has drawn great interest, and is an extremely active area of research (Whatmough et al., 2019).
To date, a plethora of techniques have been proposed for designing efficient neural network hardware (Sze et al., 2017; Whatmough et al., 2019).
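The training recipe named in the abstract, noise injection into the weights combined with knowledge distillation, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: the additive Gaussian noise model with standard deviation `eta` times the weight range, the single linear layer, the temperature `T`, the mixing weight `alpha`, and the helper names `noisy_weights` and `distillation_loss`.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def noisy_weights(w, eta, rng):
    """Return a noisy copy of w.

    Assumption: zero-mean Gaussian noise with std = eta * (max(w) - min(w)).
    """
    sigma = eta * (w.max() - w.min())
    return w + rng.normal(0.0, sigma, size=w.shape)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted sum of KL(teacher || student) on softened outputs and
    cross-entropy against the hard labels (hypothetical T and alpha)."""
    eps = 1e-12
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=-1).mean()
    p = softmax(student_logits)
    ce = -np.log(p[np.arange(len(labels)), labels] + eps).mean()
    return alpha * T**2 * kl + (1.0 - alpha) * ce

# Toy data: 8 samples, 16 features, 4 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
labels = rng.integers(0, 4, size=8)

w_teacher = rng.normal(size=(16, 4))  # stands in for a clean, pretrained teacher
w_student = w_teacher.copy()          # student starts from the teacher's weights

teacher_logits = x @ w_teacher                                    # noise-free teacher
student_logits = x @ noisy_weights(w_student, eta=0.05, rng=rng)  # noisy student
loss = distillation_loss(student_logits, teacher_logits, labels)
print(f"distillation loss under weight noise: {loss:.4f}")
```

In an actual training loop, fresh noise would be sampled on every forward pass and the loss minimized with respect to the clean student weights; the sketch only shows how a single loss value is formed.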
Jan-14-2020
- Country:
- North America > United States
- Massachusetts > Suffolk County
- Boston (0.04)
- Florida > Broward County
- Deerfield Beach (0.04)
- California > Santa Clara County
- Stanford (0.04)
- Arizona > Maricopa County
- Tempe (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology (0.48)
- Energy (0.48)