 fault sensitivity


Adaptive Soft Error Protection for Deep Learning

arXiv.org Artificial Intelligence

The rising incidence of soft errors in hardware systems represents a considerable risk to the reliability of deep learning systems and can precipitate severe malfunctions. Although essential, soft error mitigation can impose substantial costs on deep learning systems that are inherently demanding in terms of computation and memory. Previous research has primarily explored variations in vulnerability among different components of computing engines or neural networks, aiming for selective protection to minimize protection overhead. Our approach diverges from these studies by recognizing that the susceptibility of deep learning tasks to soft errors is heavily input-dependent. Notably, some inputs are simpler for deep learning models and inherently exhibit greater tolerance to soft errors. Conversely, more complex inputs are prone to soft error impact. Based on these insights, we introduce an adaptive soft error protection strategy that tailors protection to the computational demands of individual inputs. To implement this strategy, we develop a metric for assessing the complexity of inputs and deploy a lightweight machine learning algorithm to gauge input difficulty. Subsequently, we employ robust protection for challenging inputs and minimal protection for simpler ones. Our experimental evaluation across diverse datasets and deep learning tasks reveals that our adaptive strategy reduces the soft error protection overhead by an average of 46.9%, without compromising system reliability.
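The dispatch logic described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the difficulty score (one minus the top-2 probability margin of a cheap estimator), the threshold, and the use of redundant execution with majority voting as the "robust" protection are all assumptions made here for the example, and every function name is hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple


def difficulty_from_probs(probs: Sequence[float]) -> float:
    """Hypothetical difficulty score: 1 minus the margin between the two
    largest class probabilities. A small margin means a 'hard' input."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def run_protected(model: Callable, x, runs: int = 3) -> Hashable:
    """Stand-in for strong protection: run the model redundantly and
    majority-vote. Outputs must be hashable (e.g. class labels)."""
    outputs = [model(x) for _ in range(runs)]
    return max(set(outputs), key=outputs.count)


def adaptive_infer(
    model: Callable,
    estimate_probs: Callable,
    x,
    threshold: float = 0.5,
) -> Tuple[Hashable, str]:
    """Pick the protection level per input from the estimated difficulty."""
    score = difficulty_from_probs(estimate_probs(x))
    if score >= threshold:
        # Hard input: pay for the expensive protection.
        return run_protected(model, x), "strong"
    # Easy input: single unprotected (or lightly protected) run.
    return model(x), "light"
```

In this sketch the overhead saving comes entirely from how many inputs fall below the threshold; the threshold value and the choice of estimator would have to be tuned against a reliability target.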


Operational Fault Tolerance of CMAC Networks

Neural Information Processing Systems

The performance sensitivity of Albus' CMAC network was studied for the scenario in which faults are introduced into the adjustable weights after training has been accomplished. It was found that fault sensitivity was reduced with increased generalization when "loss of weight" faults were considered, but sensitivity was increased for "saturated weight" faults.

1 INTRODUCTION

Fault-tolerance is often cited as an inherent property of neural networks, and is thought by many to be a natural consequence of "massively parallel" computational architectures. Numerous anecdotal reports of fault-tolerance experiments, primarily in pattern classification tasks, abound in the literature. However, there has been surprisingly little rigorous investigation of the fault-tolerance properties of various network architectures in other application areas. In this paper we investigate the fault-tolerance of the CMAC (Cerebellar Model Arithmetic Computer) network [Albus 1975] in a systematic manner. CMAC networks have attracted much recent attention because of their successful application in robotic manipulator control [Ersu 1984, Miller 1986, Lane 1988]. Since fault-tolerance is a key concern in critical control tasks, there is added impetus to study this aspect of CMAC performance.
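The kind of experiment described here, injecting faults into trained weights and measuring the effect on the output, might be sketched as below. This is a simplified one-dimensional CMAC written for illustration only: the class and function names are hypothetical, "generalization" is taken to mean the number of overlapping tilings `c` (so each input addresses `c` weights), and the fault models (weight forced to zero, weight forced to a fixed saturation value) are assumed readings of "loss of weight" and "saturated weight".

```python
import math


class CMAC1D:
    """Minimal 1-D CMAC sketch: c overlapping tilings over n_cells input
    cells; the output is the sum of the c weights addressed by the input."""

    def __init__(self, n_cells=64, c=8):
        self.n_cells, self.c = n_cells, c
        self.tiles = n_cells // c + 2  # enough tiles for every offset tiling
        self.w = [[0.0] * self.tiles for _ in range(c)]

    def active(self, x):
        """(tiling, tile index) pairs addressed by an input x in [0, 1)."""
        q = min(int(x * self.n_cells), self.n_cells - 1)
        return [(t, (q + t) // self.c) for t in range(self.c)]

    def predict(self, x):
        return sum(self.w[t][i] for t, i in self.active(x))

    def train(self, xs, ys, epochs=50, lr=0.5):
        """LMS training: spread each output error evenly over the c weights."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                err = y - self.predict(x)
                for t, i in self.active(x):
                    self.w[t][i] += lr * err / self.c


def inject_fault(net, tiling, index, kind, saturation=1.0):
    """Overwrite one trained weight with an assumed fault model."""
    if kind == "loss":
        net.w[tiling][index] = 0.0         # weight lost: reads as zero
    elif kind == "saturated":
        net.w[tiling][index] = saturation  # weight stuck at an extreme value
    else:
        raise ValueError(f"unknown fault kind: {kind}")


def rms_error(net, xs, ys):
    """Root-mean-square error of the network over a set of samples."""
    sq = sum((net.predict(x) - y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))
```

A typical use would train the network on samples of a target function, record `rms_error`, call `inject_fault` on a chosen weight, and compare the error afterwards across several values of `c`. In this sketch a single faulty weight affects `c` adjacent input cells, while each output sums `c` weights, which is one way to see why generalization can influence the two fault types differently.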