tmr
NAPER: Fault Protection for Real-Time Resource-Constrained Deep Neural Networks
Rajagede, Rian Adam, Santriaji, Muhammad Husni, Fikriansyah, Muhammad Arya, Nuha, Hilal Hudan, Fu, Yanjie, Solihin, Yan
Fault tolerance in Deep Neural Networks (DNNs) deployed on resource-constrained systems presents unique challenges for high-accuracy applications with strict timing requirements. Memory bit-flips can severely degrade DNN accuracy, while traditional protection approaches like Triple Modular Redundancy (TMR) often sacrifice accuracy to maintain reliability, creating a three-way dilemma between reliability, accuracy, and timeliness. We introduce NAPER, a novel protection approach that addresses this challenge through ensemble learning. Unlike conventional redundancy methods, NAPER employs heterogeneous model redundancy, where diverse models collectively achieve higher accuracy than any individual model. This is complemented by an efficient fault detection mechanism and a real-time scheduler that prioritizes meeting deadlines by intelligently scheduling recovery operations without interrupting inference. Our evaluations demonstrate NAPER's superiority: 40% faster inference in both normal and fault conditions, maintained accuracy 4.2% higher than TMR-based strategies, and guaranteed uninterrupted operation even during fault recovery. NAPER effectively balances the competing demands of accuracy, reliability, and timeliness in real-time DNN applications. Fault tolerance in real-time systems with limited computational resources, or resource-constrained systems, presents significant integration challenges. These systems have finite computational capabilities that cannot be easily expanded to accommodate redundancy and recovery without substantial trade-offs.
Few-Shot Pattern Detection via Template Matching and Regression
Jo, Eunchan, Kang, Dahyun, Kim, Sanghyun, Choi, Yunseon, Cho, Minsu
We address the problem of few-shot pattern detection, which aims to detect all instances of a given pattern, typically represented by a few exemplars, from an input image. Although similar problems have been studied in few-shot object counting and detection (FSCD), previous methods and their benchmarks have narrowed patterns of interest to object categories and often fail to localize non-object patterns. In this work, we propose a simple yet effective detector based on template matching and regression, dubbed TMR. While previous FSCD methods typically represent target exemplars as spatially collapsed prototypes and lose structural information, we revisit classic template matching and regression. It effectively preserves and leverages the spatial layout of exemplars through a minimalistic structure with a small number of learnable convolutional or projection layers on top of a frozen backbone. We also introduce a new dataset, dubbed RPINE, which covers a wider range of patterns than existing object-centric datasets. Our method outperforms the state-of-the-art methods on the three benchmarks, RPINE, FSCD-147, and FSCD-LVIS, and demonstrates strong generalization in cross-dataset evaluation.
Efficient Triple Modular Redundancy for Reliability Enhancement of DNNs Using Explainable AI
Soroush, Kimia, Shirazi, Nastaran, Raji, Mohsen
Deep Neural Networks (DNNs) are widely employed in safety-critical domains, where ensuring their reliability is essential. Triple Modular Redundancy (TMR) is an effective technique to enhance the reliability of DNNs in the presence of bit-flip faults. To handle the significant overhead of TMR, it is applied selectively to the parameters and components that contribute most to the model output. Hence, the accuracy of the selection criterion plays a key role in the efficiency of TMR. This paper presents an efficient TMR approach to enhance the reliability of DNNs against bit-flip faults using an Explainable Artificial Intelligence (XAI) method. Since XAI can provide valuable insights into the importance of individual neurons and weights for the performance of the network, it can be applied as the selection metric in TMR techniques. The proposed method utilizes a low-cost, gradient-based XAI technique known as Layer-wise Relevance Propagation (LRP) to calculate importance scores for DNN parameters. These scores are then used to enhance the reliability of the model, with the most critical weights being protected by TMR. The proposed approach is evaluated on two DNN models, VGG16 and AlexNet, using datasets such as MNIST and CIFAR-10. The results demonstrate that the method can protect the AlexNet model at a bit error rate of $10^{-4}$, achieving over 60% reliability improvement while maintaining the same overhead as state-of-the-art methods.
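The idea of protecting only the most important weights with TMR can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-bit weight values, the importance scores (stand-ins for LRP output), and the helper names `majority_vote`, `protect`, and `read_weight` are all assumptions made for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit takes the value held by at least two copies."""
    return (a & b) | (a & c) | (b & c)


def protect(weights, scores, budget):
    """Triplicate the `budget` highest-scoring weights; return {index: [copy, copy, copy]}."""
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    return {i: [weights[i]] * 3 for i in ranked[:budget]}


def read_weight(i, weights, replicas):
    """Read weight i, voting over its replicas when it is protected."""
    if i in replicas:
        a, b, c = replicas[i]
        return majority_vote(a, b, c)
    return weights[i]


# Hypothetical 8-bit quantized weights and importance scores (assumed, not from the paper).
weights = [0x3A, 0x7F, 0x05, 0xC1]
scores = [0.9, 0.1, 0.4, 0.8]
replicas = protect(weights, scores, budget=2)  # protects indices 0 and 3

# Inject one bit-flip into a replica of weight 0 and into the unprotected weight 1.
replicas[0][1] ^= 0x40
weights[1] ^= 0x40

recovered = read_weight(0, weights, replicas)  # the vote masks the single flip
corrupted = read_weight(1, weights, replicas)  # unprotected, so the flip persists
```

In this sketch a single flipped copy is outvoted by the two intact ones, while the unprotected weight keeps its corrupted value, which is why the quality of the selection criterion matters.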
Controlling Difficulty of Generated Text for AI-Assisted Language Learning
Jin, Meiqing, Dugan, Liam, Callison-Burch, Chris
Practicing conversations with large language models (LLMs) presents a promising alternative to traditional in-person language learning. However, most LLMs generate text at a near-native level of complexity, making them ill-suited for beginner learners (CEFR: A1-A2). In this paper, we investigate whether controllable generation techniques -- specifically modular methods that do not require model fine-tuning -- can adapt LLM outputs to better support absolute beginners. We evaluate these methods through both automatic metrics and a user study with university-level learners of Japanese. Our findings show that while prompting alone fails to control output difficulty, the use of future discriminators (Yang and Klein, 2021) significantly improves output comprehensibility (from 40.4% to 84.3%). We further introduce a novel token-level evaluation metric, Token Miss Rate (TMR), that quantifies the proportion of incomprehensible tokens per utterance and correlates strongly with human judgments. To support future research in AI-assisted language learning, we release our code, models, annotation tools, and dataset.
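As a rough illustration of the Token Miss Rate described above, the following sketch computes the share of tokens in one utterance that a learner cannot understand. The abstract does not say how a token is judged incomprehensible, so the membership test against a `known_vocab` set, the function name `token_miss_rate`, and the example tokens are assumptions.

```python
def token_miss_rate(tokens, known_vocab):
    """Fraction of tokens in one utterance that are not in the learner's known vocabulary."""
    if not tokens:
        return 0.0  # an empty utterance has no missed tokens
    missed = sum(1 for token in tokens if token not in known_vocab)
    return missed / len(tokens)


# Hypothetical pre-tokenized Japanese utterance and learner vocabulary.
utterance = ["私", "は", "学生", "です"]
known_vocab = {"私", "は", "です"}

rate = token_miss_rate(utterance, known_vocab)  # one of four tokens is unknown
```

A lower value means the utterance is more accessible to the learner; averaging over utterances would give a corpus-level figure.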
Deep Privacy Funnel Model: From a Discriminative to a Generative Approach with an Application to Face Recognition
Razeghi, Behrooz, Rahimi, Parsa, Marcel, Sébastien
In this study, we apply the information-theoretic Privacy Funnel (PF) model to the domain of face recognition, developing a novel method for privacy-preserving representation learning within an end-to-end training framework. Our approach addresses the trade-off between obfuscation and utility in data protection, quantified through logarithmic loss, also known as self-information loss. This research provides a foundational exploration into the integration of information-theoretic privacy principles with representation learning, focusing specifically on the face recognition systems. We particularly highlight the adaptability of our framework with recent advancements in face recognition networks, such as AdaFace and ArcFace. In addition, we introduce the Generative Privacy Funnel ($\mathsf{GenPF}$) model, a paradigm that extends beyond the traditional scope of the PF model, referred to as the Discriminative Privacy Funnel ($\mathsf{DisPF}$). This $\mathsf{GenPF}$ model brings new perspectives on data generation methods with estimation-theoretic and information-theoretic privacy guarantees. Complementing these developments, we also present the deep variational PF (DVPF) model. This model proposes a tractable variational bound for measuring information leakage, enhancing the understanding of privacy preservation challenges in deep representation learning. The DVPF model, associated with both $\mathsf{DisPF}$ and $\mathsf{GenPF}$ models, sheds light on connections with various generative models such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Diffusion models. Complementing our theoretical contributions, we release a reproducible PyTorch package, facilitating further exploration and application of these privacy-preserving methodologies in face recognition systems.
AdAM: Adaptive Fault-Tolerant Approximate Multiplier for Edge DNN Accelerators
Taheri, Mahdi, Cherezova, Natalia, Nazari, Samira, Rafiq, Ahsan, Azarpeyvand, Ali, Ghasempouri, Tara, Daneshtalab, Masoud, Raik, Jaan, Jenihhin, Maksim
The role of Deep Neural Networks (DNNs) in a wide range of safety- and mission-critical applications (e.g., autonomous driving) is expanding. Therefore, deployment of a DNN accelerator requires addressing the trade-off between different design parameters and reliability [1] [2]. The remainder of the paper is organized as follows. Section II summarizes related works, the proposed method is presented in Section III, Section IV provides the experimental setup and discusses the results, and finally, the work is concluded in Section V.
Do as I can, not as I get: Topology-aware multi-hop reasoning on multi-modal knowledge graphs
Zheng, Shangfei, Yin, Hongzhi, Chen, Tong, Nguyen, Quoc Viet Hung, Chen, Wei, Zhao, Lei
A multi-modal knowledge graph (MKG) includes triplets that consist of entities and relations, together with multi-modal auxiliary data. In recent years, multi-hop multi-modal knowledge graph reasoning (MMKGR) based on reinforcement learning (RL) has received extensive attention because it addresses the intrinsic incompleteness of MKGs in an interpretable manner. However, its performance is limited by empirically designed rewards and sparse relations. In addition, this method has been designed for the transductive setting, where test entities have been seen during training, and it works poorly in the inductive setting, where test entities do not appear in the training set. To overcome these issues, we propose TMR (Topology-aware Multi-hop Reasoning), which can conduct MKG reasoning under inductive and transductive settings. Specifically, TMR mainly consists of two components. (1) The topology-aware inductive representation captures information from the directed relations of unseen entities, and aggregates query-related topology features in an attentive manner to generate fine-grained entity-independent features. (2) After completing multi-modal feature fusion, the relation-augment adaptive RL conducts multi-hop reasoning by eliminating manual rewards and dynamically adding actions. Finally, we construct new MKG datasets with different scales for inductive reasoning evaluation. Experimental results demonstrate that TMR outperforms state-of-the-art MKGR methods under both inductive and transductive settings.
Language Generation for Broad-Coverage, Explainable Cognitive Systems
This paper describes recent progress on natural language generation (NLG) for language-endowed intelligent agents (LEIAs) developed within the OntoAgent cognitive architecture. The approach draws heavily from past work on natural language understanding in this paradigm: it uses the same knowledge bases, theory of computational linguistics, agent architecture, and methodology of developing broad-coverage capabilities over time while still supporting near-term applications.
R2F: A Remote Retraining Framework for AIoT Processors with Computing Errors
Xu, Dawen, He, Meng, Liu, Cheng, Wang, Ying, Cheng, Long, Li, Huawei, Li, Xiaowei, Cheng, Kwang-Ting
AIoT processors fabricated with newer technology nodes suffer rising soft errors due to shrinking transistor sizes and lower power supply. Soft errors on AIoT processors, particularly deep learning accelerators (DLAs) with massive computing, may cause substantial computing errors. These computing errors are difficult to capture with conventional training on general-purpose processors such as CPUs and GPUs in a server. Applying offline-trained neural network models directly to edge accelerators with errors may lead to considerable prediction accuracy loss. To address the problem, we propose a remote retraining framework (R2F) for remote AIoT processors with computing errors. It places the remote AIoT processor with soft errors in the training loop, such that the on-site computing errors can be learned with the application data on the server and the retrained models can be resilient to the soft errors. Meanwhile, we propose an optimized partial TMR strategy to enhance the retraining. According to our experiments, R2F enables elastic design trade-offs between model accuracy and performance penalty. The top-5 model accuracy can be improved by 1.93%-13.73% with a 0%-200% performance penalty at a high fault error rate. In addition, we notice that the retraining requires massive data transmission, which can even dominate the training time, and we propose a sparse increment compression approach for data transmission optimization, which reduces the retraining time by 38%-88% on average with negligible accuracy loss compared with straightforward remote retraining.
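The abstract does not detail the sparse increment compression scheme. One plausible reading, sketched below purely as an assumption, is to transmit only the weight changes whose magnitude exceeds a threshold, as (index, delta) pairs, and to apply them on the receiving side. The names `compress_increment` and `apply_increment`, the threshold, and the example values are hypothetical.

```python
def compress_increment(old, new, threshold):
    """Keep only (index, delta) pairs whose absolute change exceeds `threshold`."""
    return [
        (i, n - o)
        for i, (o, n) in enumerate(zip(old, new))
        if abs(n - o) > threshold
    ]


def apply_increment(old, sparse):
    """Rebuild the updated weights on the receiver from the sparse increments."""
    updated = list(old)
    for i, delta in sparse:
        updated[i] += delta
    return updated


# Hypothetical weights before and after one retraining step.
old = [0.5, -0.25, 1.0, 0.0]
new = [0.5, -0.75, 1.0, 0.125]

sparse = compress_increment(old, new, threshold=0.25)  # only index 1 changes enough
restored = apply_increment(old, sparse)                # small change at index 3 is dropped
```

Dropping small increments is what trades a little accuracy for less transmitted data in this sketch; the threshold controls that trade-off.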
Robust Triple-Matrix-Recovery-Based Auto-Weighted Label Propagation for Classification
Zhang, Huan, Zhang, Zhao, Zhao, Mingbo, Ye, Qiaolin, Zhang, Min, Wang, Meng
The graph-based semi-supervised label propagation algorithm has delivered impressive classification results. However, the estimated soft labels typically contain mixed signs and noise, which cause inaccurate predictions due to the lack of suitable constraints. Moreover, available methods typically calculate the weights and estimate the labels in the original input space, which typically contains noise and corruption. Thus, the encoded similarities and manifold smoothness may be inaccurate for label estimation. In this paper, we present effective schemes for resolving these issues and propose a novel and robust semi-supervised classification algorithm, namely, the triple-matrix-recovery-based robust auto-weighted label propagation framework (ALP-TMR). Our ALP-TMR introduces a triple matrix recovery mechanism to remove noise or mixed signs from the estimated soft labels and improve the robustness to noise and outliers in the steps of assigning weights and predicting the labels simultaneously. Our method can jointly recover the underlying clean data, clean labels and clean weighting spaces by decomposing the original data, predicted soft labels or weights into a clean part plus an error part by fitting noise. In addition, ALP-TMR integrates the auto-weighting process by minimizing reconstruction errors over the recovered clean data and clean soft labels, which can encode the weights more accurately to improve both data representation and classification. By classifying samples in the recovered clean label and weight spaces, one can potentially improve the label prediction results. The results of extensive experiments demonstrated the satisfactory performance of our ALP-TMR.