Pezze, Davide Dalle
From Vision to Sound: Advancing Audio Anomaly Detection with Vision-Based Algorithms
Barusco, Manuel, Borsatti, Francesco, Pezze, Davide Dalle, Paissan, Francesco, Farella, Elisabetta, Susto, Gian Antonio
A BSTRACT Recent advances in Visual Anomaly Detection (V AD) have introduced sophisticated algorithms leveraging embeddings generated by pre-trained feature extractors. Inspired by these developments, we investigate the adaptation of such algorithms to the audio domain to address the problem of Audio Anomaly Detection (AAD). Unlike most existing AAD methods, which primarily classify anomalous samples, our approach introduces fine-grained temporal-frequency localization of anomalies within the spectrogram, significantly improving explainability. This capability enables a more precise understanding of where and when anomalies occur, making the results more actionable for end users. We evaluate our approach on industrial and environmental benchmarks, demonstrating the effectiveness of V AD techniques in detecting anomalies in audio signals. Moreover, they improve explainability by enabling localized anomaly identification, making audio anomaly detection systems more interpretable and practical.Keywords Anomaly Detection Audio Pre-training Embedding 1 Introduction Audio anomaly detection (AAD) is the task of detecting unexpected or out-of-distribution sounds in audio sequences.
Continual Learning for Behavior-based Driver Identification
Fanan, Mattia, Pezze, Davide Dalle, Efatinasab, Emad, Carli, Ruggero, Rampazzo, Mirco, Susto, Gian Antonio
Behavior-based Driver Identification is an emerging technology that recognizes drivers based on their unique driving behaviors, offering important applications such as vehicle theft prevention and personalized driving experiences. However, most studies fail to account for the real-world challenges of deploying Deep Learning models within vehicles. These challenges include operating under limited computational resources, adapting to new drivers, and changes in driving behavior over time. The objective of this study is to evaluate if Continual Learning (CL) is well-suited to address these challenges, as it enables models to retain previously learned knowledge while continually adapting with minimal computational overhead and resource requirements. We tested several CL techniques across three scenarios of increasing complexity based on the well-known OCSLab dataset. This work provides an important step forward in scalable driver identification solutions, demonstrating that CL approaches, such as DER, can obtain strong performance, with only an 11% reduction in accuracy compared to the static scenario. Furthermore, to enhance the performance, we propose two new methods, SmooER and SmooDER, that leverage the temporal continuity of driver identity over time to enhance classification accuracy. Our novel method, SmooDER, achieves optimal results with only a 2% reduction compared to the 11\% of the DER approach. In conclusion, this study proves the feasibility of CL approaches to address the challenges of Driver Identification in dynamic environments, making them suitable for deployment on cloud infrastructure or directly within vehicles.
PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge
Barusco, Manuel, Borsatti, Francesco, Pezze, Davide Dalle, Paissan, Francesco, Farella, Elisabetta, Susto, Gian Antonio
Visual Anomaly Detection (VAD) has gained significant research attention for its ability to identify anomalous images and pinpoint the specific areas responsible for the anomaly. A key advantage of VAD is its unsupervised nature, which eliminates the need for costly and time-consuming labeled data collection. However, despite its potential for real-world applications, the literature has given limited focus to resource-efficient VAD, particularly for deployment on edge devices. This work addresses this gap by leveraging lightweight neural networks to reduce memory and computation requirements, enabling VAD deployment on resource-constrained edge devices. We benchmark the major VAD algorithms within this framework and demonstrate the feasibility of edge-based VAD using the well-known MVTec dataset. Furthermore, we introduce a novel algorithm, Partially Shared Teacher-student (PaSTe), designed to address the high resource demands of the existing Student Teacher Feature Pyramid Matching (STFPM) approach. Our results show that PaSTe decreases the inference time by 25%, while reducing the training time by 33% and peak RAM usage during training by 76%. These improvements make the VAD process significantly more efficient, laying a solid foundation for real-world deployment on edge devices.
Tiny Robotics Dataset and Benchmark for Continual Object Detection
Pasti, Francesco, De Monte, Riccardo, Pezze, Davide Dalle, Susto, Gian Antonio, Bellotto, Nicola
Detecting objects in mobile robotics is crucial for numerous applications, from autonomous navigation to inspection. However, robots are often required to perform tasks in different domains with respect to the training one and need to adapt to these changes. Tiny mobile robots, subject to size, power, and computational constraints, encounter even more difficulties in running and adapting these algorithms. Such adaptability, though, is crucial for real-world deployment, where robots must operate effectively in dynamic and unpredictable settings. In this work, we introduce a novel benchmark to evaluate the continual learning capabilities of object detection systems in tiny robotic platforms. Our contributions include: (i) Tiny Robotics Object Detection (TiROD), a comprehensive dataset collected using a small mobile robot, designed to test the adaptability of object detectors across various domains and classes; (ii) an evaluation of state-of-the-art real-time object detectors combined with different continual learning strategies on this dataset, providing detailed insights into their performance and limitations; and (iii) we publish the data and the code to replicate the results to foster continuous advancements in this field. Our benchmark results indicate key challenges that must be addressed to advance the development of robust and efficient object detection systems for tiny robotics.
Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark
Ceccon, Marina, Pezze, Davide Dalle, Fabris, Alessandro, Susto, Gian Antonio
Multi-label image classification in dynamic environments is a problem that poses significant challenges. Previous studies have primarily focused on scenarios such as Domain Incremental Learning and Class Incremental Learning, which do not fully capture the complexity of real-world applications. In this paper, we study the problem of classification of medical imaging in the scenario termed New Instances and New Classes, which combines the challenges of both new class arrivals and domain shifts in a single framework. Unlike traditional scenarios, it reflects the realistic nature of CL in domains such as medical imaging, where updates may introduce both new classes and changes in domain characteristics. To address the unique challenges posed by this complex scenario, we introduce a novel approach called Pseudo-Label Replay. This method aims to mitigate forgetting while adapting to new classes and domain shifts by combining the advantages of the Replay and Pseudo-Label methods and solving their limitations in the proposed scenario. We evaluate our proposed approach on a challenging benchmark consisting of two datasets, seven tasks, and nineteen classes, modeling a realistic Continual Learning scenario. Our experimental findings demonstrate the effectiveness of Pseudo-Label Replay in addressing the challenges posed by the complex scenario proposed. Our method surpasses existing approaches, exhibiting superior performance while showing minimal forgetting.
Fairness Evolution in Continual Learning for Medical Imaging
Ceccon, Marina, Pezze, Davide Dalle, Fabris, Alessandro, Susto, Gian Antonio
Deep Learning (DL) has made significant strides in various medical applications in recent years, achieving remarkable results. In the field of medical imaging, DL models can assist doctors in disease diagnosis by classifying pathologies in Chest X-ray images. However, training on new data to expand model capabilities and adapt to distribution shifts is a notable challenge these models face. Continual Learning (CL) has emerged as a solution to this challenge, enabling models to adapt to new data while retaining knowledge gained from previous experiences. Previous studies have analyzed the behavior of CL strategies in medical imaging regarding classification performance. However, when considering models that interact with sensitive information, such as in the medical domain, it is imperative to disaggregate the performance of socially salient groups. Indeed, DL algorithms can exhibit biases against certain sub-populations, leading to discrepancies in predictive performance across different groups identified by sensitive attributes such as age, race/ethnicity, sex/gender, and socioeconomic status. In this study, we go beyond the typical assessment of classification performance in CL and study bias evolution over successive tasks with domain-specific fairness metrics. Specifically, we evaluate the CL strategies using the well-known CheXpert (CXP) and ChestX-ray14 (NIH) datasets. We consider a class incremental scenario of five tasks with 12 pathologies. We evaluate the Replay, Learning without Forgetting (LwF), LwF Replay, and Pseudo-Label strategies. LwF and Pseudo-Label exhibit optimal classification performance, but when including fairness metrics in the evaluation, it is clear that Pseudo-Label is less biased. For this reason, this strategy should be preferred when considering real-world scenarios in which it is crucial to consider the fairness of the model.
Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent
Della Libera, Luca, Andreoli, Jacopo, Pezze, Davide Dalle, Ravanelli, Mirco, Susto, Gian Antonio
A crucial task in predictive maintenance is estimating the remaining useful life of physical systems. In the last decade, deep learning has improved considerably upon traditional model-based and statistical approaches in terms of predictive performance. However, in order to optimally plan maintenance operations, it is also important to quantify the uncertainty inherent to the predictions. This issue can be addressed by turning standard frequentist neural networks into Bayesian neural networks, which are naturally capable of providing confidence intervals around the estimates. Several methods exist for training those models. Researchers have focused mostly on parametric variational inference and sampling-based techniques, which notoriously suffer from limited approximation power and large computational burden, respectively. In this work, we use Stein variational gradient descent, a recently proposed algorithm for approximating intractable distributions that overcomes the drawbacks of the aforementioned techniques. In particular, we show through experimental studies on simulated run-to-failure turbofan engine degradation data that Bayesian deep learning models trained via Stein variational gradient descent consistently outperform with respect to convergence speed and predictive performance both the same models trained via parametric variational inference and their frequentist counterparts trained via backpropagation. Furthermore, we propose a method to enhance performance based on the uncertainty information provided by the Bayesian models. We release the source code at https://github.com/lucadellalib/bdl-rul-svgd.