Bacciu, Davide
TEACHING -- Trustworthy autonomous cyber-physical applications through human-centred intelligence
Bacciu, Davide, Akarmazyan, Siranush, Armengaud, Eric, Bacco, Manlio, Bravos, George, Calandra, Calogero, Carlini, Emanuele, Carta, Antonio, Cassara, Pietro, Coppola, Massimo, Davalas, Charalampos, Dazzi, Patrizio, Degennaro, Maria Carmela, Di Sarli, Daniele, Dobaj, Jürgen, Gallicchio, Claudio, Girbal, Sylvain, Gotta, Alberto, Groppo, Riccardo, Lomonaco, Vincenzo, Macher, Georg, Mazzei, Daniele, Mencagli, Gabriele, Michail, Dimitrios, Micheli, Alessio, Peroglio, Roberta, Petroni, Salvatore, Potenza, Rosaria, Pourdanesh, Farank, Sardianos, Christos, Tserpes, Konstantinos, Tagliabò, Fulvio, Valtl, Jakob, Varlamis, Iraklis, Veledar, Omar
This paper discusses the perspective of the H2020 TEACHING project on the next generation of autonomous applications running in a distributed and highly heterogeneous environment comprising both virtual and physical resources spanning the edge-cloud continuum. TEACHING puts forward a human-centred vision that leverages the physiological, emotional, and cognitive state of the users as a driver for the adaptation and optimization of the autonomous applications. It does so by building a distributed, embedded and federated learning system complemented by methods and tools to enforce its dependability, security and privacy preservation. The paper discusses the main concepts of the TEACHING approach and singles out the main AI-related research challenges associated with it. Further, we discuss the design choices for the TEACHING system to tackle the aforementioned challenges.
Continual Learning with Echo State Networks
Cossu, Andrea, Bacciu, Davide, Carta, Antonio, Gallicchio, Claudio, Lomonaco, Vincenzo
Continual Learning (CL) refers to a learning setup where data is non-stationary and the model has to learn without forgetting existing knowledge. The study of CL for sequential patterns has so far revolved around trained recurrent networks. In this work, instead, we introduce CL in the context of Echo State Networks (ESNs), where the recurrent component is kept fixed. We provide the first evaluation of catastrophic forgetting in ESNs and highlight the benefits of using CL strategies that are not applicable to trained recurrent models. Our results confirm the ESN as a promising model for CL and open the way to its use in streaming scenarios.
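As a rough illustration of the setting (a minimal sketch, not the paper's code; all sizes and hyperparameters are assumptions), the ESN below keeps its recurrent reservoir fixed and random, so that under a CL strategy only the linear readout is ever updated:

```python
# Minimal Echo State Network sketch: fixed random reservoir, trained readout.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, n_out = 3, 100, 2

# Fixed, randomly initialized reservoir; spectral radius scaled below 1
# to encourage the echo state property (assumed hyperparameters).
W_in = rng.uniform(-0.1, 0.1, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

def reservoir_states(inputs):
    """Run the fixed reservoir over a sequence, returning all states."""
    h = np.zeros(n_res)
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W @ h)
        states.append(h)
    return np.array(states)

# Only the readout is trained (here, ridge regression on placeholder data);
# forgetting is therefore confined to this linear layer.
X = reservoir_states(rng.normal(size=(200, n_in)))
Y = rng.normal(size=(200, n_out))
ridge = 1e-3
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)
```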
Avalanche: an End-to-End Library for Continual Learning
Lomonaco, Vincenzo, Pellegrini, Lorenzo, Cossu, Andrea, Carta, Antonio, Graffieti, Gabriele, Hayes, Tyler L., De Lange, Matthias, Masana, Marc, Pomponi, Jary, van de Ven, Gido, Mundt, Martin, She, Qi, Cooper, Keiland, Forest, Jeremy, Belouadah, Eden, Calderara, Simone, Parisi, German I., Cuzzolin, Fabio, Tolias, Andreas, Scardapane, Simone, Antiga, Luca, Ahmad, Subutai, Popescu, Adrian, Kanan, Christopher, van de Weijer, Joost, Tuytelaars, Tinne, Bacciu, Davide, Maltoni, Davide
Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.
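For context, a typical Avalanche training loop looks roughly like the sketch below, based on the library's documented quickstart. Module paths have moved across releases (e.g. `avalanche.training.strategies` in older versions), so treat the imports as assumptions for a recent version:

```python
import torch
from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.supervised import Naive

# Five experiences, each introducing new MNIST classes.
benchmark = SplitMNIST(n_experiences=5)
model = SimpleMLP(num_classes=10)

# Naive fine-tuning baseline; swap in any other Avalanche strategy here.
strategy = Naive(
    model,
    torch.optim.SGD(model.parameters(), lr=0.001),
    torch.nn.CrossEntropyLoss(),
    train_mb_size=32, train_epochs=1, eval_mb_size=32,
)

# Train on each experience in turn, evaluating on the full test stream.
for experience in benchmark.train_stream:
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)
```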
Distilled Replay: Overcoming Forgetting through Synthetic Samples
Rosasco, Andrea, Carta, Antonio, Cossu, Andrea, Lomonaco, Vincenzo, Bacciu, Davide
Replay strategies are Continual Learning techniques that mitigate catastrophic forgetting by keeping a buffer of patterns from previous experiences, which are interleaved with new data during training. The number of patterns stored in the buffer is a critical parameter that largely influences both the final performance and the memory footprint of the approach. This work introduces Distilled Replay, a novel replay strategy for Continual Learning which is able to mitigate forgetting by keeping a very small buffer (down to one pattern per class) of highly informative samples. Distilled Replay builds the buffer through a distillation process which compresses a large dataset into a tiny set of informative examples. We show the effectiveness of Distilled Replay against naive replay, which randomly samples patterns from the dataset, on four popular Continual Learning benchmarks.
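The replay mechanics reduce to interleaving the buffer with every incoming batch, as in the sketch below (illustrative only: in Distilled Replay the buffer entries would be synthesized by dataset distillation rather than sampled; `model`, `optimizer` and the loaders are assumed to be provided):

```python
import torch

def train_with_replay(model, optimizer, loss_fn, new_loader, buffer_x, buffer_y):
    """buffer_x/buffer_y: a tiny buffer, e.g. one pattern per class."""
    model.train()
    for x, y in new_loader:
        # Interleave the replay buffer with the incoming batch.
        xb = torch.cat([x, buffer_x])
        yb = torch.cat([y, buffer_y])
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```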
Continual Learning for Recurrent Neural Networks: an Empirical Evaluation
Cossu, Andrea, Carta, Antonio, Lomonaco, Vincenzo, Bacciu, Davide
Learning continuously throughout a model's lifetime is fundamental to deploying machine learning solutions that are robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data is non-stationary, such as natural language processing and robotics. However, the existing body of work on the topic is still fragmented, with application-specific approaches whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications. We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in the class-incremental scenario, testing their ability to mitigate forgetting with a number of strategies that are not specific to sequential data processing. Our results highlight the key role played by sequence length and the importance of a clear specification of the CL scenario.
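As a concrete reference point, a class-incremental protocol over sequential data can be sketched as below (an assumed, simplified version of the setup: each experience introduces new classes and earlier ones are never revisited during training; `dataset` is any list of (sequence, label) pairs):

```python
def class_incremental_split(dataset, classes_per_exp):
    """Split (sequence, label) pairs into class-incremental experiences."""
    labels = sorted({y for _, y in dataset})
    groups = [labels[i:i + classes_per_exp]
              for i in range(0, len(labels), classes_per_exp)]
    return [[(x, y) for x, y in dataset if y in g] for g in groups]
```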
Catastrophic Forgetting in Deep Graph Networks: an Introductory Benchmark for Graph Classification
Carta, Antonio, Cossu, Andrea, Errica, Federico, Bacciu, Davide
In this work, we study the phenomenon of catastrophic forgetting in the graph representation learning scenario. The primary objective of the analysis is to understand whether classical continual learning techniques for flat and sequential data have a tangible impact on performances when applied to graph data. To do so, we experiment with a structure-agnostic model and a deep graph network in a robust and controlled environment on three different datasets. The benchmark is complemented by an investigation on the effect of structure-preserving regularization techniques on catastrophic forgetting.
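Forgetting in such benchmarks is usually quantified with the standard CL metric sketched below (our illustration, not code from the paper): the gap between the best accuracy ever achieved on a task and its accuracy at the end of the task stream:

```python
def forgetting(acc):
    """acc[t][i]: accuracy on task i measured after training on task t.

    Returns, for each task except the last, the drop between its best
    past accuracy and its accuracy at the end of the task stream.
    """
    T = len(acc)
    return [max(acc[t][i] for t in range(i, T - 1)) - acc[T - 1][i]
            for i in range(T - 1)]
```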
Graph Mixture Density Networks
Errica, Federico, Bacciu, Davide, Micheli, Alessio
We introduce the Graph Mixture Density Network, a new family of machine learning models that can fit multimodal output distributions conditioned on arbitrary input graphs. By combining ideas from mixture models and graph representation learning, we address a broad class of challenging regression problems that rely on structured data. Our main contribution is the design and evaluation of our method on large stochastic epidemic simulations conditioned on random graphs. We show that there is a significant improvement in the likelihood of an epidemic outcome when taking into account both multimodality and structure. In addition, we investigate how to implicitly retain structural information in node representations by computing the distance between distributions of adjacent nodes, testing the technique on two structure reconstruction tasks with very good accuracy. Graph Mixture Density Networks open appealing research opportunities in the study of structure-dependent phenomena that exhibit non-trivial conditional output distributions.
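The core idea can be sketched as a mixture-density head on top of any graph encoder (an illustrative sketch for scalar targets, not the paper's implementation; the encoder producing the graph embedding `h` is assumed):

```python
import torch
import torch.nn as nn

class MDNHead(nn.Module):
    """Predicts a conditional Gaussian mixture from a graph embedding."""
    def __init__(self, emb_dim, n_components):
        super().__init__()
        self.pi = nn.Linear(emb_dim, n_components)         # mixing weights
        self.mu = nn.Linear(emb_dim, n_components)         # component means
        self.log_sigma = nn.Linear(emb_dim, n_components)  # log std devs

    def nll(self, h, y):
        """Negative log-likelihood of scalar targets y under the mixture."""
        log_pi = torch.log_softmax(self.pi(h), dim=-1)
        mu, log_sigma = self.mu(h), self.log_sigma(h)
        comp = torch.distributions.Normal(mu, log_sigma.exp())
        log_prob = comp.log_prob(y.unsqueeze(-1)) + log_pi
        return -torch.logsumexp(log_prob, dim=-1).mean()
```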
Perplexity-free Parametric t-SNE
Crecchi, Francesco, de Bodt, Cyril, Verleysen, Michel, Lee, John A., Bacciu, Davide
The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is a ubiquitously employed dimensionality reduction (DR) method. Its non-parametric nature and impressive efficacy motivated its parametric extension. It is, however, bound to a user-defined perplexity parameter, which restricts its DR quality compared to recently developed multi-scale, perplexity-free approaches. This paper hence proposes a multi-scale parametric t-SNE scheme, relieved from perplexity tuning, with a deep neural network implementing the mapping. It produces reliable embeddings with out-of-sample extensions, competitive with the best perplexity adjustments in terms of neighborhood preservation on multiple data sets.
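A compact parametric t-SNE training objective is sketched below (assumed, single-scale form for brevity; the paper's contribution is precisely the multi-scale, perplexity-free construction of the input affinities P, which is omitted here). A neural network maps inputs to 2D and is trained to minimize KL(P || Q), with Q given by the Student-t kernel on embedding distances:

```python
import torch
import torch.nn as nn

def tsne_loss(P, Z, eps=1e-12):
    """KL(P || Q); P is a precomputed affinity matrix with zero diagonal."""
    d2 = torch.cdist(Z, Z).pow(2)
    num = 1.0 / (1.0 + d2)       # Student-t kernel in the embedding space
    num.fill_diagonal_(0.0)
    Q = num / num.sum()
    return (P * (torch.log(P + eps) - torch.log(Q + eps))).sum()

# The parametric mapping: a plain MLP with a 2D output (assumed sizes).
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 2))
# Training reduces to gradient descent on tsne_loss(P, encoder(X)),
# and out-of-sample points are embedded by a simple forward pass.
```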
ROS-Neuro Integration of Deep Convolutional Autoencoders for EEG Signal Compression in Real-time BCIs
Valenti, Andrea, Barsotti, Michele, Brondi, Raffaello, Bacciu, Davide, Ascari, Luca
Typical EEG-based BCI applications require the computation of complex functions over noisy EEG channels to be carried out in an efficient way. Deep learning algorithms are capable of learning flexible nonlinear functions directly from data, and their constant processing latency makes them well suited for deployment into online BCI systems. However, it is crucial for the jitter of the processing system to be as low as possible, in order to avoid the unpredictable behaviour that can ruin the system's overall usability. In this paper, we present a novel encoding method, based on deep convolutional autoencoders, that is able to perform efficient compression of the raw EEG inputs. We deploy our model in a ROS-Neuro node, thus making it suitable for integration into ROS-based BCI and robotic systems in real-world scenarios. The experimental results show that our system is capable of generating meaningful compressed encodings that preserve the original information contained in the raw input. They also show that the ROS-Neuro node is able to produce such encodings at a steady rate, with minimal jitter. We believe that our system can represent an important step towards the development of an effective BCI processing pipeline fully standardized within the ROS-Neuro framework.
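A minimal 1D convolutional autoencoder for multi-channel EEG windows is sketched below (an illustrative sketch with assumed shapes, batch x channels x time with an even time length, not the deployed model):

```python
import torch.nn as nn

class EEGAutoencoder(nn.Module):
    def __init__(self, n_channels=32):
        super().__init__()
        # Two strided convolutions give a 4x temporal compression.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(16, 8, kernel_size=5, stride=2, padding=2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(8, 16, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, n_channels, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)     # compressed encoding published by the node
        return self.decoder(z)  # reconstruction, used only for training
```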
Tensor Decompositions in Recursive Neural Networks for Tree-Structured Data
Castellana, Daniele, Bacciu, Davide
The paper introduces two new aggregation functions to encode structural knowledge from tree-structured data. They leverage the Canonical and Tensor-Train decompositions to yield expressive context aggregation while limiting the number of model parameters. Finally, we define two novel neural recursive models for trees leveraging such aggregation functions, and we test them on two tree classification tasks, showing the advantage of the proposed models as tree outdegree increases.
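To give a flavour of the idea, the sketch below shows a canonical (CP) decomposed aggregation over a node's children (our assumed formulation, not the paper's exact model): instead of a full aggregation tensor whose size grows exponentially with the outdegree, each child state is projected to a rank-R space and the projections are combined by an elementwise product:

```python
import torch
import torch.nn as nn

class CPAggregation(nn.Module):
    def __init__(self, hidden, outdegree, rank):
        super().__init__()
        # One projection per child position (mode of the aggregation tensor).
        self.U = nn.ModuleList(nn.Linear(hidden, rank) for _ in range(outdegree))
        self.out = nn.Linear(rank, hidden)

    def forward(self, children):
        """children: list of (batch, hidden) states, one per child position."""
        z = torch.ones_like(self.U[0](children[0]))
        for proj, h in zip(self.U, children):
            z = z * proj(h)      # elementwise product across modes
        return torch.tanh(self.out(z))
```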