Bacciu, Davide
TEACHING -- Trustworthy autonomous cyber-physical applications through human-centred intelligence
Bacciu, Davide, Akarmazyan, Siranush, Armengaud, Eric, Bacco, Manlio, Bravos, George, Calandra, Calogero, Carlini, Emanuele, Carta, Antonio, Cassara, Pietro, Coppola, Massimo, Davalas, Charalampos, Dazzi, Patrizio, Degennaro, Maria Carmela, Di Sarli, Daniele, Dobaj, Jürgen, Gallicchio, Claudio, Girbal, Sylvain, Gotta, Alberto, Groppo, Riccardo, Lomonaco, Vincenzo, Macher, Georg, Mazzei, Daniele, Mencagli, Gabriele, Michail, Dimitrios, Micheli, Alessio, Peroglio, Roberta, Petroni, Salvatore, Potenza, Rosaria, Pourdanesh, Farank, Sardianos, Christos, Tserpes, Konstantinos, Tagliabò, Fulvio, Valtl, Jakob, Varlamis, Iraklis, Veledar, Omar
This paper discusses the perspective of the H2020 TEACHING project on the next generation of autonomous applications running in a distributed and highly heterogeneous environment comprising both virtual and physical resources spanning the edge-cloud continuum. TEACHING puts forward a human-centred vision that leverages the physiological, emotional, and cognitive state of the users as a driver for the adaptation and optimization of the autonomous applications. It does so by building a distributed, embedded and federated learning system complemented by methods and tools to enforce its dependability, security and privacy preservation. The paper discusses the main concepts of the TEACHING approach and singles out the main AI-related research challenges associated with it. Further, we discuss the design choices for the TEACHING system to tackle the aforementioned challenges.
Continual Learning with Echo State Networks
Cossu, Andrea, Bacciu, Davide, Carta, Antonio, Gallicchio, Claudio, Lomonaco, Vincenzo
Continual Learning (CL) refers to a learning setup where data is non-stationary and the model has to learn without forgetting existing knowledge. The study of CL for sequential patterns has so far revolved around trained recurrent networks. In this work, instead, we introduce CL in the context of Echo State Networks (ESNs), where the recurrent component is kept fixed. We provide the first evaluation of catastrophic forgetting in ESNs and highlight the benefits of using CL strategies that are not applicable to trained recurrent models. Our results confirm the ESN as a promising model for CL and open the way to its use in streaming scenarios.
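As a rough illustration of the setting (a minimal sketch, not the paper's code; all sizes and hyperparameters are assumptions), the ESN below keeps its recurrent reservoir fixed and random, so that under a CL strategy only the linear readout is ever updated:

```python
# Minimal Echo State Network sketch: fixed random reservoir, trained readout.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, n_out = 3, 100, 2

# Fixed, randomly initialized reservoir; spectral radius scaled below 1
# to encourage the echo state property (assumed hyperparameters).
W_in = rng.uniform(-0.1, 0.1, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

def reservoir_states(inputs):
    """Run the fixed reservoir over a sequence, returning all states."""
    h = np.zeros(n_res)
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W @ h)
        states.append(h)
    return np.array(states)

# Only the readout is trained (here, ridge regression on placeholder data);
# forgetting is therefore confined to this linear layer.
X = reservoir_states(rng.normal(size=(200, n_in)))
Y = rng.normal(size=(200, n_out))
ridge = 1e-3
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)
```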
Avalanche: an End-to-End Library for Continual Learning
Lomonaco, Vincenzo, Pellegrini, Lorenzo, Cossu, Andrea, Carta, Antonio, Graffieti, Gabriele, Hayes, Tyler L., De Lange, Matthias, Masana, Marc, Pomponi, Jary, van de Ven, Gido, Mundt, Martin, She, Qi, Cooper, Keiland, Forest, Jeremy, Belouadah, Eden, Calderara, Simone, Parisi, German I., Cuzzolin, Fabio, Tolias, Andreas, Scardapane, Simone, Antiga, Luca, Ahmad, Subutai, Popescu, Adrian, Kanan, Christopher, van de Weijer, Joost, Tuytelaars, Tinne, Bacciu, Davide, Maltoni, Davide
Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation of continual learning algorithms.
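For context, a typical Avalanche training loop looks roughly like the sketch below, based on the library's documented quickstart. Module paths have moved across releases (e.g. `avalanche.training.strategies` in older versions), so treat the imports as assumptions for a recent version:

```python
import torch
from avalanche.benchmarks.classic import SplitMNIST
from avalanche.models import SimpleMLP
from avalanche.training.supervised import Naive

# Five experiences, each introducing new MNIST classes.
benchmark = SplitMNIST(n_experiences=5)
model = SimpleMLP(num_classes=10)

# Naive fine-tuning baseline; swap in any other Avalanche strategy here.
strategy = Naive(
    model,
    torch.optim.SGD(model.parameters(), lr=0.001),
    torch.nn.CrossEntropyLoss(),
    train_mb_size=32, train_epochs=1, eval_mb_size=32,
)

# Train on each experience in turn, evaluating on the full test stream.
for experience in benchmark.train_stream:
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)
```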
Distilled Replay: Overcoming Forgetting through Synthetic Samples
Rosasco, Andrea, Carta, Antonio, Cossu, Andrea, Lomonaco, Vincenzo, Bacciu, Davide
Replay strategies are Continual Learning techniques that mitigate catastrophic forgetting by keeping a buffer of patterns from previous experiences, which are interleaved with new data during training. The number of patterns stored in the buffer is a critical parameter that largely influences both the final performance and the memory footprint of the approach. This work introduces Distilled Replay, a novel replay strategy for Continual Learning which is able to mitigate forgetting by keeping a very small buffer (down to one pattern per class) of highly informative samples. Distilled Replay builds the buffer through a distillation process which compresses a large dataset into a tiny set of informative examples. We show the effectiveness of Distilled Replay against naive replay, which randomly samples patterns from the dataset, on four popular Continual Learning benchmarks.
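The replay mechanics reduce to interleaving the buffer with every incoming batch, as in the sketch below (illustrative only: in Distilled Replay the buffer entries would be synthesized by dataset distillation rather than sampled; `model`, `optimizer` and the loaders are assumed to be provided):

```python
import torch

def train_with_replay(model, optimizer, loss_fn, new_loader, buffer_x, buffer_y):
    """buffer_x/buffer_y: a tiny buffer, e.g. one pattern per class."""
    model.train()
    for x, y in new_loader:
        # Interleave the replay buffer with the incoming batch.
        xb = torch.cat([x, buffer_x])
        yb = torch.cat([y, buffer_y])
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```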
Continual Learning for Recurrent Neural Networks: an Empirical Evaluation
Cossu, Andrea, Carta, Antonio, Lomonaco, Vincenzo, Bacciu, Davide
Learning continuously throughout a model's lifetime is fundamental to deploying machine learning solutions that are robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data is non-stationary, such as natural language processing and robotics. However, the existing body of work on the topic is still fragmented, with application-specific approaches whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications. We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in the class-incremental scenario, testing their ability to mitigate forgetting with a number of strategies that are not specific to sequential data processing. Our results highlight the key role played by sequence length and the importance of a clear specification of the CL scenario.
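As a concrete reference point, a class-incremental protocol over sequential data can be sketched as below (an assumed, simplified version of the setup: each experience introduces new classes and earlier ones are never revisited during training; `dataset` is any list of (sequence, label) pairs):

```python
def class_incremental_split(dataset, classes_per_exp):
    """Split (sequence, label) pairs into class-incremental experiences."""
    labels = sorted({y for _, y in dataset})
    groups = [labels[i:i + classes_per_exp]
              for i in range(0, len(labels), classes_per_exp)]
    return [[(x, y) for x, y in dataset if y in g] for g in groups]
```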
Catastrophic Forgetting in Deep Graph Networks: an Introductory Benchmark for Graph Classification
Carta, Antonio, Cossu, Andrea, Errica, Federico, Bacciu, Davide
In this work, we study the phenomenon of catastrophic forgetting in the graph representation learning scenario. The primary objective of the analysis is to understand whether classical continual learning techniques for flat and sequential data have a tangible impact on performances when applied to graph data. To do so, we experiment with a structure-agnostic model and a deep graph network in a robust and controlled environment on three different datasets. The benchmark is complemented by an investigation on the effect of structure-preserving regularization techniques on catastrophic forgetting.
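Forgetting in such benchmarks is usually quantified with the standard CL metric sketched below (our illustration, not code from the paper): the gap between the best accuracy ever achieved on a task and its accuracy at the end of the task stream:

```python
def forgetting(acc):
    """acc[t][i]: accuracy on task i measured after training on task t.

    Returns, for each task except the last, the drop between its best
    past accuracy and its accuracy at the end of the task stream.
    """
    T = len(acc)
    return [max(acc[t][i] for t in range(i, T - 1)) - acc[T - 1][i]
            for i in range(T - 1)]
```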
Graph Mixture Density Networks
Errica, Federico, Bacciu, Davide, Micheli, Alessio
We introduce the Graph Mixture Density Network, a new family of machine learning models that can fit multimodal output distributions conditioned on arbitrary input graphs. By combining ideas from mixture models and graph representation learning, we address a broad class of challenging regression problems that rely on structured data. Our main contribution is the design and evaluation of our method on large stochastic epidemic simulations conditioned on random graphs. We show that there is a significant improvement in the likelihood of an epidemic outcome when taking into account both multimodality and structure. In addition, we investigate how to implicitly retain structural information in node representations by computing the distance between distributions of adjacent nodes, testing the technique on two structure reconstruction tasks with very good accuracy. Graph Mixture Density Networks open appealing research opportunities in the study of structure-dependent phenomena that exhibit non-trivial conditional output distributions.
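The core idea can be sketched as a mixture-density head on top of any graph encoder (an illustrative sketch for scalar targets, not the paper's implementation; the encoder producing the graph embedding `h` is assumed):

```python
import torch
import torch.nn as nn

class MDNHead(nn.Module):
    """Predicts a conditional Gaussian mixture from a graph embedding."""
    def __init__(self, emb_dim, n_components):
        super().__init__()
        self.pi = nn.Linear(emb_dim, n_components)         # mixing weights
        self.mu = nn.Linear(emb_dim, n_components)         # component means
        self.log_sigma = nn.Linear(emb_dim, n_components)  # log std devs

    def nll(self, h, y):
        """Negative log-likelihood of scalar targets y under the mixture."""
        log_pi = torch.log_softmax(self.pi(h), dim=-1)
        mu, log_sigma = self.mu(h), self.log_sigma(h)
        comp = torch.distributions.Normal(mu, log_sigma.exp())
        log_prob = comp.log_prob(y.unsqueeze(-1)) + log_pi
        return -torch.logsumexp(log_prob, dim=-1).mean()
```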
Perplexity-free Parametric t-SNE
Crecchi, Francesco, de Bodt, Cyril, Verleysen, Michel, Lee, John A., Bacciu, Davide
The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is a ubiquitously employed dimensionality reduction (DR) method. Its non-parametric nature and impressive efficacy motivated its parametric extension. It is, however, bound to a user-defined perplexity parameter, which restricts its DR quality compared to recently developed multi-scale, perplexity-free approaches. This paper hence proposes a multi-scale parametric t-SNE scheme, relieved from perplexity tuning, with a deep neural network implementing the mapping. It produces reliable embeddings with out-of-sample extensions, competitive with the best perplexity adjustments in terms of neighborhood preservation on multiple data sets.
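A compact parametric t-SNE training objective is sketched below (assumed, single-scale form for brevity; the paper's contribution is precisely the multi-scale, perplexity-free construction of the input affinities P, which is omitted here). A neural network maps inputs to 2D and is trained to minimize KL(P || Q), with Q given by the Student-t kernel on embedding distances:

```python
import torch
import torch.nn as nn

def tsne_loss(P, Z, eps=1e-12):
    """KL(P || Q); P is a precomputed affinity matrix with zero diagonal."""
    d2 = torch.cdist(Z, Z).pow(2)
    num = 1.0 / (1.0 + d2)       # Student-t kernel in the embedding space
    num.fill_diagonal_(0.0)
    Q = num / num.sum()
    return (P * (torch.log(P + eps) - torch.log(Q + eps))).sum()

# The parametric mapping: a plain MLP with a 2D output (assumed sizes).
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 2))
# Training reduces to gradient descent on tsne_loss(P, encoder(X)),
# and out-of-sample points are embedded by a simple forward pass.
```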
ROS-Neuro Integration of Deep Convolutional Autoencoders for EEG Signal Compression in Real-time BCIs
Valenti, Andrea, Barsotti, Michele, Brondi, Raffaello, Bacciu, Davide, Ascari, Luca
Typical EEG-based BCI applications require the computation of complex functions over noisy EEG channels to be carried out in an efficient way. Deep learning algorithms are capable of learning flexible nonlinear functions directly from data, and their constant processing latency makes them well suited for deployment into online BCI systems. However, it is crucial for the jitter of the processing system to be as low as possible, in order to avoid the unpredictable behaviour that can ruin the system's overall usability. In this paper, we present a novel encoding method, based on deep convolutional autoencoders, that is able to perform efficient compression of the raw EEG inputs. We deploy our model in a ROS-Neuro node, thus making it suitable for integration into ROS-based BCI and robotic systems in real-world scenarios. The experimental results show that our system is capable of generating meaningful compressed encodings that preserve the original information contained in the raw input. They also show that the ROS-Neuro node is able to produce such encodings at a steady rate, with minimal jitter. We believe that our system can represent an important step towards the development of an effective BCI processing pipeline fully standardized within the ROS-Neuro framework.
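A minimal 1D convolutional autoencoder for multi-channel EEG windows is sketched below (an illustrative sketch with assumed shapes, batch x channels x time with an even time length, not the deployed model):

```python
import torch.nn as nn

class EEGAutoencoder(nn.Module):
    def __init__(self, n_channels=32):
        super().__init__()
        # Two strided convolutions give a 4x temporal compression.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(16, 8, kernel_size=5, stride=2, padding=2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(8, 16, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, n_channels, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)     # compressed encoding published by the node
        return self.decoder(z)  # reconstruction, used only for training
```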
Tensor Decompositions in Recursive Neural Networks for Tree-Structured Data
Castellana, Daniele, Bacciu, Davide
The paper introduces two new aggregation functions to encode structural knowledge from tree-structured data. They leverage the Canonical and Tensor-Train decompositions to yield expressive context aggregation while limiting the number of model parameters. Finally, we define two novel neural recursive models for trees leveraging such aggregation functions, and we test them on two tree classification tasks, showing the advantage of the proposed models as tree outdegree increases.
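To give a flavour of the idea, the sketch below shows a canonical (CP) decomposed aggregation over a node's children (our assumed formulation, not the paper's exact model): instead of a full aggregation tensor whose size grows exponentially with the outdegree, each child state is projected to a rank-R space and the projections are combined by an elementwise product:

```python
import torch
import torch.nn as nn

class CPAggregation(nn.Module):
    def __init__(self, hidden, outdegree, rank):
        super().__init__()
        # One projection per child position (mode of the aggregation tensor).
        self.U = nn.ModuleList(nn.Linear(hidden, rank) for _ in range(outdegree))
        self.out = nn.Linear(rank, hidden)

    def forward(self, children):
        """children: list of (batch, hidden) states, one per child position."""
        z = torch.ones_like(self.U[0](children[0]))
        for proj, h in zip(self.U, children):
            z = z * proj(h)      # elementwise product across modes
        return torch.tanh(self.out(z))
```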