AITopics

2109.09426

Country: Europe > Italy (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceMay-6-2021

Structured Ensembles: an Approach to Reduce the Memory Footprint of Ensemble Methods

Pomponi, Jary, Scardapane, Simone, Uncini, Aurelio

In this paper, we propose a novel ensembling technique for deep neural networks, which is able to drastically reduce the required memory compared to alternative approaches. In particular, we propose to extract multiple sub-networks from a single, untrained neural network by solving an end-to-end optimization task combining differentiable scaling over the original architecture, with multiple regularization terms favouring the diversity of the ensemble. Since our proposal aims to detect and extract sub-structures, we call it Structured Ensemble. On a large experimental evaluation, we show that our method can achieve higher or comparable accuracy to competing methods while requiring significantly less storage. In addition, we evaluate our ensembles in terms of predictive calibration and uncertainty, showing they compare favourably with the state-of-the-art. Finally, we draw a link with the continual learning literature, and we propose a modification of our framework to handle continuous streams of tasks with a sub-linear memory cost. We compare with a number of alternative strategies to mitigate catastrophic forgetting, highlighting advantages in terms of average accuracy and memory.

deep learning, ensemble, neural network, (15 more...)

2105.02551

Country: Europe > Italy (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education > Curriculum (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Machine LearningApr-29-2021

Biased Edge Dropout for Enhancing Fairness in Graph Representation Learning

Spinelli, Indro, Scardapane, Simone, Hussain, Amir, Uncini, Aurelio

Graph representation learning has become a ubiquitous component in many scenarios, ranging from social network analysis to energy forecasting in smart grids. In several applications, ensuring the fairness of the node (or graph) representations with respect to some protected attributes is crucial for their correct deployment. Yet, fairness in graph deep learning remains under-explored, with few solutions available. In particular, the tendency of similar nodes to cluster on several real-world graphs (i.e., homophily) can dramatically worsen the fairness of these procedures. In this paper, we propose a biased edge dropout algorithm (FairDrop) to counter-act homophily and improve fairness in graph representation learning. FairDrop can be plugged in easily on many existing algorithms, is efficient, adaptable, and can be combined with other fairness-inducing solutions. After describing the general algorithm, we demonstrate its application on two benchmark tasks, specifically, as a random walk model for producing node embeddings, and to a graph convolutional network for link prediction. We prove that the proposed algorithm can successfully improve the fairness of all models up to a small or negligible drop in accuracy, and compares favourably with existing state-of-the-art solutions. In an ablation study, we demonstrate that our algorithm can flexibly interpolate between biasing towards fairness and an unbiased edge dropout. Furthermore, to better evaluate the gains, we propose a new dyadic group definition to measure the bias of a link prediction task when paired with group-based fairness metrics. In particular, we extend the metric used to measure the bias in the node embeddings to take into account the graph structure.

deep learning, fairness, neural network, (20 more...)

2104.1421

Country:

Europe > Italy (0.46)
North America > United States > New York (0.14)

Genre: Research Report (0.84)

Industry:

Energy > Power Industry (0.48)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(4 more...)

arXiv.org Artificial IntelligenceApr-29-2021

L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing

Guizzo, Eric, Gramaccioni, Riccardo F., Jamili, Saeid, Marinoni, Christian, Massaro, Edoardo, Medaglia, Claudia, Nachira, Giuseppe, Nucciarelli, Leonardo, Paglialunga, Ludovica, Pennese, Marco, Pepe, Sveva, Rocchi, Enrico, Uncini, Aurelio, Comminiello, Danilo

The L3DAS21 Challenge is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside with the challenge, we release the L3DAS21 dataset, a 65 hours 3D audio corpus, accompanied with a Python API that facilitates the data usage and results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, it is the first time that a dual-mic Ambisonics configuration is used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDNet for SELD. This report is aimed at providing all needed information to participate in the L3DAS21 Challenge, illustrating the details of the L3DAS21 dataset, the challenge tasks and the baseline models.

artificial intelligence, deep learning, machine learning, (16 more...)

doi: 10.1109/MLSP52302.2021.9596248

2104.05499

Country:

North America > United States (0.47)
Europe > Italy (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

arXiv.org Artificial IntelligenceApr-22-2021

A Quaternion-Valued Variational Autoencoder

Grassucci, Eleonora, Comminiello, Danilo, Uncini, Aurelio

Deep probabilistic generative models have achieved incredible success in many fields of application. Among such models, variational autoencoders (VAEs) have proved their ability in modeling a generative process by learning a latent representation of the input. In this paper, we propose a novel VAE defined in the quaternion domain, which exploits the properties of quaternion algebra to improve performance while significantly reducing the number of parameters required by the network. The success of the proposed quaternion VAE with respect to traditional VAEs relies on the ability to leverage the internal relations between quaternion-valued input features and on the properties of second-order statistics which allow to define the latent variables in the augmented quaternion domain. In order to show the advantages due to such properties, we define a plain convolutional VAE in the quaternion domain and we evaluate its performance with respect to its real-valued counterpart on the CelebA face dataset.

artificial intelligence, machine learning, signal process, (16 more...)

doi: 10.1109/ICASSP39728.2021.9413859

2010.11647

Country: Europe > Austria (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

arXiv.org Machine LearningNov-12-2020

Pseudo-Rehearsal for Continual Learning with Normalizing Flows

Pomponi, Jary, Scardapane, Simone, Uncini, Aurelio

Catastrophic forgetting (CF) happens whenever a neural network overwrites past knowledge while being trained on new tasks. Common techniques to handle CF include regularization of the weights (using, e.g., their importance on past tasks), and rehearsal strategies, where the network is constantly re-trained on past data. Generative models have also been applied for the latter, in order to have endless sources of data. In this paper, we propose a novel method that combines the strengths of regularization and generative-based rehearsal approaches. Our generative model consists of a normalizing flow (NF), a probabilistic and invertible neural network, trained on the internal embeddings of the network. By keeping a single NF conditioned on the task, we show that our memory overhead remains constant. In addition, exploiting the invertibility of the NF, we propose a simple approach to regularize the network's embeddings with respect to past tasks. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, with bounded computational power and memory overheads.

artificial intelligence, generative model, neural network, (16 more...)

2007.02443

Country: Europe > Italy (0.14)

Genre: Research Report > Promising Solution (0.68)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningJun-23-2020

Why should we add early exits to neural networks?

Scardapane, Simone, Scarpiniti, Michele, Baccarelli, Enzo, Uncini, Aurelio

Deep neural networks are generally designed as a stack of differentiable layers, in which a prediction is obtained only after running the full stack. Recently, some contributions have proposed techniques to endow the networks with early exits, allowing to obtain predictions at intermediate points of the stack. These multi-output networks have a number of advantages, including: (i) significant reductions of the inference time, (ii) reduced tendency to overfitting and vanishing gradients, and (iii) capability of being distributed over multi-tier computation platforms. In addition, they connect to the wider themes of biological plausibility and layered cognitive reasoning. In this paper, we provide a comprehensive introduction to this family of neural networks, by describing in a unified fashion the way these architectures can be designed, trained, and actually deployed in time-constrained scenarios. We also describe in-depth their application scenarios in 5G and Fog computing environments, as long as some of the open research questions connected to them.

deep learning, neural network, survey article, (16 more...)

doi: 10.1007/s12559-020-09734-4

2004.12814

Country: Europe > Italy (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceFeb-24-2020

A Multimodal Deep Network for the Reconstruction of T2W MR Images

Falvo, Antonio, Comminiello, Danilo, Scardapane, Simone, Scarpiniti, Michele, Uncini, Aurelio

Multiple sclerosis is one of the most common chronic neurological diseases affecting the central nervous system. Lesions produced by the MS can be observed through two modalities of magnetic resonance (MR), known as T2W and FLAIR sequences, both providing useful information for formulating a diagnosis. However, long acquisition time makes the acquired MR image vulnerable to motion artifacts. This leads to the need of accelerating the execution of the MR analysis. In this paper, we present a deep learning method that is able to reconstruct subsampled MR images obtained by reducing the k-space data, while maintaining a high image quality that can be used to observe brain lesions. The proposed method exploits the multimodal approach of neural networks and it also focuses on the data acquisition and processing stages to reduce execution time of the MR analysis. Results prove the effectiveness of the proposed method in reconstructing subsampled MR images while saving execution time.

artificial intelligence, machine learning, reconstruction, (17 more...)

doi: 10.1007/978-981-15-5093-5_38

1908.03009

Country: North America (0.47)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningFeb-24-2020

Adaptive Propagation Graph Convolutional Network

Spinelli, Indro, Scardapane, Simone, Uncini, Aurelio

Graph convolutional networks (GCNs) are a family of neural network models that perform inference on graph data by interleaving vertex-wise operations and message-passing exchanges across nodes. Concerning the latter, two key questions arise: (i) how to design a differentiable exchange protocol (e.g., a 1-hop Laplacian smoothing in the original GCN), and (ii) how to characterize the trade-off in complexity with respect to the local updates. In this paper, we show that state-of-the-art results can be achieved by adapting the number of communication steps independently at every node. In particular, we endow each node with a halting unit (inspired by Graves' adaptive computation time) that after every exchange decides whether to continue communicating or not. We show that the proposed adaptive propagation GCN (AP-GCN) achieves superior or similar results to the best proposed models so far on a number of benchmarks, while requiring a small overhead in terms of additional parameters. We also investigate a regularization term to enforce an explicit trade-off between communication and accuracy. The code for the AP-GCN experiments is released as an open-source library.

deep learning, graph, neural network, (15 more...)

2002.10306

Country: Europe > Italy (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

arXiv.org Machine LearningSep-9-2019

Efficient Continual Learning in Neural Networks with Embedding Regularization

Pomponi, Jary, Scardapane, Simone, Lomonaco, Vincenzo, Uncini, Aurelio

Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered either the progressive increase in the size of the networks, or have tried to regularize the network behavior to equalize it with respect to previously observed tasks. In the latter case, it is essential to understand what type of information best represents this past behavior. Common techniques include regularizing the past outputs, gradients, or individual weights. In this work, we propose a new, relatively simple and efficient method to perform continual learning by regularizing instead the network internal embeddings. To make the approach scalable, we also propose a dynamic sampling strategy to reduce the memory footprint of the required external storage. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, while requiring significantly less space in memory and computational time. In addition, inspired inspired by to recent works, we evaluate the impact of selecting a more flexible model for the activation functions inside the network, evaluating the impact of catastrophic forgetting on the activation functions themselves.

continual learning, deep learning, neural network, (20 more...)

1909.03742

Country: Europe > Italy (0.28)

Genre: Research Report > Promising Solution (0.88)

Industry: Telecommunications (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)