AITopics

2412.19587

Country: Europe (0.69)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceApr-25-2024

Adaptive Semantic Token Selection for AI-native Goal-oriented Communications

Devoto, Alessio, Petruzzi, Simone, Pomponi, Jary, Di Lorenzo, Paolo, Scardapane, Simone

In this paper, we propose a novel design for AI-native goal-oriented communications, exploiting transformer neural networks under dynamic inference constraints on bandwidth and computation. Transformers have become the standard architecture for pretraining large-scale vision and text models, and preliminary results have shown promising performance also in deep joint source-channel coding (JSCC). Here, we consider a dynamic model where communication happens over a channel with variable latency and bandwidth constraints. Leveraging recent works on conditional computation, we exploit the structure of the transformer blocks and the multihead attention operator to design a trainable semantic token selection mechanism that learns to select relevant tokens (e.g., image patches) from the input signal. This is done dynamically, on a per-input basis, with a rate that can be chosen as an additional input by the user. We show that our model improves over state-of-the-art token selection mechanisms, exhibiting high accuracy for a wide range of latency and bandwidth constraints, without the need for deploying multiple architectures tailored to each constraint. Last, but not least, the proposed token selection mechanism helps extract powerful semantics that are easy to understand and explain, paving the way for interpretable-by-design models for the next generation of AI-native communication systems.

artificial intelligence, budget, machine learning, (18 more...)

2405.0233

Country: Europe > Italy (0.15)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceFeb-5-2024

Cascaded Scaling Classifier: class incremental learning with probability scaling

Pomponi, Jary, Devoto, Alessio, Scardapane, Simone

Humans are capable of acquiring new knowledge and transferring learned knowledge into different domains, incurring a small forgetting. The same ability, called Continual Learning, is challenging to achieve when operating with neural networks due to the forgetting affecting past learned tasks when learning new ones. This forgetting can be mitigated by replaying stored samples from past tasks, but a large memory size may be needed for long sequences of tasks; moreover, this could lead to overfitting on saved samples. In this paper, we propose a novel regularisation approach and a novel incremental classifier called, respectively, Margin Dampening and Cascaded Scaling Classifier. The first combines a soft constraint and a knowledge distillation approach to preserve past learned knowledge while allowing the model to learn new patterns effectively. The latter is a gated incremental classifier, helping the model modify past predictions without directly interfering with them. This is achieved by modifying the output of the model with auxiliary scaling functions. We empirically show that our approach performs well on multiple benchmarks against well-established baselines, and we also study each component of our proposal and how the combinations of such components affect the final results.

artificial intelligence, learning, machine learning, (15 more...)

2402.01262

Country: Europe > Italy (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.34)

arXiv.org Artificial IntelligenceJan-24-2024

NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks

Gambella, Matteo, Pomponi, Jary, Scardapane, Simone, Roveri, Manuel

Early Exit Neural Networks (EENNs) endow astandard Deep Neural Network (DNN) with Early Exit Classifiers (EECs), to provide predictions at intermediate points of the processing when enough confidence in classification is achieved. This leads to many benefits in terms of effectiveness and efficiency. Currently, the design of EENNs is carried out manually by experts, a complex and time-consuming task that requires accounting for many aspects, including the correct placement, the thresholding, and the computational overhead of the EECs. For this reason, the research is exploring the use of Neural Architecture Search (NAS) to automatize the design of EENNs. Currently, few comprehensive NAS solutions for EENNs have been proposed in the literature, and a fully automated, joint design strategy taking into consideration both the backbone and the EECs remains an open problem. To this end, this work presents Neural Architecture Search for Hardware Constrained Early Exit Neural Networks (NACHOS), the first NAS framework for the design of optimal EENNs satisfying constraints on the accuracy and the number of Multiply and Accumulate (MAC) operations performed by the EENNs at inference time. In particular, this provides the joint design of backbone and EECs to select a set of admissible (i.e., respecting the constraints) Pareto Optimal Solutions in terms of best tradeoff between the accuracy and number of MACs. The results show that the models designed by NACHOS are competitive with the state-of-the-art EENNs. Additionally, this work investigates the effectiveness of two novel regularization terms designed for the optimization of the auxiliary classifiers of the EENN

artificial intelligence, constraint, machine learning, (15 more...)

2401.1333

Country:

North America > Canada (0.28)
Europe > Italy (0.28)
North America > United States (0.28)

Genre:

Overview (0.67)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Machine LearningFeb-11-2022

Continual Learning with Invertible Generative Models

Pomponi, Jary, Scardapane, Simone, Uncini, Aurelio

Catastrophic forgetting (CF) happens whenever a neural network overwrites past knowledge while being trained on new tasks. Common techniques to handle CF include regularization of the weights (using, e.g., their importance on past tasks), and rehearsal strategies, where the network is constantly re-trained on past data. Generative models have also been applied for the latter, in order to have endless sources of data. In this paper, we propose a novel method that combines the strengths of regularization and generative-based rehearsal approaches. Our generative model consists of a normalizing flow (NF), a probabilistic and invertible neural network, trained on the internal embeddings of the network. By keeping a single NF throughout the training process, we show that our memory overhead remains constant. In addition, exploiting the invertibility of the NF, we propose a simple approach to regularize the network's embeddings with respect to past tasks. We show that our method performs favorably with espect to state-of-the-art approaches in the literature, with bounded computational power and memory overheads.

artificial intelligence, machine learning, neural network, (17 more...)

2202.05694

Country: Europe > Italy (0.14)

Genre: Research Report > Promising Solution (0.68)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningFeb-4-2022

Pixle: a fast and effective black-box attack based on rearranging pixels

Pomponi, Jary, Scardapane, Simone, Uncini, Aurelio

Recent research has found that neural networks are vulnerable to several types of adversarial attacks, where the input samples are modified in such a way that the model produces a wrong prediction that misclassifies the adversarial sample. In this paper we focus on black-box adversarial attacks, that can be performed without knowing the inner structure of the attacked model, nor the training procedure, and we propose a novel attack that is capable of correctly attacking a high percentage of samples by rearranging a small number of pixels within the attacked image. We demonstrate that our attack works on a large number of datasets and models, that it requires a small number of iterations, and that the distance between the original sample and the adversarial one is negligible to the human eye.

artificial intelligence, machine learning, pixel, (19 more...)

2202.02236

Country: Europe > Italy (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (0.69)
Transportation > Air (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceMay-6-2021

Structured Ensembles: an Approach to Reduce the Memory Footprint of Ensemble Methods

Pomponi, Jary, Scardapane, Simone, Uncini, Aurelio

In this paper, we propose a novel ensembling technique for deep neural networks, which is able to drastically reduce the required memory compared to alternative approaches. In particular, we propose to extract multiple sub-networks from a single, untrained neural network by solving an end-to-end optimization task combining differentiable scaling over the original architecture, with multiple regularization terms favouring the diversity of the ensemble. Since our proposal aims to detect and extract sub-structures, we call it Structured Ensemble. On a large experimental evaluation, we show that our method can achieve higher or comparable accuracy to competing methods while requiring significantly less storage. In addition, we evaluate our ensembles in terms of predictive calibration and uncertainty, showing they compare favourably with the state-of-the-art. Finally, we draw a link with the continual learning literature, and we propose a modification of our framework to handle continuous streams of tasks with a sub-linear memory cost. We compare with a number of alternative strategies to mitigate catastrophic forgetting, highlighting advantages in terms of average accuracy and memory.

deep learning, ensemble, neural network, (15 more...)

2105.02551

Country: Europe > Italy (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education > Curriculum (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial IntelligenceApr-1-2021

Avalanche: an End-to-End Library for Continual Learning

Lomonaco, Vincenzo, Pellegrini, Lorenzo, Cossu, Andrea, Carta, Antonio, Graffieti, Gabriele, Hayes, Tyler L., De Lange, Matthias, Masana, Marc, Pomponi, Jary, van de Ven, Gido, Mundt, Martin, She, Qi, Cooper, Keiland, Forest, Jeremy, Belouadah, Eden, Calderara, Simone, Parisi, German I., Cuzzolin, Fabio, Tolias, Andreas, Scardapane, Simone, Antiga, Luca, Amhad, Subutai, Popescu, Adrian, Kanan, Christopher, van de Weijer, Joost, Tuytelaars, Tinne, Bacciu, Davide, Maltoni, Davide

Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standard benchmarks are hard to reproduce. In this work, we propose Avalanche, an open-source end-to-end library for continual learning research based on PyTorch. Avalanche is designed to provide a shared and collaborative codebase for fast prototyping, training, and reproducible evaluation Figure 1: Operational representation of Avalanche with its of continual learning algorithms.

avalanche, deep learning, neural network, (16 more...)

2104.00405

Country:

Europe > Sweden (0.28)
North America > United States (0.28)

Genre:

Instructional Material (0.46)
Research Report (0.40)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningNov-12-2020

Pseudo-Rehearsal for Continual Learning with Normalizing Flows

Pomponi, Jary, Scardapane, Simone, Uncini, Aurelio

Catastrophic forgetting (CF) happens whenever a neural network overwrites past knowledge while being trained on new tasks. Common techniques to handle CF include regularization of the weights (using, e.g., their importance on past tasks), and rehearsal strategies, where the network is constantly re-trained on past data. Generative models have also been applied for the latter, in order to have endless sources of data. In this paper, we propose a novel method that combines the strengths of regularization and generative-based rehearsal approaches. Our generative model consists of a normalizing flow (NF), a probabilistic and invertible neural network, trained on the internal embeddings of the network. By keeping a single NF conditioned on the task, we show that our memory overhead remains constant. In addition, exploiting the invertibility of the NF, we propose a simple approach to regularize the network's embeddings with respect to past tasks. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, with bounded computational power and memory overheads.

artificial intelligence, generative model, neural network, (16 more...)

2007.02443

Country: Europe > Italy (0.14)

Genre: Research Report > Promising Solution (0.68)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine LearningSep-9-2019

Efficient Continual Learning in Neural Networks with Embedding Regularization

Pomponi, Jary, Scardapane, Simone, Lomonaco, Vincenzo, Uncini, Aurelio

Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered either the progressive increase in the size of the networks, or have tried to regularize the network behavior to equalize it with respect to previously observed tasks. In the latter case, it is essential to understand what type of information best represents this past behavior. Common techniques include regularizing the past outputs, gradients, or individual weights. In this work, we propose a new, relatively simple and efficient method to perform continual learning by regularizing instead the network internal embeddings. To make the approach scalable, we also propose a dynamic sampling strategy to reduce the memory footprint of the required external storage. We show that our method performs favorably with respect to state-of-the-art approaches in the literature, while requiring significantly less space in memory and computational time. In addition, inspired inspired by to recent works, we evaluate the impact of selecting a more flexible model for the activation functions inside the network, evaluating the impact of catastrophic forgetting on the activation functions themselves.

continual learning, deep learning, neural network, (20 more...)

1909.03742

Country: Europe > Italy (0.28)

Genre: Research Report > Promising Solution (0.88)

Industry: Telecommunications (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)