Goto

Collaborating Authors

 Piga, Dario


dynoGP: Deep Gaussian Processes for dynamic system identification

arXiv.org Machine Learning

In this work, we present a novel approach to system identification for dynamical systems, based on a specific class of Deep Gaussian Processes (Deep GPs). These models are constructed by interconnecting linear dynamic GPs (equivalent to stochastic linear time-invariant dynamical systems) and static GPs (to model static nonlinearities). Our approach combines the strengths of data-driven methods, such as those based on neural network architectures, with the ability to output a probability distribution. This offers a more comprehensive framework for system identification that includes uncertainty quantification. Using both simulated and real-world data, we demonstrate the effectiveness of the proposed approach.


Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives

arXiv.org Artificial Intelligence

Foundation models (FMs), large deep learning models pre-trained on vast, unlabeled datasets, exhibit powerful capabilities in understanding complex patterns and generating sophisticated outputs. However, they often struggle to adapt to specific tasks. Reinforcement learning (RL), which allows agents to learn through interaction and feedback, offers a compelling solution. Integrating RL with FMs enables these models to achieve desired outcomes and excel at particular tasks. Additionally, RL can be enhanced by leveraging the reasoning and generalization capabilities of FMs. This synergy is revolutionizing various fields, including robotics. FMs, rich in knowledge and generalization, provide robots with valuable information, while RL facilitates learning and adaptation through real-world interactions. This survey paper comprehensively explores this exciting intersection, examining how these paradigms can be integrated to advance robotic intelligence. We analyze the use of foundation models as action planners, the development of robotics-specific foundation models, and the mutual benefits of combining FMs with RL. Furthermore, we present a taxonomy of integration approaches, including large language models, vision-language models, diffusion models, and transformer-based RL models. We also explore how RL can utilize world representations learned from FMs to enhance robotic task execution. Our survey aims to synthesize current research and highlight key challenges in robotic reasoning and control, particularly in the context of integrating FMs and RL--two rapidly evolving technologies. By doing so, we seek to spark future research and emphasize critical areas that require further investigation to enhance robotics. We provide an updated collection of papers based on our taxonomy, accessible on our open-source project website at: https://github.com/clmoro/Robotics-RL-FMs-Integration.


Enhanced Transformer architecture for in-context learning of dynamical systems

arXiv.org Artificial Intelligence

Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model is fed with an observed input/output sequence (context) generated by a real system to predict its behavior in a zero-shot learning fashion. In this paper, we enhance the original meta-modeling framework through three key innovations: by formulating the learning task within a probabilistic framework; by managing non-contiguous context and query windows; and by adopting recurrent patching to effectively handle long context sequences. The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class, highlighting the model's enhanced performance and scalability.


LightCPPgen: An Explainable Machine Learning Pipeline for Rational Design of Cell Penetrating Peptides

arXiv.org Artificial Intelligence

Cell-penetrating peptides (CPPs) are powerful vectors for the intracellular delivery of a diverse array of therapeutic molecules. Despite their potential, the rational design of CPPs remains a challenging task that often requires extensive experimental efforts and iterations. In this study, we introduce an innovative approach for the de novo design of CPPs, leveraging the strengths of machine learning (ML) and optimization algorithms. Our strategy, named LightCPPgen, integrates a LightGBM-based predictive model with a genetic algorithm (GA), enabling the systematic generation and optimization of CPP sequences. At the core of our methodology is the development of an accurate, efficient, and interpretable predictive model, which utilizes 20 explainable features to shed light on the critical factors influencing CPP translocation capacity. The CPP predictive model works synergistically with an optimization algorithm, which is tuned to enhance computational efficiency while maintaining optimization performance. The GA solutions specifically target the candidate sequences' penetrability score, while trying to maximize similarity with the original non-penetrating peptide in order to retain its original biological and physicochemical properties. By prioritizing the synthesis of only the most promising CPP candidates, LightCPPgen can drastically reduce the time and cost associated with wet lab experiments. In summary, our research makes a substantial contribution to the field of CPP design, offering a robust framework that combines ML and optimization techniques to facilitate the rational design of penetrating peptides, by enhancing the explainability and interpretability of the design process.


Model order reduction of deep structured state-space models: A system-theoretic approach

arXiv.org Artificial Intelligence

With a specific emphasis on control design objectives, achieving accurate system modeling with limited complexity is crucial in parametric system identification. The recently introduced deep structured state-space models (SSM), which feature linear dynamical blocks as key constituent components, offer high predictive performance. However, the learned representations often suffer from excessively large model orders, which render them unsuitable for control design purposes. The current paper addresses this challenge by means of system-theoretic model order reduction techniques that target the linear dynamical blocks of SSMs. We introduce two regularization terms which can be incorporated into the training loss for improved model order reduction. In particular, we consider modal $\ell_1$ and Hankel nuclear norm regularization to promote sparsity, allowing one to retain only the relevant states without sacrificing accuracy. The presented regularizers lead to advantages in terms of parsimonious representations and faster inference resulting from the reduced order models. The effectiveness of the proposed methodology is demonstrated using real-world ground vibration data from an aircraft.


Synthetic data generation for system identification: leveraging knowledge transfer from similar systems

arXiv.org Artificial Intelligence

This paper addresses the challenge of overfitting in the learning of dynamical systems by introducing a novel approach for the generation of synthetic data, aimed at enhancing model generalization and robustness in scenarios characterized by data scarcity. Central to the proposed methodology is the concept of knowledge transfer from systems within the same class. Specifically, synthetic data is generated through a pre-trained meta-model that describes a broad class of systems to which the system of interest is assumed to belong. Training data serves a dual purpose: firstly, as input to the pre-trained meta model to discern the system's dynamics, enabling the prediction of its behavior and thereby generating synthetic output sequences for new input sequences; secondly, in conjunction with synthetic data, to define the loss function used for model estimation. A validation dataset is used to tune a scalar hyper-parameter balancing the relative importance of training and synthetic data in the definition of the loss function. The same validation set can be also used for other purposes, such as early stopping during the training, fundamental to avoid overfitting in case of small-size training datasets. The efficacy of the approach is shown through a numerical example that highlights the advantages of integrating synthetic data into the system identification process.


BenchCloudVision: A Benchmark Analysis of Deep Learning Approaches for Cloud Detection and Segmentation in Remote Sensing Imagery

arXiv.org Artificial Intelligence

Satellites equipped with optical sensors capture high-resolution imagery, providing valuable insights into various environmental phenomena. In recent years, there has been a surge of research focused on addressing some challenges in remote sensing, ranging from water detection in diverse landscapes to the segmentation of mountainous and terrains. Ongoing investigations goals to enhance the precision and efficiency of satellite imagery analysis. Especially, there is a growing emphasis on developing methodologies for accurate water body detection, snow and clouds, important for environmental monitoring, resource management, and disaster response. Accurate remote sensing data analysis can be challenging due to the presence of clouds in optical sensor-based applications. The quality of resulting products such as applications and research is directly impacted by cloud detection, which plays a key role in the remote sensing data processing pipeline. This paper examines seven cutting-edge semantic segmentation and detection algorithms applied to clouds identification, conducting a benchmark analysis to evaluate their architectural approaches and identify the most performing ones. To increase the model's adaptability, critical elements including the type of imagery and the amount of spectral bands used during training are analyzed. Additionally, this research tries to produce machine learning algorithms that can perform cloud segmentation using only a few spectral bands, including RGB and RGBN-IR combinations. The model's flexibility for a variety of applications and user scenarios is assessed by using imagery from Sentinel-2 and Landsat-8 as datasets. The current study involves a thorough benchmark analysis, evaluating modern deep learning models for cloud detection in remote sensing imagery. The principal objective encompasses the provision of a meticulous and relative evaluation of these models, offering elucidations regarding their proficiencies, deficiencies, and potential deployment utility.


From system models to class models: An in-context learning paradigm

arXiv.org Artificial Intelligence

Is it possible to understand the intricacies of a dynamical system not solely from its input/output pattern, but also by observing the behavior of other systems within the same class? This central question drives the study presented in this paper. In response to this query, we introduce a novel paradigm for system identification, addressing two primary tasks: one-step-ahead prediction and multi-step simulation. Unlike conventional methods, we do not directly estimate a model for the specific system. Instead, we learn a meta model that represents a class of dynamical systems. This meta model is trained on a potentially infinite stream of synthetic data, generated by simulators whose settings are randomly extracted from a probability distribution. When provided with a context from a new system-specifically, an input/output sequence-the meta model implicitly discerns its dynamics, enabling predictions of its behavior. The proposed approach harnesses the power of Transformers, renowned for their \emph{in-context learning} capabilities. For one-step prediction, a GPT-like decoder-only architecture is utilized, whereas the simulation problem employs an encoder-decoder structure. Initial experimental results affirmatively answer our foundational question, opening doors to fresh research avenues in system identification.


On the adaptation of in-context learners for system identification

arXiv.org Artificial Intelligence

Abstract: In-context system identification aims at constructing meta-models to describe classes of systems, differently from traditional approaches that model single systems. This paradigm facilitates the leveraging of knowledge acquired from observing the behaviour of different, yet related dynamics. Through numerical examples, we demonstrate how meta-model adaptation can enhance predictive performance in three realistic scenarios: tailoring the meta-model to describe a specific system rather than a class; extending the meta-model to capture the behaviour of systems beyond the initial training class; and recalibrating the model for new prediction tasks. Results highlight the effectiveness of meta-model adaptation to achieve a more robust and versatile meta-learning framework for system identification. Keywords: System identification, Machine Learning, Deep Learning, Neural Networks, Meta-learning, Model adaptation. 1. INTRODUCTION of estimating a model of a specific dynamical system, a meta-model describing a whole class of such systems is learned.


Gradient-based bilevel optimization for multi-penalty Ridge regression through matrix differential calculus

arXiv.org Machine Learning

Common regularization algorithms for linear regression, such as LASSO and Ridge regression, rely on a regularization hyperparameter that balances the tradeoff between minimizing the fitting error and the norm of the learned model coefficients. As this hyperparameter is scalar, it can be easily selected via random or grid search optimizing a cross-validation criterion. However, using a scalar hyperparameter limits the algorithm's flexibility and potential for better generalization. In this paper, we address the problem of linear regression with l2-regularization, where a different regularization hyperparameter is associated with each input variable. We optimize these hyperparameters using a gradient-based approach, wherein the gradient of a cross-validation criterion with respect to the regularization hyperparameters is computed analytically through matrix differential calculus. Additionally, we introduce two strategies tailored for sparse model learning problems aiming at reducing the risk of overfitting to the validation data. Numerical examples demonstrate that our multi-hyperparameter regularization approach outperforms LASSO, Ridge, and Elastic Net regression. Moreover, the analytical computation of the gradient proves to be more efficient in terms of computational time compared to automatic differentiation, especially when handling a large number of input variables. Application to the identification of over-parameterized Linear Parameter-Varying models is also presented.