Collaborating Authors: Tempone, Raúl


Filtered Markovian Projection: Dimensionality Reduction in Filtering for Stochastic Reaction Networks

arXiv.org Machine Learning

Stochastic reaction networks (SRNs) model stochastic effects for various applications, including intracellular chemical or biological processes and epidemiology. A typical challenge in practical problems modeled by SRNs is that only a few state variables can be dynamically observed. Given the measurement trajectories, one can estimate the conditional probability distribution of the unobserved (hidden) state variables by solving a stochastic filtering problem. In this setting, the conditional distribution evolves over time according to a large or potentially infinite-dimensional system of coupled ordinary differential equations with jumps, known as the filtering equation. Current numerical filtering techniques, such as the Filtered Finite State Projection (D'Ambrosio et al., 2022), are hindered by the curse of dimensionality, which significantly affects their computational performance. To address these limitations, we propose a dimensionality reduction technique based on the Markovian projection (MP), initially introduced for forward problems (Ben Hammouda et al., 2024). In this work, we explore how to adapt the existing MP approach to the filtering problem and introduce a novel version of the MP, the Filtered MP, that guarantees the consistency of the resulting estimator. The novel method combines a reduced-variance particle filter with the solution of the filtering equations in a low-dimensional space, exploiting the advantages of both approaches. The analysis and empirical results highlight the superior computational efficiency of projection methods compared to the existing filtered finite state projection in the high-dimensional setting.
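
For intuition, below is a minimal sketch of the kind of sampling-based baseline such filtering methods are compared against: a bootstrap particle filter tracking a one-species birth-death SRN from noisy copy-number measurements. The birth-death model, all rates, and the noise level are illustrative assumptions; this is a generic baseline, not the Filtered MP estimator described in the abstract.

```python
# Minimal bootstrap particle filter for a birth-death SRN (illustrative only;
# a standard baseline, NOT the Filtered Markovian Projection method).
import numpy as np

rng = np.random.default_rng(0)
BIRTH, DEATH = 5.0, 0.1          # assumed reaction rate constants
SIGMA = 1.0                      # assumed observation noise std
N_PART, N_OBS, DT = 500, 20, 0.5

def ssa_step(x, t_end):
    """Gillespie simulation of X -> X+1 (rate BIRTH) and X -> X-1 (rate DEATH*X)."""
    t = 0.0
    while True:
        rates = np.array([BIRTH, DEATH * x])
        total = rates.sum()
        t += rng.exponential(1.0 / total)
        if t > t_end:
            return x
        x += 1 if rng.random() < rates[0] / total else -1

# Synthetic "true" trajectory and noisy observations of the copy number.
truth, obs = [50], []
for _ in range(N_OBS):
    truth.append(ssa_step(truth[-1], DT))
    obs.append(truth[-1] + rng.normal(0.0, SIGMA))

particles = np.full(N_PART, 50)
for y in obs:
    particles = np.array([ssa_step(int(x), DT) for x in particles])  # propagate
    logw = -0.5 * ((y - particles) / SIGMA) ** 2                     # weight
    w = np.exp(logw - logw.max()); w /= w.sum()
    particles = particles[rng.choice(N_PART, N_PART, p=w)]           # resample
    print(f"obs={y:6.1f}  posterior mean ~ {particles.mean():6.1f}")
```

Each assimilation step propagates particles with the stochastic simulation algorithm, reweights them by the observation likelihood, and resamples; the Filtered MP instead solves projected filtering equations in a lower-dimensional space.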


Adaptive Random Fourier Features Training Stabilized By Resampling With Applications in Image Regression

arXiv.org Artificial Intelligence

This paper presents an enhanced adaptive random Fourier features (ARFF) training algorithm for shallow neural networks, building upon the work introduced in "Adaptive Random Fourier Features with Metropolis Sampling", Kammonen et al., \emph{Foundations of Data Science}, 2(3):309--332, 2020. The improved method uses a particle filter-type resampling technique to stabilize the training process and reduce sensitivity to parameter choices. The Metropolis test can also be omitted when resampling is used, reducing the number of hyperparameters by one and lowering the computational cost per iteration compared to the ARFF method. We present comprehensive numerical experiments demonstrating the efficacy of the proposed algorithm in function regression tasks, both as a stand-alone method and as a pretraining step before gradient-based optimization with the Adam optimizer. Furthermore, we apply the proposed algorithm to a simple image regression problem, illustrating its utility in sampling frequencies for the random Fourier features (RFF) layer of coordinate-based multilayer perceptrons. In this context, we use the proposed algorithm to sample the parameters of the RFF layer in an automated manner.
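
The resampling idea can be sketched in a few lines: fit the outer amplitudes by regularized least squares for the current frequencies, then resample frequencies with probability proportional to their amplitude magnitudes and jitter them. The proposal scale, the regularization, and the use of raw $|\beta_k|$ as resampling weights are assumptions of this sketch, not the paper's exact algorithm.

```python
# Sketch of adaptive RFF training with particle-filter-type resampling
# (illustrative; step size and resampling weights are assumptions).
import numpy as np

rng = np.random.default_rng(1)
K, N, ITERS, STEP, LAM = 64, 400, 50, 0.2, 1e-3

x = np.linspace(-3, 3, N)[:, None]                    # 1-D inputs
y = np.sin(3 * x[:, 0]) + 0.5 * np.cos(7 * x[:, 0])   # target function

omega = rng.normal(0.0, 1.0, (K, 1))                  # initial frequencies
for it in range(ITERS):
    # Features and regularized least squares for the amplitudes beta.
    S = np.exp(1j * x @ omega.T)                      # N x K complex features
    beta = np.linalg.solve(S.conj().T @ S + LAM * np.eye(K),
                           S.conj().T @ y)
    # Resample frequencies proportionally to |beta_k| (the PF-style step),
    # then jitter to explore frequency space.
    w = np.abs(beta)
    w /= w.sum()
    omega = omega[rng.choice(K, K, p=w)] + STEP * rng.normal(size=(K, 1))

resid = np.linalg.norm(S @ beta - y) / np.linalg.norm(y)
print(f"relative L2 training error ~ {resid:.3f}")
```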


Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features

arXiv.org Artificial Intelligence

We present experimental results highlighting two key differences resulting from the choice of training algorithm for two-layer neural networks. The spectral bias of neural networks is well known, while its dependence on the choice of training algorithm is less studied. Our experiments demonstrate that an adaptive random Fourier features algorithm (ARFF) can yield a spectral bias closer to zero than the stochastic gradient descent optimizer (SGD). Additionally, we train two identically structured classifiers, employing SGD and ARFF, to the same accuracy levels and empirically assess their robustness against adversarial noise attacks.


Scalable method for Bayesian experimental design without integrating over posterior distribution

arXiv.org Machine Learning

We address computational efficiency in solving A-optimal Bayesian experimental design problems for which the observational map is based on partial differential equations and, consequently, is computationally expensive to evaluate. A-optimality is a widely used and easy-to-interpret criterion for Bayesian experimental design. This criterion seeks the optimal experimental design by minimizing the expected conditional variance, also known as the expected posterior variance. This study presents a novel likelihood-free approach to the A-optimal experimental design problem that does not require sampling or integrating over the Bayesian posterior distribution. The expected conditional variance is obtained via the variance of the conditional expectation using the law of total variance, and we take advantage of the orthogonal projection property to approximate the conditional expectation. We derive an asymptotic error estimate for the proposed estimator of the expected conditional variance and show that the intractability of the posterior distribution does not affect the performance of our approach. We use an artificial neural network (ANN) to approximate the nonlinear conditional expectation in the implementation of our method. We then extend our approach to the case in which the domain of experimental design parameters is continuous by integrating the training of the ANN into the minimization of the expected conditional variance. Through numerical experiments, we demonstrate that our method greatly reduces the number of observation model evaluations compared with widely used importance sampling-based approaches. This reduction is crucial given the high computational cost of the observational models. Code is available at https://github.com/vinh-tr-hoang/DOEviaPACE.
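
The core identity can be checked numerically on a toy problem: by the law of total variance, the expected posterior variance equals $\mathrm{Var}(\theta) - \mathrm{Var}(\mathbb{E}[\theta\,|\,y])$, and the conditional expectation is the $L^2$-optimal (orthogonal-projection) regressor of $\theta$ on $y$. The sketch below uses a linear-Gaussian model where the answer is known in closed form, and plain linear regression in place of the ANN the paper uses for nonlinear models.

```python
# Numeric sketch of the law-of-total-variance trick behind the method:
# E[Var(theta|y)] = Var(theta) - Var(E[theta|y]), with E[theta|y] obtained
# by L2 regression (orthogonal projection). Toy linear-Gaussian model; in
# the paper an ANN plays the role of the regressor for nonlinear models.
import numpy as np

rng = np.random.default_rng(2)
M = 200_000
theta = rng.normal(0.0, 1.0, M)                 # prior samples
design = 0.8                                    # a scalar "experimental design"
y = design * theta + rng.normal(0.0, 0.5, M)    # observation model

# Projection: regress theta on y (linear regression suffices here).
A = np.vstack([y, np.ones(M)]).T
coef, *_ = np.linalg.lstsq(A, theta, rcond=None)
cond_mean = A @ coef

expected_post_var = theta.var() - cond_mean.var()
# Analytic posterior variance for this conjugate model:
analytic = 1.0 / (1.0 + design**2 / 0.5**2)
print(f"estimated {expected_post_var:.4f} vs analytic {analytic:.4f}")
```

Minimizing this quantity over the design variable then yields the A-optimal design without ever sampling the posterior.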


Nonlinear Isometric Manifold Learning for Injective Normalizing Flows

arXiv.org Artificial Intelligence

Normalizing flows are deep generative models (DGM) that represent the probability distribution of high-dimensional data sets as a change of variables of a multivariate Gaussian [1, 2]. Using the inverse of this transformation, normalizing flows can compute the probability density functions (PDFs) explicitly, thus enabling training via the statistically consistent and asymptotically efficient [3] likelihood maximization. Some of the published approaches assume a dimensionality-reducing map to be known and available a priori [12, 13]. Other works use compositions of manifold learning models and normalizing flows that are trained simultaneously, e.g., the M-Flow [6], Noisy Injective Flows [14], piecewise injective flows called Trumpets [15], and neural manifold ordinary differential equations [16].
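
As a brief aside (standard material, not specific to this paper): for an invertible flow $f$ mapping data $x$ to a latent $z = f(x)$ with Gaussian reference density $p_Z$, the change-of-variables identity that makes the PDF explicit is

$\log p_X(x) = \log p_Z(f(x)) + \log \left| \det J_f(x) \right|$,

where $J_f$ is the Jacobian of $f$; maximizing this log-likelihood over the flow's parameters is the training objective referred to above. For an injective map onto a lower-dimensional manifold, the square Jacobian determinant is commonly replaced by $\frac{1}{2}\log\det\!\left(J^\top J\right)$.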


On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods

arXiv.org Machine Learning

In this study, we demonstrate that the norm test and inner product/orthogonality test presented in \cite{Bol18} are equivalent in terms of the convergence rates associated with Stochastic Gradient Descent (SGD) methods if $\epsilon^2=\theta^2+\nu^2$ with specific choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the norm of the gradient, while $\theta$ and $\nu$ control the relative statistical error of the gradient in the direction of the gradient and in the direction orthogonal to the gradient, respectively. Furthermore, we demonstrate that the inner product/orthogonality test can be as inexpensive as the norm test in the best-case scenario if $\theta$ and $\nu$ are optimally selected, but the inner product/orthogonality test will never be more computationally affordable than the norm test if $\epsilon^2=\theta^2+\nu^2$. Finally, we present two stochastic optimization problems to illustrate our results.
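
As a concrete illustration of what such a test does in practice, here is a sketch of a norm-test-driven batch size schedule on a toy least-squares problem. The doubling rule, the value of $\epsilon$, and the plain-SGD loop are assumptions of the sketch, not prescriptions from the paper; the inner product/orthogonality test would instead split the sample variance into the components parallel and orthogonal to the mean gradient, controlled by $\theta$ and $\nu$.

```python
# Sketch of the norm test for adaptive batch sizes on a toy least-squares
# problem; epsilon and the batch-doubling rule are assumptions.
import numpy as np

rng = np.random.default_rng(3)
N, D, EPS = 10_000, 5, 0.5
A, b = rng.normal(size=(N, D)), rng.normal(size=N)
w = np.zeros(D)

def sample_grads(w, batch):
    idx = rng.choice(N, batch, replace=False)
    r = A[idx] @ w - b[idx]
    return A[idx] * r[:, None]          # per-sample gradients, batch x D

batch = 8
for step in range(100):
    g = sample_grads(w, batch)
    gbar = g.mean(axis=0)
    # Norm test: relative statistical error of the mini-batch gradient.
    var = ((g - gbar) ** 2).sum(axis=1).mean() / batch
    if var > (EPS * np.linalg.norm(gbar)) ** 2 and batch < N // 2:
        batch *= 2                      # grow the batch until the test passes
    w -= 0.1 * gbar
print(f"final batch size: {batch},  loss: {0.5 * np.mean((A @ w - b)**2):.4f}")
```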


Machine learning-based conditional mean filter: a generalization of the ensemble Kalman filter for nonlinear data assimilation

arXiv.org Machine Learning

Filtering is a data assimilation technique that performs sequential inference of the states of dynamical systems from noisy observations. Herein, we propose a machine learning-based ensemble conditional mean filter (ML-EnCMF) for tracking possibly high-dimensional non-Gaussian state models with nonlinear dynamics based on sparse observations. The proposed filtering method is developed based on the conditional expectation and numerically implemented using machine learning (ML) techniques combined with the ensemble method. The contribution of this work is twofold. First, we demonstrate that the ensembles assimilated using the ensemble conditional mean filter (EnCMF) provide an unbiased estimator of the Bayesian posterior mean, and their variance matches the expected conditional variance. Second, we implement the EnCMF using artificial neural networks, which have a significant advantage in representing nonlinear functions over high-dimensional domains, such as the conditional mean. Finally, we demonstrate the effectiveness of the ML-EnCMF for tracking the states of the Lorenz-63 and Lorenz-96 systems in the chaotic regime. Numerical results show that the ML-EnCMF outperforms the ensemble Kalman filter.
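
The conditional-mean update itself fits in a few lines. Below, one assimilation step is sketched for a scalar state with an assumed quadratic observation operator: the conditional mean $m(y) \approx \mathbb{E}[x \mid y]$ is fit by regression over the forecast ensemble, and each member is shifted by $m(y_{\mathrm{obs}}) - m(\hat{y}_i)$. Linear regression is used here for brevity, which makes the update EnKF-like; the paper's ML-EnCMF uses an ANN for this regressor.

```python
# Sketch of a conditional-mean filter update for one assimilation step.
# The conditional mean m(y) ~ E[x | y] is fit here by linear regression over
# the forecast ensemble (recovering an EnKF-like update); the paper's
# ML-EnCMF replaces this regressor with an artificial neural network.
import numpy as np

rng = np.random.default_rng(4)
M, SIG = 1000, 0.3                        # ensemble size, obs noise std

x_f = rng.normal(1.0, 1.0, M)             # forecast ensemble (scalar state)
y_sim = x_f**2 + rng.normal(0.0, SIG, M)  # simulated observations, h(x) = x^2
y_obs = 1.8                               # the actual measurement

# Regress the state on the simulated observations to approximate E[x | y].
A = np.vstack([y_sim, np.ones(M)]).T
coef, *_ = np.linalg.lstsq(A, x_f, rcond=None)
m = lambda y: coef[0] * y + coef[1]

# EnCMF update: shift each member by the conditional-mean increment.
x_a = x_f + m(y_obs) - m(y_sim)
print(f"forecast mean {x_f.mean():.3f} -> analysis mean {x_a.mean():.3f}")
```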


Wind Field Reconstruction with Adaptive Random Fourier Features

arXiv.org Machine Learning

We investigate the use of spatial interpolation methods for reconstructing the horizontal near-surface wind field given a sparse set of measurements. In particular, random Fourier features is compared to a set of benchmark methods, including kriging and inverse distance weighting. Random Fourier features is a linear model $\beta(\pmb x) = \sum_{k=1}^K \beta_k e^{i\omega_k \cdot \pmb x}$ approximating the velocity field, with frequencies $\omega_k$ randomly sampled and amplitudes $\beta_k$ trained to minimize a loss function. We include a physically motivated divergence penalty term $|\nabla \cdot \beta(\pmb x)|^2$, as well as a penalty on the Sobolev norm. We derive a bound on the generalization error and a sampling density that minimizes this bound. Following (arXiv:2007.10683 [math.NA]), we devise an adaptive Metropolis-Hastings algorithm for sampling the frequencies from the optimal distribution. In our experiments, the random Fourier features model outperforms the benchmark models.
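
A compact version of this divergence-penalized least-squares fit can be sketched directly: stack data-fit rows for the two velocity components with penalty rows for $\nabla \cdot \beta$ evaluated at the measurement points, and solve for the complex amplitudes. The synthetic wind field, the i.i.d. Gaussian frequency sampling (in place of the paper's adaptive Metropolis-Hastings sampler), and the penalty weight are assumptions of this sketch, and the Sobolev-norm penalty is omitted.

```python
# Sketch of a divergence-penalized random Fourier features fit to a 2-D wind
# field (synthetic data; frequencies sampled i.i.d. Gaussian rather than by
# the paper's adaptive Metropolis-Hastings sampler, and LAM is assumed).
import numpy as np

rng = np.random.default_rng(5)
K, N, LAM = 50, 200, 1.0

X = rng.uniform(-1, 1, (N, 2))                   # measurement locations
u = -X[:, 1] + 0.05 * rng.normal(size=N)         # divergence-free toy wind
v = X[:, 0] + 0.05 * rng.normal(size=N)

omega = rng.normal(0.0, 3.0, (K, 2))             # frequencies
E = np.exp(1j * X @ omega.T)                     # N x K feature matrix
Du = 1j * omega[:, 0] * E                        # d/dx of each feature
Dv = 1j * omega[:, 1] * E                        # d/dy of each feature

Z = np.zeros((N, K))
top = np.block([[E, Z], [Z, E]])                 # data-fit rows for (u, v)
pen = np.sqrt(LAM) * np.hstack([Du, Dv])         # divergence-penalty rows
A = np.vstack([top, pen])
rhs = np.concatenate([u, v, np.zeros(N)]).astype(complex)

beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)   # beta = (beta_u, beta_v)
uv = np.concatenate([u, v])
err = np.linalg.norm(top @ beta - uv) / np.linalg.norm(uv)
print(f"relative fit error ~ {err:.3f}")
```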