martino
Effective sample size approximations as entropy measures
In this work, we analyze alternative effective sample size (ESS) metrics for importance sampling algorithms, and discuss a possible extended range of applications. We show the relationship between the ESS expressions used in the literature and two entropy families, the Rรฉnyi and Tsallis entropy. The Rรฉnyi entropy is connected to the Huggins-Roy's ESS family introduced in \cite{Huggins15}. We prove that that all the ESS functions included in the Huggins-Roy's family fulfill all the desirable theoretical conditions. We analyzed and remark the connections with several other fields, such as the Hill numbers introduced in ecology, the Gini inequality coefficient employed in economics, and the Gini impurity index used mainly in machine learning, to name a few. Finally, by numerical simulations, we study the performance of different ESS expressions contained in the previous ESS families in terms of approximation of the theoretical ESS definition, and show the application of ESS formulas in a variable selection problem.
Data-driven informative priors for Bayesian inference with quasi-periodic data
Lopez-Santiago, Javier, Martino, Luca, Miguez, Joaquin, Vazquez-Vilar, Gonzalo
Bayesian computational strategies for inference can be inefficient in approximating the posterior distribution in models that exhibit some form of periodicity. This is because the probability mass of the marginal posterior distribution of the parameter representing the period is usually highly concentrated in a very small region of the parameter space. Therefore, it is necessary to provide as much information as possible to the inference method through the parameter prior distribution. We intend to show that it is possible to construct a prior distribution from the data by fitting a Gaussian process (GP) with a periodic kernel. More specifically, we want to show that it is possible to approximate the marginal posterior distribution of the hyperparameter corresponding to the period in the kernel. Subsequently, this distribution can be used as a prior distribution for the inference method. We use an adaptive importance sampling method to approximate the posterior distribution of the hyperparameters of the GP. Then, we use the marginal posterior distribution of the hyperparameter related to the periodicity in order to construct a prior distribution for the period of the parametric model. This workflow is empirical Bayes, implemented as a modular (cut) transfer of a GP posterior for the period to the parametric model. We applied the proposed methodology to both synthetic and real data. We approximated the posterior distribution of the period of the GP kernel and then passed it forward as a posterior-as-prior with no feedback. Finally, we analyzed its impact on the marginal posterior distribution.
Error Analysis of Triangular Optimal Transport Maps for Filtering
Al-Jarrah, Mohammad, Hosseini, Bamdad, Jin, Niyizhen, Martino, Michele, Taghvaei, Amirhossein
We present a systematic analysis of estimation errors for a class of optimal transport based algorithms for filtering and data assimilation. Along the way, we extend previous error analyses of Brenier maps to the case of conditional Brenier maps that arise in the context of simulation based inference. We then apply these results in a filtering scenario to analyze the optimal transport filtering algorithm of Al-Jarrah et al. (2024, ICML). An extension of that algorithm along with numerical benchmarks on various non-Gaussian and high-dimensional examples are provided to demonstrate its effectiveness and practical potential.
Optimality in importance sampling: a gentle survey
Llorente, Fernando, Martino, Luca
Monte Carlo (MC) methods are powerful tools for numerical inference and optimization widely employed in statistics, signal processing and machine learning Liu (2004); Robert and Casella (2004). They are mainly used for computing approximately the solution of definite integrals, and by extension, of differential equations (for this reason, MC schemes can be considered stochastic quadrature rules). Although exact analytical solutions to integrals are always desirable, such unicorns are rarely available, specially in real-world systems. Many applications inevitably require the approximation of intractable integrals. Specifically, Bayesian methods need the computation of expectations with respect to posterior probability density function (pdf) which, generally, are analytically intractable Gelman et al. (2013). The MC methods can be divided in four main families: direct methods (based on transformations or random variables), accept-reject techniques, Markov chain Monte Carlo (MCMC) algorithms, and importance sampling (IS) schemes Luengo et al. (2020); Martino et al. (2018). The last two families are the most popular for the facility and universality of their possible application Liang et al. (2010); Liu (2004); Robert and Casella (2004). All the MC methods require the choice of a suitable proposal density that is crucial for their performance Luengo et al. (2020); Robert and Casella (2004).
Adaptive posterior distributions for uncertainty analysis of covariance matrices in Bayesian inversion problems for multioutput signals
Curbelo, E., Martino, L., Llorente, F., Delgado-Gomez, D.
In this paper we address the problem of performing Bayesian inference for the parameters of a nonlinear multi-output model and the covariance matrix of the different output signals. We propose an adaptive importance sampling (AIS) scheme for multivariate Bayesian inversion problems, which is based in two main ideas: the variables of interest are split in two blocks and the inference takes advantage of known analytical optimization formulas. We estimate both the unknown parameters of the multivariate non-linear model and the covariance matrix of the noise. In the first part of the proposed inference scheme, a novel AIS technique called adaptive target adaptive importance sampling (ATAIS) is designed, which alternates iteratively between an IS technique over the parameters of the non-linear model and a frequentist approach for the covariance matrix of the noise. In the second part of the proposed inference scheme, a prior density over the covariance matrix is considered and the cloud of samples obtained by ATAIS are recycled and re-weighted to obtain a complete Bayesian study over the model parameters and covariance matrix. ATAIS is the main contribution of the work. Additionally, the inverted layered importance sampling (ILIS) is presented as a possible compelling algorithm (but based on a conceptually simpler idea). Different numerical examples show the benefits of the proposed approaches
Efficient Bayes Inference in Neural Networks through Adaptive Importance Sampling
Huang, Yunshi, Chouzenoux, Emilie, Elvira, Victor, Pesquet, Jean-Christophe
Bayesian neural networks (BNNs) have received an increased interest in the last years. In BNNs, a complete posterior distribution of the unknown weight and bias parameters of the network is produced during the training stage. This probabilistic estimation offers several advantages with respect to point-wise estimates, in particular, the ability to provide uncertainty quantification when predicting new data. This feature inherent to the Bayesian paradigm, is useful in countless machine learning applications. It is particularly appealing in areas where decision-making has a crucial impact, such as medical healthcare or autonomous driving. The main challenge of BNNs is the computational cost of the training procedure since Bayesian techniques often face a severe curse of dimensionality. Adaptive importance sampling (AIS) is one of the most prominent Monte Carlo methodologies benefiting from sounded convergence guarantees and ease for adaptation. This work aims to show that AIS constitutes a successful approach for designing BNNs. More precisely, we propose a novel algorithm PMCnet that includes an efficient adaptation mechanism, exploiting geometric information on the complex (often multimodal) posterior distribution. Numerical results illustrate the excellent performance and the improved exploration capabilities of the proposed method for both shallow and deep neural networks.
Hamiltonian Adaptive Importance Sampling
Mousavi, Ali, Monsefi, Reza, Elvira, Vรญctor
Importance sampling (IS) is a powerful Monte Carlo (MC) methodology for approximating integrals, for instance in the context of Bayesian inference. In IS, the samples are simulated from the so-called proposal distribution, and the choice of this proposal is key for achieving a high performance. In adaptive IS (AIS) methods, a set of proposals is iteratively improved. AIS is a relevant and timely methodology although many limitations remain yet to be overcome, e.g., the curse of dimensionality in high-dimensional and multi-modal problems. Moreover, the Hamiltonian Monte Carlo (HMC) algorithm has become increasingly popular in machine learning and statistics. HMC has several appealing features such as its exploratory behavior, especially in high-dimensional targets, when other methods suffer. In this paper, we introduce the novel Hamiltonian adaptive importance sampling (HAIS) method. HAIS implements a two-step adaptive process with parallel HMC chains that cooperate at each iteration. The proposed HAIS efficiently adapts a population of proposals, extracting the advantages of HMC. HAIS can be understood as a particular instance of the generic layered AIS family with an additional resampling step. HAIS achieves a significant performance improvement in high-dimensional problems w.r.t. state-of-the-art algorithms. We discuss the statistical properties of HAIS and show its high performance in two challenging examples.
A Survey of Monte Carlo Methods for Parameter Estimation
Luengo, D., Martino, L., Bugallo, M., Elvira, V., Sรคrkkรค, S.
Statistical signal processing applications usually require the estimation of some parameters of interest given a set of observed data. These estimates are typically obtained either by solving a multi-variate optimization problem, as in the maximum likelihood (ML) or maximum a posteriori (MAP) estimators, or by performing a multi-dimensional integration, as in the minimum mean squared error (MMSE) estimators. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and the Monte Carlo (MC) methodology is one feasible approach. MC methods proceed by drawing random samples, either from the desired distribution or from a simpler one, and using them to compute consistent estimators. The most important families of MC algorithms are Markov chain MC (MCMC) and importance sampling (IS). On the one hand, MCMC methods draw samples from a proposal density, building then an ergodic Markov chain whose stationary distribution is the desired distribution by accepting or rejecting those candidate samples as the new state of the chain. On the other hand, IS techniques draw samples from a simple proposal density, and then assign them suitable weights that measure their quality in some appropriate way. In this paper, we perform a thorough review of MC methods for the estimation of static parameters in signal processing applications. A historical note on the development of MC schemes is also provided, followed by the basic MC method and a brief description of the rejection sampling (RS) algorithm, as well as three sections describing many of the most relevant MCMC and IS algorithms, and their combined use.
Deep Importance Sampling based on Regression for Model Inversion and Emulation
Llorente, F., Martino, L., Delgado, D., Camps-Valls, G.
Understanding systems by forward and inverse modeling is a recurrent topic of research in many domains of science and engineering. In this context, Monte Carlo methods have been widely used as powerful tools for numerical inference and optimization. They require the choice of a suitable proposal density that is crucial for their performance. For this reason, several adaptive importance sampling (AIS) schemes have been proposed in the literature. We here present an AIS framework called Regression-based Adaptive Deep Importance Sampling (RADIS). In RADIS, the key idea is the adaptive construction via regression of a non-parametric proposal density (i.e., an emulator), which mimics the posterior distribution and hence minimizes the mismatch between proposal and target densities. RADIS is based on a deep architecture of two (or more) nested IS schemes, in order to draw samples from the constructed emulator. The algorithm is highly efficient since employs the posterior approximation as proposal density, which can be improved adding more support points. As a consequence, RADIS asymptotically converges to an exact sampler under mild conditions. Additionally, the emulator produced by RADIS can be in turn used as a cheap surrogate model for further studies. We introduce two specific RADIS implementations that use Gaussian Processes (GPs) and Nearest Neighbors (NN) for constructing the emulator. Several numerical experiments and comparisons show the benefits of the proposed schemes. A real-world application in remote sensing model inversion and emulation confirms the validity of the approach.
Brain Damage Saved His Music - Issue 58: Self
Eight years ago, when neurosurgeon Marcelo Galarza saw images from jazz guitarist Pat Martino's cerebral MRI, he was astonished. "I couldn't believe how much of his left temporal lobe had been removed," he said. Martino had brain surgery in 1980 to remove a tangle of malformed veins and arteries. At the time he was one of the most celebrated guitarists in jazz. Yet few people knew that Martino suffered epileptic seizures, crushing headaches, and depression. Locked in psychiatric wards, he withstood debilitating electroshock therapy. It wasn't until 2007 that Martino had an MRI and not until recently that neuroscientists published their analyses of the images.