 Bayesian Learning


Uncertainty Quantification and Propagation in Surrogate-based Bayesian Inference

arXiv.org Machine Learning

Simulations of complex phenomena are crucial in the natural sciences and engineering for a range of purposes, e.g., gaining system understanding, predicting future scenarios, risk assessment, or system design. However, they are often based on complex ordinary or partial differential equations that may not have closed-form solutions and may have to be solved using expensive numerical methods. To overcome this computational overhead, the field of surrogate modeling (Zhu and Zabaras, 2018; Gramacy, 2020; Lavin et al., 2021) has emerged, providing fast approximations of computationally expensive simulations. Examples are polynomial chaos expansion (Wiener, 1938; Sudret, 2008; Oladyshkin and Nowak, 2012; Bürkner et al., 2023), Gaussian processes (Rasmussen and Williams, 2005), and neural networks (Goodfellow et al., 2016). Recently there has been great interest in applying surrogate models in relevant areas, for example in hydrology (Mohammadi et al., 2018; Tarakanov and Elsheikh, 2019; Zhang et al., 2020), in fluid dynamics (Meyer et al., 2021), in climate prediction (Kuehnert et al., 2022), and in systems biology (Renardy et al., 2018; Alden et al., 2020).


Bayesian data fusion with shared priors

arXiv.org Machine Learning

The integration of data and knowledge from several sources is known as data fusion. When data is only available in a distributed fashion or when different sensors are used to infer a quantity of interest, data fusion becomes essential. In Bayesian settings, a priori information about the unknown quantities is available and, possibly, shared among the different distributed estimators. When the local estimates are fused, the prior knowledge used to construct several local posteriors might be overused unless the fusion node accounts for this and corrects it. In this paper, we analyze the effects of shared priors in Bayesian data fusion contexts. For several common fusion rules, our analysis helps to understand the performance behavior as a function of the number of collaborative agents and as a consequence of different types of priors. The analysis is performed using two divergences that are common in Bayesian inference, and the generality of the results allows very generic distributions to be analyzed. These theoretical results are corroborated through experiments in a variety of estimation and classification problems, including linear and nonlinear models, and federated learning schemes.
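The overuse of a shared prior can be seen in a scalar Gaussian toy case. The sketch below is not taken from the paper: it assumes a common Gaussian prior, Gaussian observation noise, and a product fusion rule, and the function names `local_posterior` and `fuse` are hypothetical. Multiplying K local posteriors counts the prior K times; dividing out K-1 copies recovers the centralized posterior.

```python
def local_posterior(prior_mean, prior_var, y, noise_var):
    """Return (precision, information) of the Gaussian posterior at one agent."""
    precision = 1.0 / prior_var + 1.0 / noise_var
    information = prior_mean / prior_var + y / noise_var
    return precision, information


def fuse(posteriors, prior_mean, prior_var, correct_prior=True):
    """Multiply local Gaussian posteriors; optionally divide out the K-1 extra priors."""
    precision = sum(p for p, _ in posteriors)
    information = sum(h for _, h in posteriors)
    if correct_prior:
        extra = len(posteriors) - 1  # the prior was counted K times, keep one copy
        precision -= extra / prior_var
        information -= extra * prior_mean / prior_var
    return information / precision, 1.0 / precision  # (mean, variance)


# Assumed toy setup: four agents, shared N(0, 1) prior, unit noise variance.
prior_mean, prior_var, noise_var = 0.0, 1.0, 1.0
observations = [2.0, 2.0, 2.0, 2.0]
posteriors = [local_posterior(prior_mean, prior_var, y, noise_var) for y in observations]

naive_mean, naive_var = fuse(posteriors, prior_mean, prior_var, correct_prior=False)
fused_mean, fused_var = fuse(posteriors, prior_mean, prior_var, correct_prior=True)
```

In this setup the naive product is pulled toward the prior mean and reports a variance that is too small, while the corrected fusion matches what a single node holding all four observations would compute.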


Multi-Frequency Joint Community Detection and Phase Synchronization

arXiv.org Machine Learning

This paper studies the joint community detection and phase synchronization problem on the stochastic block model with relative phase, where each node is associated with an unknown phase angle. This problem, which has a variety of real-world applications, aims to recover the cluster structure and associated phase angles simultaneously. By closely examining its maximum likelihood estimation (MLE) formulation, we show that this problem exhibits a "multi-frequency" structure, a perspective from which existing methods do not originate. To this end, we propose two simple yet efficient algorithms that leverage the MLE formulation and benefit from the information across multiple frequencies. The first is a spectral method based on a novel multi-frequency column-pivoted QR factorization. The factorization, applied to the top eigenvectors of the observation matrix, provides key information about the cluster structure and associated phase angles. The second is an iterative multi-frequency generalized power method, where each iteration updates the estimate in a matrix-multiplication-then-projection manner. Numerical experiments show that our proposed algorithms significantly improve the ability to exactly recover the cluster structure and the accuracy of the estimated phase angles, compared to state-of-the-art algorithms.


Learning to sample in Cartesian MRI

arXiv.org Artificial Intelligence

Despite its exceptional soft tissue contrast, Magnetic Resonance Imaging (MRI) faces the challenge of long scanning times compared to other modalities like X-ray radiography. Shortening scanning times is crucial in clinical settings, as it increases patient comfort, decreases examination costs and improves throughput. Recent advances in compressed sensing (CS) and deep learning allow accelerated MRI acquisition by reconstructing high-quality images from undersampled data. While reconstruction algorithms have received most of the focus, designing acquisition trajectories to optimize reconstruction quality remains an open question. This thesis explores two approaches to address this gap in the context of Cartesian MRI. First, we propose two algorithms, lazy LBCS and stochastic LBCS, that significantly improve upon Gözcü et al.'s greedy learning-based CS (LBCS) approach. These algorithms scale to large, clinically relevant scenarios like multi-coil 3D MR and dynamic MRI, previously inaccessible to LBCS. Additionally, we demonstrate that generative adversarial networks (GANs) can serve as a natural criterion for adaptive sampling by leveraging variance in the measurement domain to guide acquisition. Second, we delve into the underlying structures or assumptions that enable mask design algorithms to perform well in practice. Our experiments reveal that state-of-the-art deep reinforcement learning (RL) approaches, while capable of adaptation and long-horizon planning, offer only marginal improvements over stochastic LBCS, which is neither adaptive nor capable of long-term planning. Altogether, our findings suggest that stochastic LBCS and similar methods represent promising alternatives to deep RL. They stand out in particular for their scalability and computational efficiency and could be key in the deployment of optimized acquisition trajectories in Cartesian MRI.


Enhancing Polynomial Chaos Expansion Based Surrogate Modeling using a Novel Probabilistic Transfer Learning Strategy

arXiv.org Machine Learning

In the field of surrogate modeling, polynomial chaos expansion (PCE) allows practitioners to construct inexpensive yet accurate surrogates to be used in place of the expensive forward model simulations. For black-box simulations, non-intrusive PCE allows the construction of these surrogates using a set of simulation response evaluations. In this context, the PCE coefficients can be obtained using linear regression, which is also known as point collocation or stochastic response surfaces. Regression exhibits better scalability and can handle noisy function evaluations in contrast to other non-intrusive approaches, such as projection. However, since over-sampling is generally advisable for the linear regression approach, the simulation requirements become prohibitive for expensive forward models. We propose to leverage transfer learning whereby knowledge gained through similar PCE surrogate construction tasks (source domains) is transferred to a new surrogate-construction task (target domain) which has a limited number of forward model simulations (training data). The proposed transfer learning strategy determines how much, if any, information to transfer using new techniques inspired by Bayesian modeling and data assimilation. The strategy is scrutinized using numerical investigations and applied to an engineering problem from the oil and gas industry.
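As a rough illustration of the regression (point collocation) approach mentioned above, the following sketch fits PCE coefficients by least squares for a single standard-normal input. It is a toy under stated assumptions, not the paper's method: the basis is truncated at degree 2, `forward_model` is a hypothetical stand-in for an expensive simulation, and the helper names are invented for this example. No transfer learning is shown.

```python
import random


def hermite_basis(x):
    """Probabilists' Hermite polynomials He_0..He_2, orthogonal under N(0, 1)."""
    return [1.0, x, x * x - 1.0]


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_pce(samples, responses):
    """Least-squares PCE coefficients via the normal equations (A^T A) c = A^T y."""
    design = [hermite_basis(x) for x in samples]
    k = len(design[0])
    gram = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    moment = [sum(row[i] * y for row, y in zip(design, responses)) for i in range(k)]
    return solve(gram, moment)


def forward_model(x):
    """Hypothetical stand-in for an expensive simulation."""
    return 1.0 + 2.0 * x + 3.0 * (x * x - 1.0)


rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(9)]  # 3x oversampling for 3 coefficients
coefficients = fit_pce(samples, [forward_model(x) for x in samples])
```

Because the toy model lies exactly in the span of the basis, the fit recovers coefficients close to 1, 2 and 3. The oversampling factor is where the cost arises for real simulations, since every sample is one forward model run.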


A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation

arXiv.org Artificial Intelligence

Recent work has shown that energy-based language modeling is an effective framework for controllable text generation because it enables flexible integration of arbitrary discriminators. However, because energy-based LMs are globally normalized, approximate techniques like Metropolis-Hastings (MH) are required for inference. Past work has largely explored simple proposal distributions that modify a single token at a time, as in Gibbs sampling. In this paper, we develop a novel MH sampler that, in contrast, proposes re-writes of the entire sequence in each step via iterative prompting of a large language model. Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance, as past work has required. We perform experiments on two controlled generation tasks, showing both downstream performance gains and more accurate target distribution sampling in comparison with single-token proposal techniques.
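The idea of proposing a whole-sequence rewrite inside MH can be sketched on a toy problem. Everything here is an assumption for illustration and not the paper's sampler: the proposal is a simple random generator in place of a prompted language model and ignores the current sequence, the `energy` function is hypothetical, and the alphabet has two tokens. The sketch shows how the acceptance ratio combines target and proposal probabilities and how the length can change between steps.

```python
import math
import random

ALPHABET = "ab"
MAX_LEN = 3


def energy(seq):
    """Hypothetical energy: lower when the sequence contains more 'a' tokens."""
    return -float(seq.count("a"))


def propose(rng):
    """Draw a complete new sequence: uniform length, then uniform tokens."""
    length = rng.randint(1, MAX_LEN)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def log_q(seq):
    """Log-probability of `seq` under the proposal above."""
    return -math.log(MAX_LEN) - len(seq) * math.log(len(ALPHABET))


def accept_prob(current, candidate):
    """MH acceptance probability for a target proportional to exp(-energy)."""
    log_ratio = (-energy(candidate) + log_q(current)) - (-energy(current) + log_q(candidate))
    return math.exp(min(0.0, log_ratio))


def sample(n_steps, seed=0):
    """Run the chain; each step may replace the entire sequence, including its length."""
    rng = random.Random(seed)
    current = propose(rng)
    chain = []
    for _ in range(n_steps):
        candidate = propose(rng)
        if rng.random() < accept_prob(current, candidate):
            current = candidate
        chain.append(current)
    return chain


chain = sample(2000)
```

Since the unnormalized target only enters through a ratio, the global normalizing constant never has to be computed, which is why MH suits globally normalized models.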


Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities

arXiv.org Artificial Intelligence

In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents. This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest. We present Bayesian density estimation algorithms using data from non-linear likelihoods at agents in centralized, distributed, and marginal distributed settings. After setting up a distributed estimation objective, we prove almost-sure convergence to the optimal set of pdfs at each agent. Then, we prove the same for a storage-aware algorithm estimating densities only over relevant variables at each agent. Finally, we present a Gaussian version of these algorithms and implement it in a mapping problem using variational inference to handle non-linear likelihood models associated with LiDAR sensing.


Luck, skill, and depth of competition in games and social hierarchies

arXiv.org Machine Learning

Patterns of wins and losses in pairwise contests, such as occur in sports and games, consumer research and paired comparison studies, and human and animal social hierarchies, are commonly analyzed using probabilistic models that allow one to quantify the strength of competitors or predict the outcome of future contests. Here we generalize this approach to incorporate two additional features: an element of randomness or luck that leads to upset wins, and a "depth of competition" variable that measures the complexity of a game or hierarchy. Fitting the resulting model to a large collection of data sets we estimate depth and luck in a range of games, sports, and social situations. In general, we find that social competition tends to be "deep," meaning it has a pronounced hierarchy with many distinct levels, but also that there is often a nonzero chance of an upset victory, meaning that dominance challenges can be won even by significant underdogs. Competition in sports and games, by contrast, tends to be shallow and in most cases there is little evidence of upset wins, beyond those already implied by the shallowness of the hierarchy.


Statistical Guarantees for Variational Autoencoders using PAC-Bayesian Theory

arXiv.org Machine Learning

Since their inception, Variational Autoencoders (VAEs) have become central in machine learning. Despite their widespread use, numerous questions regarding their theoretical properties remain open. Using PAC-Bayesian theory, this work develops statistical guarantees for VAEs. First, we derive the first PAC-Bayesian bound for posterior distributions conditioned on individual samples from the data-generating distribution. Then, we utilize this result to develop generalization guarantees for the VAE's reconstruction loss, as well as upper bounds on the distance between the input and the regenerated distributions. More importantly, we provide upper bounds on the Wasserstein distance between the input distribution and the distribution defined by the VAE's generative model.


A Survey on Radar-Based Fall Detection

arXiv.org Artificial Intelligence

Fall detection, particularly critical for high-risk demographics like the elderly, is a key public health concern where timely detection can greatly minimize harm. With advancements in radio frequency technology, radar has emerged as a powerful tool for human detection and tracking. Traditional machine learning algorithms, such as Support Vector Machines (SVM) and k-Nearest Neighbors (kNN), have shown promising outcomes. However, deep learning approaches, notably Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), have outperformed them in learning intricate features and managing large, unstructured datasets. This survey offers an in-depth analysis of radar-based fall detection, with emphasis on Micro-Doppler, Range-Doppler, and Range-Doppler-Angles techniques. We discuss the intricacies and challenges in fall detection and emphasize the necessity for a clear definition of falls and appropriate detection criteria, informed by diverse influencing factors. We present an overview of radar signal processing principles and the underlying technology of radar-based fall detection, providing an accessible insight into machine learning and deep learning algorithms. After examining 74 research articles on radar-based fall detection published since 2000, we aim to bridge current research gaps and underscore potential future research strategies, emphasizing the possibility of real-world applications and the unexplored potential of deep learning in improving radar-based fall detection.