Goto

Collaborating Authors

 Bayesian Inference


Hierarchic Flows to Estimate and Sample High-dimensional Probabilities

arXiv.org Machine Learning

Finding low-dimensional interpretable models of complex physical fields such as turbulence remains an open question, 80 years after the pioneer work of Kolmogorov. Estimating high-dimensional probability distributions from data samples suffers from an optimization and an approximation curse of dimensionality. It may be avoided by following a hierarchic probability flow from coarse to fine scales. This inverse renormalization group is defined by conditional probabilities across scales, renormalized in a wavelet basis. For a $\varphi^4$ scalar potential, sampling these hierarchic models avoids the critical slowing down at the phase transition. An outstanding issue is to also approximate non-Gaussian fields having long-range interactions in space and across scales. We introduce low-dimensional models with robust multiscale approximations of high order polynomial energies. They are calculated with a second wavelet transform, which defines interactions over two hierarchies of scales. We estimate and sample these wavelet scattering models to generate 2D vorticity fields of turbulence, and images of dark matter densities.


Deep Learning and genetic algorithms for cosmological Bayesian inference speed-up

arXiv.org Machine Learning

In this paper, we present a novel approach to accelerate the Bayesian inference process, focusing specifically on the nested sampling algorithms. Bayesian inference plays a crucial role in cosmological parameter estimation, providing a robust framework for extracting theoretical insights from observational data. However, its computational demands can be substantial, primarily due to the need for numerous likelihood function evaluations. Our proposed method utilizes the power of deep learning, employing feedforward neural networks to approximate the likelihood function dynamically during the Bayesian inference process. Unlike traditional approaches, our method trains neural networks on-the-fly using the current set of live points as training data, without the need for pre-training. This flexibility enables adaptation to various theoretical models and datasets. We perform simple hyperparameter optimization using genetic algorithms to suggest initial neural network architectures for learning each likelihood function. Once sufficient accuracy is achieved, the neural network replaces the original likelihood function. The implementation integrates with nested sampling algorithms and has been thoroughly evaluated using both simple cosmological dark energy models and diverse observational datasets. Additionally, we explore the potential of genetic algorithms for generating initial live points within nested sampling inference, opening up new avenues for enhancing the efficiency and effectiveness of Bayesian inference methods.


Mixture of partially linear experts

arXiv.org Machine Learning

In the mixture of experts model, a common assumption is the linearity between a response variable and covariates. While this assumption has theoretical and computational benefits, it may lead to suboptimal estimates by overlooking potential nonlinear relationships among the variables. To address this limitation, we propose a partially linear structure that incorporates unspecified functions to capture nonlinear relationships. We establish the identifiability of the proposed model under mild conditions and introduce a practical estimation algorithm. We present the performance of our approach through numerical studies, including simulations and real data analysis.


Active Preference Learning for Ordering Items In- and Out-of-sample

arXiv.org Machine Learning

Learning an ordering of items based on noisy pairwise comparisons is useful when item-specific labels are difficult to assign, for example, when annotators have to make subjective assessments. Algorithms have been proposed for actively sampling comparisons of items to minimize the number of annotations necessary for learning an accurate ordering. However, many ignore shared structure between items, treating them as unrelated, limiting sample efficiency and precluding generalization to new items. In this work, we study active learning with pairwise preference feedback for ordering items with contextual attributes, both in- and out-of-sample. We give an upper bound on the expected ordering error incurred by active learning strategies under a logistic preference model, in terms of the aleatoric and epistemic uncertainty in comparisons, and propose two algorithms designed to greedily minimize this bound. We evaluate these algorithms in two realistic image ordering tasks, including one with comparisons made by human annotators, and demonstrate superior sample efficiency compared to non-contextual ranking approaches and active preference learning baselines.


Riemannian Optimization for Active Mapping with Robot Teams

arXiv.org Artificial Intelligence

Autonomous exploration of unknown environments using a team of mobile robots demands distributed perception and planning strategies to enable efficient and scalable performance. Ideally, each robot should update its map and plan its motion not only relying on its own observations, but also considering the observations of its peers. Centralized solutions to multi-robot coordination are susceptible to central node failure and require a sophisticated communication infrastructure for reliable operation. Current decentralized active mapping methods consider simplistic robot models with linear-Gaussian observations and Euclidean robot states. In this work, we present a distributed multi-robot mapping and planning method, called Riemannian Optimization for Active Mapping (ROAM). We formulate an optimization problem over a graph with node variables belonging to a Riemannian manifold and a consensus constraint requiring feasible solutions to agree on the node variables. We develop a distributed Riemannian optimization algorithm that relies only on one-hop communication to solve the problem with consensus and optimality guarantees. We show that multi-robot active mapping can be achieved via two applications of our distributed Riemannian optimization over different manifolds: distributed estimation of a 3-D semantic map and distributed planning of SE(3) trajectories that minimize map uncertainty. We demonstrate the performance of ROAM in simulation and real-world experiments using a team of robots with RGB-D cameras.


Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery

arXiv.org Machine Learning

This thesis aims to study some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patients follow-up. Stochastic bandits (multiarmed, contextual) model the learning of a sequence of actions (policy) by an agent in an uncertain environment in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge and the exploration of uncertain actions. Such algorithms have largely been studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as clickthrough rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe, anytime-valid concentration bounds, (Bregman, empirical Chernoff), introduced a new framework for risk-aware contextual bandits (with elicitable risk measures) and analysed a novel class of nonparametric bandit algorithms under weak assumptions (Dirichlet sampling). In addition to the theoretical guarantees, these results are supported by in-depth empirical evidence. Finally, as a first step towards personalised postoperative follow-up recommendations, we developed with medical doctors and surgeons an interpretable machine learning model to predict the long-term weight trajectories of patients after bariatric surgery.


Detecting and Deterring Manipulation in a Cognitive Hierarchy

arXiv.org Artificial Intelligence

Social agents with finitely nested opponent models are vulnerable to manipulation by agents with deeper reasoning and more sophisticated opponent modelling. This imbalance, rooted in logic and the theory of recursive modelling frameworks, cannot be solved directly. We propose a computational framework, $\aleph$-IPOMDP, augmenting model-based RL agents' Bayesian inference with an anomaly detection algorithm and an out-of-belief policy. Our mechanism allows agents to realize they are being deceived, even if they cannot understand how, and to deter opponents via a credible threat. We test this framework in both a mixed-motive and zero-sum game. Our results show the $\aleph$ mechanism's effectiveness, leading to more equitable outcomes and less exploitation by more sophisticated agents. We discuss implications for AI safety, cybersecurity, cognitive science, and psychiatry.


Semantic Scaling: Bayesian Ideal Point Estimates with Large Language Models

arXiv.org Artificial Intelligence

This paper introduces "Semantic Scaling," a novel method for ideal point estimation from text. I leverage large language models to classify documents based on their expressed stances and extract survey-like data. I then use item response theory to scale subjects from these data. Semantic Scaling significantly improves on existing text-based scaling methods, and allows researchers to explicitly define the ideological dimensions they measure. This represents the first scaling approach that allows such flexibility outside of survey instruments and opens new avenues of inquiry for populations difficult to survey. Additionally, it works with documents of varying length, and produces valid estimates of both mass and elite ideology. I demonstrate that the method can differentiate between policy preferences and in-group/out-group affect. Among the public, Semantic Scaling out-preforms Tweetscores according to human judgement; in Congress, it recaptures the first dimension DW-NOMINATE while allowing for greater flexibility in resolving construct validity challenges.


A Unified Framework for Human-Allied Learning of Probabilistic Circuits

arXiv.org Artificial Intelligence

Probabilistic Circuits (PCs) have emerged as an efficient framework for representing and learning complex probability distributions. Nevertheless, the existing body of research on PCs predominantly concentrates on data-driven parameter learning, often neglecting the potential of knowledge-intensive learning, a particular issue in data-scarce/knowledge-rich domains such as healthcare. To bridge this gap, we propose a novel unified framework that can systematically integrate diverse domain knowledge into the parameter learning process of PCs. Experiments on several benchmarks as well as real world datasets show that our proposed framework can both effectively and efficiently leverage domain knowledge to achieve superior performance compared to purely data-driven learning approaches.


Implicit Neural Representations for Robust Joint Sparse-View CT Reconstruction

arXiv.org Artificial Intelligence

Computed Tomography (CT) is pivotal in industrial quality control and medical diagnostics. Sparse-view CT, offering reduced ionizing radiation, faces challenges due to its under-sampled nature, leading to ill-posed reconstruction problems. Recent advancements in Implicit Neural Representations (INRs) have shown promise in addressing sparse-view CT reconstruction. Recognizing that CT often involves scanning similar subjects, we propose a novel approach to improve reconstruction quality through joint reconstruction of multiple objects using INRs. This approach can potentially leverage both the strengths of INRs and the statistical regularities across multiple objects. While current INR joint reconstruction techniques primarily focus on accelerating convergence via meta-initialization, they are not specifically tailored to enhance reconstruction quality. To address this gap, we introduce a novel INR-based Bayesian framework integrating latent variables to capture the inter-object relationships. These variables serve as a dynamic reference throughout the optimization, thereby enhancing individual reconstruction fidelity. Our extensive experiments, which assess various key factors such as reconstruction quality, resistance to overfitting, and generalizability, demonstrate significant improvements over baselines in common numerical metrics. This underscores a notable advancement in CT reconstruction methods.