Plotting

 Handley, Will


Cosmological Parameter Estimation with Sequential Linear Simulation-based Inference

arXiv.org Machine Learning

In many astrophysical applications, statistical models can be simulated forward, but their likelihood functions are too complex to calculate directly. Simulation-based inference (SBI) [1] provides an alternative way to perform Bayesian analysis on these models, relying solely on forward simulations rather than likelihood estimates. However, modern cosmological models are typically expensive to simulate and datasets are often high-dimensional, so traditional methods like the Approximate Bayesian Computation (ABC) [2], which scale poorly with dimensionality, are no longer suitable for parameter estimation.

However, the use of neural networks presents some disadvantages, the most significant of which is their lack of explainability. This means that most neural networks are treated as a 'black box', where the decisions taken by the artificial intelligence in arriving at the optimized solution are not known to researchers, which can hinder intellectual oversight [18]. This problem affects the algorithms discussed above, as NRE constitutes an unsupervised learning task, where the artificial intelligence is given unlabeled input data and allowed to discover patterns in its distribution without guidance. This combines with the problem of over-fitting, where the neural network may attempt to maximize the likelihood without ...
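
As a rough illustration of the Approximate Bayesian Computation idea mentioned in this excerpt, the following is a minimal rejection-ABC sketch. It is not the paper's method: the Gaussian forward model, the flat prior on [-5, 5], the sample-mean summary statistic and the tolerance `epsilon` are all assumptions chosen to keep the example small.

```python
import random

def simulate(theta, n, rng):
    """Hypothetical forward model: n Gaussian draws with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Summary statistic used to compare datasets (here, the sample mean)."""
    return sum(data) / len(data)

def abc_rejection(observed, n_draws, epsilon, rng):
    """Keep prior draws whose simulated summary lies within epsilon of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed flat prior on [-5, 5]
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) < epsilon:
            accepted.append(theta)
    return accepted

rng = random.Random(0)
observed = simulate(1.5, 50, rng)  # mock "observed" data with true mean 1.5
posterior_samples = abc_rejection(observed, n_draws=5000, epsilon=0.1, rng=rng)
```

Only a small fraction of the prior draws survive the cut, even with one parameter. That fraction shrinks rapidly as parameters and summaries are added, which is the poor scaling with dimensionality that the excerpt refers to.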


A comparison of Bayesian sampling algorithms for high-dimensional particle physics and cosmology applications

arXiv.org Machine Learning

For several decades now, Bayesian inference techniques have been applied to theories of particle physics, cosmology and astrophysics to obtain the probability density functions of their free parameters. In this study, we review and compare a wide range of Markov Chain Monte Carlo (MCMC) and nested sampling techniques to determine their relative efficacy on functions that resemble those encountered most frequently in the particle astrophysics literature. Our first series of tests explores a series of high-dimensional analytic test functions that exemplify particular challenges, for example highly multimodal posteriors or posteriors with curving degeneracies. We then investigate two real physics examples, the first being a global fit of the $\Lambda$CDM model using cosmic microwave background data from the Planck experiment, and the second being a global fit of the Minimal Supersymmetric Standard Model using a wide variety of collider and astrophysics data. We show that several examples widely thought to be most easily solved using nested sampling approaches can in fact be more efficiently solved using modern MCMC algorithms, but the details of the implementation matter. Furthermore, we also provide a series of useful insights for practitioners of particle astrophysics and cosmology.


Improving Gradient-guided Nested Sampling for Posterior Inference

arXiv.org Machine Learning

Gaussian noise was then added to produce noisy simulated data. Given the data, the posterior of a model (a pixelated image of the undistorted background source) could be calculated by adding the likelihood and the prior terms. Furthermore, since the model is perfectly linear (and known) and the noise and the prior are Gaussian, the posterior is a high-dimensional Gaussian that could be calculated analytically, allowing us to compare the samples drawn with GGNS with the analytic solution. Figure 2 shows a comparison between the true image, and its noise, and the one recovered by GGNS. We see that we can recover both the correct image and the noise distribution. We emphasize that this is a uni-modal problem and that the experiment's goal is to demonstrate the capability of GGNS to sample in high dimensions (in this case, 256), such as images, and to test the agreement between the samples and a baseline analytic solution.
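
The analytic baseline described here is the standard conjugate result for a linear model with Gaussian noise and a Gaussian prior. The sketch below assumes a model of the form d = A x + n. The names `A`, `noise_cov`, `prior_mean` and `prior_cov`, and the small random test problem, are illustrative and not taken from the paper.

```python
import numpy as np

def linear_gaussian_posterior(A, d, noise_cov, prior_mean, prior_cov):
    """Posterior mean and covariance of x for d = A x + n,
    with n ~ N(0, noise_cov) and prior x ~ N(prior_mean, prior_cov)."""
    noise_prec = np.linalg.inv(noise_cov)
    prior_prec = np.linalg.inv(prior_cov)
    # Posterior precision is the sum of the prior and data precisions.
    post_cov = np.linalg.inv(prior_prec + A.T @ noise_prec @ A)
    post_mean = post_cov @ (prior_prec @ prior_mean + A.T @ noise_prec @ d)
    return post_mean, post_cov

# Small hypothetical problem: 8 data points, 4 parameters.
rng = np.random.default_rng(0)
n_data, n_params = 8, 4
A = rng.normal(size=(n_data, n_params))
x_true = rng.normal(size=n_params)
noise_cov = 0.1**2 * np.eye(n_data)
d = A @ x_true + rng.multivariate_normal(np.zeros(n_data), noise_cov)

post_mean, post_cov = linear_gaussian_posterior(
    A, d, noise_cov, np.zeros(n_params), np.eye(n_params)
)
```

Samples from any sampler can then be checked against `post_mean` and `post_cov`, which is the kind of comparison the excerpt describes for the 256-dimensional case.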


Kernel-, mean- and noise-marginalised Gaussian processes for exoplanet transits and $H_0$ inference

arXiv.org Machine Learning

Using a fully Bayesian approach, Gaussian Process regression is extended to include marginalisation over the kernel choice and kernel hyperparameters. In addition, Bayesian model comparison via the evidence enables direct kernel comparison. The calculation of the joint posterior was implemented with a transdimensional sampler which simultaneously samples over the discrete kernel choice and their hyperparameters by embedding these in a higher-dimensional space, from which samples are taken using nested sampling. This method was explored on synthetic data from exoplanet transit light curve simulations. The true kernel was recovered in the low noise region while no kernel was preferred for larger noise. Furthermore, inference of the physical exoplanet hyperparameters was conducted. In the high noise region, either the bias in the posteriors was removed, the posteriors were broadened or the accuracy of the inference was increased. In addition, the uncertainty in mean function predictive distribution increased due to the uncertainty in the kernel choice. Subsequently, the method was extended to marginalisation over mean functions and noise models and applied to the inference of the present-day Hubble parameter, $H_0$, from real measurements of the Hubble parameter as a function of redshift, derived from the cosmologically model-independent cosmic chronometer and $\Lambda$CDM-dependent baryon acoustic oscillation observations. The inferred $H_0$ values from the cosmic chronometers, baryon acoustic oscillations and combined datasets are $H_0 = 66 \pm 6$ km/s/Mpc, $H_0 = 67 \pm 10$ km/s/Mpc and $H_0 = 69 \pm 6$ km/s/Mpc, respectively. The kernel posterior of the cosmic chronometers dataset prefers a non-stationary linear kernel. Finally, the datasets are shown to be not in tension with $\ln(R) = 12.17 \pm 0.02$.
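
The abstract does not define $R$. Assuming it is the commonly used Bayesian evidence ratio for two datasets $A$ and $B$ (an assumption, not something stated in the text above), it would read:

```latex
R = \frac{\mathcal{Z}_{AB}}{\mathcal{Z}_A \, \mathcal{Z}_B},
\qquad
\ln R = \ln \mathcal{Z}_{AB} - \ln \mathcal{Z}_A - \ln \mathcal{Z}_B ,
```

where $\mathcal{Z}_A$ and $\mathcal{Z}_B$ are the evidences of the individual datasets and $\mathcal{Z}_{AB}$ is the evidence of the joint fit. Under this reading, $\ln R > 0$ favours the two datasets being consistent, so a large positive value matches the abstract's conclusion of no tension.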


Piecewise Normalizing Flows

arXiv.org Artificial Intelligence

Normalizing flows are an established approach for modelling complex probability densities through invertible transformations from a base distribution. However, the accuracy with which the target distribution can be captured by the normalizing flow is strongly influenced by the topology of the base distribution. A mismatch between the topology of the target and the base can result in a poor performance, as is the case for multi-modal problems. A number of different works have attempted to modify the topology of the base distribution to better match the target, either through the use of Gaussian Mixture Models [Izmailov et al., 2020, Ardizzone et al., 2020, Hagemann and Neumayer, 2021] or learned accept/reject sampling [Stimper et al., 2022]. We introduce piecewise normalizing flows which divide the target distribution into clusters, with topologies that better match the standard normal base distribution, and train a series of flows to model complex multi-modal targets. The piecewise nature of the flows can be exploited to significantly reduce the computational cost of training through parallelization. We demonstrate the performance of the piecewise flows using standard benchmarks and compare the accuracy of the flows to the approach taken in Stimper et al. [2022] for modelling multi-modal distributions.


Nested sampling with any prior you like

arXiv.org Machine Learning

Nested sampling is an important tool for conducting Bayesian analysis in Astronomy and other fields, both for sampling complicated posterior distributions for parameter inference, and for computing marginal likelihoods for model comparison. One technical obstacle to using nested sampling in practice is the requirement (for most common implementations) that prior distributions be provided in the form of transformations from the unit hyper-cube to the target prior density. For many applications - particularly when using the posterior from one experiment as the prior for another - such a transformation is not readily available. In this letter we show that parametric bijectors trained on samples from a desired prior density provide a general-purpose method for constructing transformations from the uniform base density to a target prior, enabling the practical use of nested sampling under arbitrary priors. We demonstrate the use of trained bijectors in conjunction with nested sampling on a number of examples from cosmology.
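
To illustrate the requirement described above, here is a minimal sketch of a hand-written unit-hypercube prior transform of the kind most nested sampling implementations expect. The three priors (uniform, normal, log-uniform) and the function name `prior_transform` are assumptions for the example. The trained bijectors proposed in the letter address the case where no such closed-form inverse CDF exists.

```python
from statistics import NormalDist

# Assumed 1-D priors for three independent parameters.
_normal_prior = NormalDist(mu=0.0, sigma=2.0)

def prior_transform(u):
    """Map a point u in the open unit cube (0, 1)^3 to a prior sample.

    Each coordinate is pushed through the inverse CDF of its prior,
    so uniform draws of u yield draws from the target prior.
    """
    u0, u1, u2 = u
    theta0 = 10.0 * u0                   # uniform on [0, 10]
    theta1 = _normal_prior.inv_cdf(u1)   # normal with mean 0, std 2
    theta2 = 10.0 ** (-3.0 + 6.0 * u2)   # log-uniform on [1e-3, 1e3]
    return theta0, theta1, theta2

theta = prior_transform([0.5, 0.5, 0.5])  # centre of the cube maps to prior medians
```

This construction only works when each prior has a tractable inverse CDF and the parameters are independent (or can be conditioned sequentially). A posterior from a previous experiment, available only as samples, generally satisfies neither condition.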


Compromise-free Bayesian neural networks

arXiv.org Machine Learning

We conduct a thorough analysis of the relationship between the out-of-sample performance and the Bayesian evidence (marginal likelihood) of Bayesian neural networks (BNNs), as well as looking at the performance of ensembles of BNNs, both using the Boston housing dataset. Using the state-of-the-art in nested sampling, we numerically sample the full (non-Gaussian and multimodal) network posterior and obtain numerical estimates of the Bayesian evidence, considering network models with up to 156 trainable parameters. The networks have between zero and four hidden layers, either $\tanh$ or ReLU activation functions, and with and without hierarchical priors. The ensembles of BNNs are obtained by determining the posterior distribution over networks, from the posterior samples of individual BNNs re-weighted by the associated Bayesian evidence values. There is good correlation between out-of-sample performance and evidence, as well as a remarkable symmetry between the evidence versus model size and out-of-sample performance versus model size planes. Networks with ReLU activation functions have consistently higher evidences than those with $\tanh$ functions, and this is reflected in their out-of-sample performance. Ensembling over architectures acts to further improve performance relative to the individual BNNs.
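
The evidence re-weighting behind such an ensemble can be sketched as follows. This is a simplified illustration rather than the paper's procedure: it assumes equal prior probability for every network and combines one summary prediction per network instead of re-weighting full sets of posterior samples. The function names and numbers are hypothetical.

```python
import math

def evidence_weights(log_evidences):
    """Posterior model probabilities from log-evidences, assuming equal model priors.

    Subtracting the maximum before exponentiating avoids underflow
    when the log-evidences are large and negative.
    """
    m = max(log_evidences)
    unnorm = [math.exp(lz - m) for lz in log_evidences]
    total = sum(unnorm)
    return [w / total for w in unnorm]

def ensemble_prediction(predictions, log_evidences):
    """Evidence-weighted average of per-network predictions."""
    weights = evidence_weights(log_evidences)
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical log-evidences and predictions for three networks.
log_z = [-1000.0, -1001.0, -1005.0]
preds = [21.0, 23.5, 19.0]
weights = evidence_weights(log_z)
combined = ensemble_prediction(preds, log_z)
```

Because the weights depend exponentially on the log-evidence, a difference of a few units is enough for one architecture to dominate the ensemble.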


Bayesian sparse reconstruction: a brute-force approach to astronomical imaging and machine learning

arXiv.org Machine Learning

We present a principled Bayesian framework for signal reconstruction, in which the signal is modelled by basis functions whose number (and form, if required) is determined by the data themselves. This approach is based on a Bayesian interpretation of conventional sparse reconstruction and regularisation techniques, in which sparsity is imposed through priors via Bayesian model selection. We demonstrate our method for noisy 1- and 2-dimensional signals, including astronomical images. Furthermore, by using a product-space approach, the number and type of basis functions can be treated as integer parameters and their posterior distributions sampled directly. We show that order-of-magnitude increases in computational efficiency are possible from this technique compared to calculating the Bayesian evidences separately, and that further computational gains are possible using it in combination with dynamic nested sampling. Our approach can be readily applied to neural networks, where it allows the network architecture to be determined by the data in a principled Bayesian manner by treating the number of nodes and hidden layers as parameters.