Stick-Breaking Mixture Normalizing Flows with Component-Wise Tail Adaptation for Variational Inference
Han, Seungsu, Hwang, Juyoung, Chang, Won
Normalizing flows with a Gaussian base provide a computationally efficient way to approximate posterior distributions in Bayesian inference, but they often struggle to capture complex posteriors with multimodality and heavy tails. We propose a stick-breaking mixture base with component-wise tail adaptation (StiCTAF) for posterior approximation. The method first learns a flexible mixture base to mitigate the mode-seeking bias of reverse KL divergence through a weighted average of component-wise ELBOs. It then estimates local tail indices of unnormalized densities and finally refines each mixture component using a shared backbone combined with component-specific tail transforms calibrated by the estimated indices. This design enables accurate mode coverage and anisotropic tail modeling while retaining exact density evaluation and stable optimization. Experiments on synthetic posteriors demonstrate improved tail recovery and better coverage of multiple modes compared to benchmark models. We also present a real-data analysis illustrating the practical benefits of our approach for posterior inference.
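The mixture base rests on the stick-breaking construction for component weights. As a generic sketch (not the paper's exact parameterization; the function name is illustrative), fractions v_k in (0, 1) are mapped to weights that sum to one:

```python
def stick_breaking_weights(vs):
    """Map stick-breaking fractions v_k in (0, 1) to mixture weights.

    w_k = v_k * prod_{j<k} (1 - v_j); the final weight takes whatever
    stick remains, so the weights always sum to one.
    """
    weights = []
    remaining = 1.0
    for v in vs:
        weights.append(v * remaining)
        remaining *= (1.0 - v)
    weights.append(remaining)  # leftover stick -> last component
    return weights

w = stick_breaking_weights([0.5, 0.5, 0.5])
# w = [0.5, 0.25, 0.125, 0.125], summing to 1
```

Because the weights are a deterministic, differentiable function of the fractions, they can be optimized jointly with the flow parameters.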
An inferential measure of dependence between two systems using Bayesian model comparison
Marrelec, Guillaume, Giron, Alain
We propose to quantify dependence between two systems $X$ and $Y$ in a dataset $D$ based on the Bayesian comparison of two models: one, $H_0$, of statistical independence and another one, $H_1$, of dependence. In this framework, dependence between $X$ and $Y$ in $D$, denoted $B(X,Y|D)$, is quantified as $P(H_1|D)$, the posterior probability for the model of dependence given $D$, or any strictly increasing function thereof. It is therefore a measure of the evidence for dependence between $X$ and $Y$ as modeled by $H_1$ and observed in $D$. We review several statistical models and reconsider standard results in the light of $B(X,Y|D)$ as a measure of dependence. Using simulations, we focus on two specific issues: the effect of noise and the behavior of $B(X,Y|D)$ when $H_1$ has a parameter coding for the intensity of dependence. We then derive some general properties of $B(X,Y|D)$, showing that it quantifies the information contained in $D$ in favor of $H_1$ versus $H_0$. While some of these properties are typical of what is expected from a valid measure of dependence, others are novel and naturally appear as desired features for specific measures of dependence, which we call inferential. We finally put these results in perspective; in particular, we discuss the consequences of using the Bayesian framework as well as the similarities and differences between $B(X,Y|D)$ and mutual information.
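The core quantity $B(X,Y|D) = P(H_1|D)$ follows from Bayes' rule given the Bayes factor $BF_{10} = p(D|H_1)/p(D|H_0)$ and the prior model probabilities. A minimal numerical sketch (the equal-prior default and function name are illustrative, not from the paper):

```python
def posterior_prob_dependence(bf10, prior_h1=0.5):
    """Posterior probability of the dependence model H1, given the
    Bayes factor BF10 = p(D|H1)/p(D|H0) and the prior P(H1).

    Works on the odds scale: posterior odds = BF10 * prior odds.
    """
    prior_odds = prior_h1 / (1.0 - prior_h1)
    post_odds = bf10 * prior_odds
    return post_odds / (1.0 + post_odds)

# With equal model priors, a Bayes factor of 3 in favor of dependence
# gives P(H1|D) = 3/4.
p = posterior_prob_dependence(3.0)
```

Since $P(H_1|D)$ is a strictly increasing function of the Bayes factor for fixed priors, either can serve as the measure of dependence in this framework.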
How to Measure Evidence: Bayes Factors or Relative Belief Ratios?
Al-Labadi, Luai, Alzaatreh, Ayman, Evans, Michael
One of the virtues of the Bayesian approach to statistical analysis is that it gives an unambiguous definition of what it means for there to be evidence in favor of or against a particular value of a parameter. This is provided by the following principle. Principle of Evidence: if the posterior probability of an event is greater than (less than, equal to) its prior probability, then there is evidence in favor of (against, no evidence either way of) the event being true. This seems like a very simple and intuitively satisfying way of characterizing evidence, and it has long been considered quite natural and obvious. For example, Popper (1968), The Logic of Scientific Discovery, Appendix ix: "If we are asked to give a criterion of the fact that the evidence y supports or corroborates a statement x, the most obvious reply is: that y increases the probability of x." Achinstein (2001): "for a fact e to be evidence that a hypothesis h is true, it is both necessary and sufficient for e to increase h's probability over its prior probability".
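The principle of evidence amounts to comparing an event's posterior and prior probabilities, i.e., checking whether the relative belief ratio exceeds one. A toy sketch (function names are illustrative):

```python
def relative_belief(prior_prob, posterior_prob):
    """Relative belief ratio: posterior probability over prior probability."""
    return posterior_prob / prior_prob

def evidence_direction(prior_prob, posterior_prob):
    """Apply the principle of evidence to an event's prior and posterior
    probabilities: RB > 1 is evidence for, RB < 1 is evidence against."""
    rb = relative_belief(prior_prob, posterior_prob)
    if rb > 1.0:
        return "evidence in favor"
    if rb < 1.0:
        return "evidence against"
    return "no evidence either way"

# Data that raise an event's probability from 0.2 to 0.5 give
# RB = 2.5 > 1: evidence in favor of the event.
verdict = evidence_direction(0.2, 0.5)
```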
Information geometry for approximate Bayesian computation
The goal of this paper is to explore the basic Approximate Bayesian Computation (ABC) algorithm via the lens of information theory. ABC is a widely used algorithm in cases where the likelihood of the data is hard to work with or intractable, but one can simulate from it. We use relative entropy ideas to analyze the behavior of the algorithm as a function of the thresholding parameter and of the size of the data. Relative entropy here is data driven, as it depends on the values of the observed statistics. We allow a different thresholding parameter for each direction (i.e., for each observed statistic) and compute the weighted effect on each direction. The latter allows one to find important directions via sensitivity analysis, leading to potentially larger acceptance regions, which in turn brings the computational cost of the algorithm down for the same level of accuracy. In addition, we also investigate the bias of the estimators for generic observables as a function of both the thresholding parameters and the size of the data. Our analysis provides error bounds on performance for positive tolerances and finite sample sizes. Simulation studies complement and illustrate the theoretical results.
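The basic ABC rejection scheme with a separate tolerance per observed statistic (the direction-wise thresholding discussed above) can be sketched as follows. This is a generic illustration, not the paper's implementation; the toy model, prior, and tolerance values are all assumptions:

```python
import random

def abc_rejection(observed_stats, simulate, prior_sample, tolerances, n_draws=2000):
    """Keep parameter draws whose simulated statistics fall within a
    separate tolerance for each observed statistic (direction)."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample()
        stats = simulate(theta)
        if all(abs(s - o) <= eps
               for s, o, eps in zip(stats, observed_stats, tolerances)):
            accepted.append(theta)
    return accepted

# Toy example: infer the mean of a Normal(theta, 1) from its sample mean.
random.seed(0)

def prior_sample():            # flat prior on [-2, 2]
    return random.uniform(-2, 2)

def simulate(theta):           # summary statistic: mean of 50 draws
    return [sum(random.gauss(theta, 1) for _ in range(50)) / 50]

draws = abc_rejection([0.5], simulate, prior_sample, tolerances=[0.2])
```

The accepted draws approximate the posterior; loosening the tolerance in an unimportant direction enlarges the acceptance region, raising the acceptance rate at a controlled cost in accuracy.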
Logistic Regression: A Concise Technical Overview
A popular statistical technique to predict binomial outcomes (y = 0 or 1) is Logistic Regression. Logistic regression predicts categorical outcomes (binomial / multinomial values of y), whereas linear regression is suited to predicting continuous-valued outcomes (such as the weight of a person in kg, or the amount of rainfall in cm). The predictions of Logistic Regression (henceforth, LogR in this article) are in the form of probabilities of an event occurring, i.e. the probability of y = 1, given certain values of the input variables x. As shown in Figure 1, the logit function on the right, with a range of -inf to +inf, is the inverse of the logistic function shown on the left, with a range of 0 to 1. Estimating the values of B0, B1, ..., Bk involves the concepts of probability, odds and log odds. The example dataset here is sourced from the UCLA website. The task is to predict whether students graduated with honours or not (y = 1 or 0), for 200 students with fields female, read, write, math, hon, femalexmath.
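The logistic/logit pair described above can be written out directly; the coefficient values below are hypothetical, chosen only to show how a fitted model converts a linear predictor into a probability:

```python
import math

def logistic(z):
    """Inverse of the logit: maps log-odds z in (-inf, inf)
    to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# Hypothetical fitted coefficients B0, B1 for a single predictor x:
b0, b1 = -1.5, 0.8
x = 2.0
p = logistic(b0 + b1 * x)  # P(y = 1 | x), here logistic(0.1)
```

Each coefficient Bk is the change in log odds of y = 1 per unit change in the corresponding predictor, which is why estimation works on the odds/log-odds scale.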