AITopics

2307.13917

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Chandramoorthy, Nisha, Schaefer, Florian, Marzouk, Youssef

Score Operator Newton transport

arXiv.org Artificial Intelligence Dec-8-2023

Generating samples from a complex (e.g., non-Gaussian, high-dimensional) probability distribution is a core computational challenge in diverse applications, ranging from computational statistics and machine learning to molecular simulation. A recurring setting is where the density ρ of the target distribution is specified up to a normalizing constant, for example in Bayesian modeling, where ρ represents the posterior density. Here, evaluations of the score ∇ log ρ are often available as well, even for complex statistical models [Villa et al., 2021]. Alternatively, many new methods enable effective score estimation from data, without explicit density estimation; examples include score estimation from time series observations in chaotic dynamical systems [Chandramoorthy and Wang, 2022, Ni, 2020] and score-based modeling of image distributions [Song et al., 2020b,a]. In these settings, transport or "flow"-driven algorithms for generating samples have seen extensive success. The central idea is to construct a transport map from a simple, prescribed source distribution to the target distribution of interest. One class of transport approaches, e.g., as represented by variational inference with normalizing flows, involves constructing a parametric class of invertible maps and minimizing some statistical divergence between the pushforward (see Section 2) of the source by a member of this class and the target. A different, essentially nonparametric, class of transport approaches is based on particle systems, e.g., Stein variational gradient descent (SVGD)

iteration, operator, transport map, (15 more...)

2305.09792

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Kim, Jungeum, Rockova, Veronika

Deep Bayes Factors

There is no other model or hypothesis verification tool in Bayesian statistics that is as widely used as the Bayes factor. We focus on generative models that are likelihood-free and, therefore, render the computation of Bayes factors (marginal likelihood ratios) far from obvious. We propose a deep learning estimator of the Bayes factor based on simulated data from two competing models using the likelihood ratio trick. This estimator is devoid of summary statistics and obviates some of the difficulties with ABC model choice. We establish sufficient conditions for consistency of our Deep Bayes Factor estimator as well as its consistency as a model selection tool. We investigate the performance of our estimator on various examples using a wide range of quality metrics related to estimation and model decision accuracy. After training, our deep learning approach enables rapid evaluations of the Bayes factor estimator at any fictional data arriving from either hypothesized model, not just the observed data $Y_0$. This allows us to inspect entire Bayes factor distributions under the two models and to quantify the relative location of the Bayes factor evaluated at $Y_0$ in light of these distributions. Such tail area evaluations are not possible for Bayes factor estimators tailored to $Y_0$. We find the performance of our Deep Bayes Factors competitive with existing MCMC techniques that require the knowledge of the likelihood function. We also consider variants for posterior or intrinsic Bayes factor estimation. We demonstrate the usefulness of our approach on a relatively high-dimensional real data example about determining cognitive biases.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2312.05411

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Rotterdam (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Reiser, Philipp, Aguilar, Javier Enrique, Guthke, Anneli, Bürkner, Paul-Christian

Uncertainty Quantification and Propagation in Surrogate-based Bayesian Inference

Simulations of complex phenomena are crucial in the natural sciences and engineering for different scenarios, e.g., for gaining system understanding, prediction of future scenarios, risk assessment, or system design. However, they are often based on complex ordinary differential equations or partial differential equations which may not have closed-form solutions and may have to be solved using expensive numerical methods. To overcome this computational overhead, the field of surrogate models (Zhu and Zabaras, 2018; Gramacy, 2020; Lavin et al., 2021) has emerged, which provides fast approximations of computationally expensive simulations. Examples are polynomial chaos expansion (Wiener, 1938; Sudret, 2008; Oladyshkin and Nowak, 2012; Bürkner et al., 2023), Gaussian processes (Rasmussen and Williams, 2005) or neural networks (Goodfellow et al., 2016). Recently, there has been great interest in applying surrogate models in relevant areas, for example in hydrology (Mohammadi et al., 2018; Tarakanov and Elsheikh, 2019; Zhang et al., 2020), in fluid dynamics (Meyer et al., 2021), in climate prediction (Kuehnert et al., 2022), or in systems biology (Renardy et al., 2018; Alden et al., 2020).

artificial intelligence, machine learning, modeling & simulation, (17 more...)

2312.05153

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Bayesian data fusion with shared priors

Wu, Peng, Imbiriba, Tales, Elvira, Victor, Closas, Pau

The integration of data and knowledge from several sources is known as data fusion. When data is only available in a distributed fashion or when different sensors are used to infer a quantity of interest, data fusion becomes essential. In Bayesian settings, a priori information about the unknown quantities is available and, possibly, shared among the different distributed estimators. When the local estimates are fused, the prior knowledge used to construct several local posteriors might be overused unless the fusion node accounts for this and corrects it. In this paper, we analyze the effects of shared priors in Bayesian data fusion contexts. Depending on different common fusion rules, our analysis helps to understand the performance behavior as a function of the number of collaborative agents and as a consequence of different types of priors. The analysis is performed using two divergences that are common in Bayesian inference, and the generality of the results allows very generic distributions to be analyzed. These theoretical results are corroborated through experiments in a variety of estimation and classification problems, including linear and nonlinear models, and federated learning schemes.

artificial intelligence, machine learning, posterior, (19 more...)

2212.07311

Country: North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wang, Lingda, Zhao, Zhizhen

Multi-Frequency Joint Community Detection and Phase Synchronization

This paper studies the joint community detection and phase synchronization problem on the stochastic block model with relative phase, where each node is associated with an unknown phase angle. This problem, with a variety of real-world applications, aims to recover the cluster structure and associated phase angles simultaneously. We show this problem exhibits a "multi-frequency" structure by closely examining its maximum likelihood estimation (MLE) formulation, whereas existing methods do not originate from this perspective. To this end, two simple yet efficient algorithms that leverage the MLE formulation and benefit from the information across multiple frequencies are proposed. The first is a spectral method based on the novel multi-frequency column-pivoted QR factorization. The factorization applied to the top eigenvectors of the observation matrix provides key information about the cluster structure and associated phase angles. The second approach is an iterative multi-frequency generalized power method, where each iteration updates the estimation in a matrix-multiplication-then-projection manner. Numerical experiments show that our proposed algorithms significantly improve the ability to exactly recover the cluster structure and the accuracy of the estimated phase angles, compared to state-of-the-art algorithms.

artificial intelligence, bayesian inference, machine learning, (17 more...)

doi: 10.1109/TSIPN.2023.3258062

2206.12276

Country:

North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

arXiv.org Machine Learning Dec-7-2023

Enhancing Polynomial Chaos Expansion Based Surrogate Modeling using a Novel Probabilistic Transfer Learning Strategy

Bridgman, Wyatt, Balakrishnan, Uma, Jones, Reese, Chen, Jiefu, Wu, Xuqing, Safta, Cosmin, Huang, Yueqin, Khalil, Mohammad

In the field of surrogate modeling, polynomial chaos expansion (PCE) allows practitioners to construct inexpensive yet accurate surrogates to be used in place of the expensive forward model simulations. For black-box simulations, non-intrusive PCE allows the construction of these surrogates using a set of simulation response evaluations. In this context, the PCE coefficients can be obtained using linear regression, which is also known as point collocation or stochastic response surfaces. Regression exhibits better scalability and can handle noisy function evaluations in contrast to other non-intrusive approaches, such as projection. However, since over-sampling is generally advisable for the linear regression approach, the simulation requirements become prohibitive for expensive forward models. We propose to leverage transfer learning whereby knowledge gained through similar PCE surrogate construction tasks (source domains) is transferred to a new surrogate-construction task (target domain) which has a limited number of forward model simulations (training data). The proposed transfer learning strategy determines how much, if any, information to transfer using new techniques inspired by Bayesian modeling and data assimilation. The strategy is scrutinized using numerical investigations and applied to an engineering problem from the oil and gas industry.

artificial intelligence, machine learning, objective function, (18 more...)

2312.04648

Country:

Europe (0.28)
North America > United States > California (0.14)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Energy > Oil & Gas > Upstream (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Forristal, Jarad, Mireshghallah, Niloofar, Durrett, Greg, Berg-Kirkpatrick, Taylor

A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation

arXiv.org Artificial Intelligence Dec-7-2023

Recent work has shown that energy-based language modeling is an effective framework for controllable text generation because it enables flexible integration of arbitrary discriminators. However, because energy-based LMs are globally normalized, approximate techniques like Metropolis-Hastings (MH) are required for inference. Past work has largely explored simple proposal distributions that modify a single token at a time, as in Gibbs sampling. In this paper, we develop a novel MH sampler that, in contrast, proposes re-writes of the entire sequence in each step via iterative prompting of a large language model. Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance, as past work has required. We perform experiments on two controlled generation tasks, showing both downstream performance gains and more accurate target distribution sampling in comparison with single-token proposal techniques.

computational linguistic, language model, proceedings, (15 more...)

2312.0451

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Paritosh, Parth, Atanasov, Nikolay, Martinez, Sonia

Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities

arXiv.org Artificial Intelligence Dec-7-2023

In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents. This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest. We present Bayesian density estimation algorithms using data from non-linear likelihoods at agents in centralized, distributed, and marginal distributed settings. After setting up a distributed estimation objective, we prove almost-sure convergence to the optimal set of pdfs at each agent. Then, we prove the same for a storage-aware algorithm estimating densities only over relevant variables at each agent. Finally, we present a Gaussian version of these algorithms and implement it in a mapping problem using variational inference to handle non-linear likelihood models associated with LiDAR sensing.

algorithm, convergence, sequence, (15 more...)

2312.01227

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)

Jerdee, Maximilian, Newman, M. E. J.

Luck, skill, and depth of competition in games and social hierarchies

arXiv.org Machine Learning Dec-7-2023

Patterns of wins and losses in pairwise contests, such as occur in sports and games, consumer research and paired comparison studies, and human and animal social hierarchies, are commonly analyzed using probabilistic models that allow one to quantify the strength of competitors or predict the outcome of future contests. Here we generalize this approach to incorporate two additional features: an element of randomness or luck that leads to upset wins, and a "depth of competition" variable that measures the complexity of a game or hierarchy. Fitting the resulting model to a large collection of data sets we estimate depth and luck in a range of games, sports, and social situations. In general, we find that social competition tends to be "deep," meaning it has a pronounced hierarchy with many distinct levels, but also that there is often a nonzero chance of an upset victory, meaning that dominance challenges can be won even by significant underdogs. Competition in sports and games, by contrast, tends to be shallow and in most cases there is little evidence of upset wins, beyond those already implied by the shallowness of the hierarchy.

artificial intelligence, hierarchy, machine learning, (20 more...)

2312.04711

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England (0.04)
North America > United States > North Carolina (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Games > Chess (0.69)
Health & Medicine (0.67)
Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)