AITopics

2305.11278

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
Europe > Portugal (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Heap, Thomas, Leech, Gavin, Aitchison, Laurence

Massively Parallel Reweighted Wake-Sleep

arXiv.org Artificial IntelligenceMay-18-2023

Reweighted wake-sleep (RWS) is a machine learning method for performing Bayesian inference in a very general class of models. RWS draws $K$ samples from an underlying approximate posterior, then uses importance weighting to provide a better estimate of the true posterior. RWS then updates its approximate posterior towards the importance-weighted estimate of the true posterior. However, recent work [Chattergee and Diaconis, 2018] indicates that the number of samples required for effective importance weighting is exponential in the number of latent variables. Attaining such a large number of importance samples is intractable in all but the smallest models. Here, we develop massively parallel RWS, which circumvents this issue by drawing $K$ samples of all $n$ latent variables, and individually reasoning about all $K^n$ possible combinations of samples. While reasoning about $K^n$ combinations might seem intractable, the required computations can be performed in polynomial time by exploiting conditional independencies in the generative model. We show considerable improvements over standard "global" RWS, which draws $K$ samples from the full joint.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2305.11022

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
North America > United States > Minnesota (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Dubiel-Teleszynski, Tomasz, Kalogeropoulos, Konstantinos, Karouzakis, Nikolaos

Dynamic Term Structure Models with Nonlinearities using Gaussian Processes

arXiv.org Machine LearningMay-18-2023

The importance of unspanned macroeconomic variables for Dynamic Term Structure Models has been intensively discussed in the literature. To our best knowledge the earlier studies considered only linear interactions between the economy and the real-world dynamics of interest rates in DTSMs. We propose a generalized modelling setup for Gaussian DTSMs which allows for unspanned nonlinear associations between the two and we exploit it in forecasting. Specifically, we construct a custom sequential Monte Carlo estimation and forecasting scheme where we introduce Gaussian Process priors to model nonlinearities. Sequential scheme we propose can also be used with dynamic portfolio optimization to assess the potential of generated economic value to investors. The methodology is presented using US Treasury data and selected macroeconomic indices. Namely, we look at core inflation and real economic activity. We contrast the results obtained from the nonlinear model with those stemming from an application of a linear model. Unlike for real economic activity, in case of core inflation we find that, compared to linear models, application of nonlinear models leads to statistically significant gains in economic value across considered maturities.

artificial intelligence, machine learning, modeling & simulation, (15 more...)

arXiv.org Machine Learning

2305.11001

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > East Sussex > Brighton (0.04)
Europe > Greece > Attica > Athens (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance > Economy (0.86)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Tran, Minh-Ngoc, Tseng, Paco, Kohn, Robert

Particle Mean Field Variational Bayes

To solve this problem, there are two main classes of computational methods that provide different approaches to approximate π. The first one is Markov chain Monte Carlo (MCMC) methods (Metropolis et al., 1953; Hastings, 1970; Robert and Casella, 1999). For many years, MCMC has been the standard approach for Bayesian analysis because of its theoretical soundness. The method constructs a Markov chain to produce simulation consistent samples from the target distribution π. A general MCMC approach is the Metropolis-Hastings algorithm that generates a Markov chain by first generating a proposed state from a proposal distribution, then using an acceptance rule to decide whether to accept the proposal or stay at the current state (Robert and Casella, 1999).

artificial intelligence, machine learning, particle, (16 more...)

2303.1393

Genre: Research Report (0.51)

Industry:

Government (0.67)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

Augmented Message Passing Stein Variational Gradient Descent

Zhou, Jiankui, Qiu, Yue

Stein Variational Gradient Descent (SVGD) is a popular particle-based method for Bayesian inference. However, its convergence suffers from the variance collapse, which reduces the accuracy and diversity of the estimation. In this paper, we study the isotropy property of finite particles during the convergence process and show that SVGD of finite particles cannot spread across the entire sample space. Instead, all particles tend to cluster around the particle center within a certain range and we provide an analytical bound for this cluster. To further improve the effectiveness of SVGD for high-dimensional problems, we propose the Augmented Message Passing SVGD (AUMP-SVGD) method, which is a two-stage optimization procedure that does not require sparsity of the target distribution, unlike the MP-SVGD method. Our algorithm achieves satisfactory accuracy and overcomes the variance collapse problem in various benchmark problems.

artificial intelligence, machine learning, particle, (15 more...)

2305.10636

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Salmani, Bahare, Katoen, Joost-Pieter

Finding an $\epsilon$-close Variation of Parameters in Bayesian Networks

This paper addresses the $\epsilon$-close parameter tuning problem for Bayesian Networks (BNs): find a minimal $\epsilon$-close amendment of probability entries in a given set of (rows in) conditional probability tables that make a given quantitative constraint on the BN valid. Based on the state-of-the-art "region verification" techniques for parametric Markov chains, we propose an algorithm whose capabilities go beyond any existing techniques. Our experiments show that $\epsilon$-close tuning of large BN benchmarks with up to 8 parameters is feasible. In particular, by allowing (i) varied parameters in multiple CPTs and (ii) inter-CPT parameter dependencies, we treat subclasses of parametric BNs that have received scant attention so far.

artificial intelligence, constraint, machine learning, (19 more...)

2305.10051

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Xu, Weijia, Banburski-Fahey, Andrzej, Jojic, Nebojsa

Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling

We introduce Reprompting, an iterative sampling algorithm that searches for the Chain-of-Thought (CoT) recipes for a given task without human intervention. Through Gibbs sampling, we infer CoT recipes that work consistently well for a set of training samples. Our method iteratively samples new recipes using previously sampled solutions as parent prompts to solve other training problems. On five Big-Bench Hard tasks that require multi-step reasoning, Reprompting achieves consistently better performance than the zero-shot, few-shot, and human-written CoT baselines. Reprompting can also facilitate transfer of knowledge from a stronger model to a weaker model leading to substantially improved performance of the weaker model. Overall, Reprompting brings up to +17 point improvements over the previous state-of-the-art method that uses human-written CoT prompts.

large language model, machine learning, natural language, (20 more...)

2305.09993

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
(2 more...)

Leveraging Demonstrations to Improve Online Learning: Quality Matters

Hao, Botao, Jain, Rahul, Lattimore, Tor, Van Roy, Benjamin, Wen, Zheng

We investigate the extent to which offline demonstration data can improve online learning. It is natural to expect some improvement, but the question is how, and by how much? We show that the degree of improvement must depend on the quality of the demonstration data. To generate portable insights, we focus on Thompson sampling (TS) applied to a multi-armed bandit as a prototypical online learning algorithm and model. The demonstration data is generated by an expert with a given competence level, a notion we introduce. We propose an informed TS algorithm that utilizes the demonstration data in a coherent way through Bayes' rule and derive a prior-dependent Bayesian regret bound. This offers insight into how pretraining can greatly improve online performance and how the degree of improvement increases with the expert's competence level. We also develop a practical, approximate informed TS algorithm through Bayesian bootstrapping and show substantial empirical regret reduction through experiments.

artificial intelligence, bayesian inference, machine learning, (15 more...)

2302.03319

Country:

North America > United States > California (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.84)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Errica, Federico, Niepert, Mathias

Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks

We introduce Graph-Induced Sum-Product Networks (GSPNs), a new probabilistic framework for graph representation learning that can tractably answer probabilistic queries. Inspired by the computational trees induced by vertices in the context of message-passing neural networks, we build hierarchies of sum-product networks (SPNs) where the parameters of a parent SPN are learnable transformations of the a-posterior mixing probabilities of its children's sum units. Due to weight sharing and the tree-shaped computation graphs of GSPNs, we obtain the efficiency and efficacy of deep graph networks with the additional advantages of a purely probabilistic model. We show the model's competitiveness on scarce supervision scenarios, handling missing data, and graph classification in comparison to popular neural models. We complement the experiments with qualitative analyses on hyper-parameters and the model's ability to answer probabilistic queries.

artificial intelligence, information management, machine learning, (18 more...)

2305.10544

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Materials > Chemicals (0.47)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
(4 more...)

Generating Bayesian Network Models from Data Using Tsetlin Machines

Blakely, Christian D.

Bayesian networks (BN) are directed acyclic graphical (DAG) models that have been adopted into many fields for their strengths in transparency, interpretability, probabilistic reasoning, and causal modeling. Given a set of data, one hurdle towards using BNs is in building the network graph from the data that properly handles dependencies, whether correlated or causal. In this paper, we propose an initial methodology for discovering network structures using Tsetlin Machines.

artificial intelligence, machine learning, node, (20 more...)

2305.10538

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)