AITopics

2310.04578

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Ontario (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Makinen, T. Lucas, Alsing, Justin, Wandelt, Benjamin D.

Fishnets: Information-Optimal, Scalable Aggregation for Sets and Graphs

Set-based learning is an essential component of modern deep learning and network science. Graph Neural Networks (GNNs) and their edge-free counterparts, Deep Sets, have proven remarkably useful on ragged and topologically challenging datasets. The key to learning informative embeddings for set members is a specified aggregation function, usually a sum, max, or mean. We propose Fishnets, an aggregation strategy for learning information-optimal embeddings for sets of data for both Bayesian inference and graph aggregation. We demonstrate that i) Fishnets neural summaries can be scaled optimally to an arbitrary number of data objects, ii) Fishnets aggregations are robust to changes in data distribution, unlike standard Deep Sets, iii) Fishnets saturate Bayesian information content and extend to regimes where MCMC techniques fail, and iv) Fishnets can be used as a drop-in aggregation scheme within GNNs. We show that by adopting a Fishnets aggregation scheme for message passing, GNNs can achieve state-of-the-art performance versus architecture size on ogbn-protein data over existing benchmarks with a fraction of learnable parameters and faster training time.
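As a rough illustration of the standard aggregation functions the abstract mentions (sum, max, mean), the sketch below pools per-member embeddings into a single set-level vector and checks that the result does not depend on member order. It is not the Fishnets scheme itself; the function name `aggregate` and the array shapes are assumptions made for this example.

```python
import numpy as np

def aggregate(embeddings: np.ndarray, how: str = "sum") -> np.ndarray:
    """Pool an (n_members, dim) array of embeddings into one (dim,) vector."""
    if how == "sum":
        return embeddings.sum(axis=0)
    if how == "mean":
        return embeddings.mean(axis=0)
    if how == "max":
        return embeddings.max(axis=0)
    raise ValueError(f"unknown aggregation: {how}")

# Hypothetical set of 5 members, each with a 3-dimensional embedding.
rng = np.random.default_rng(0)
members = rng.normal(size=(5, 3))
shuffled = members[rng.permutation(5)]

# Each pooling rule is permutation-invariant: reordering the set
# members leaves the pooled vector unchanged.
for how in ("sum", "mean", "max"):
    assert np.allclose(aggregate(members, how), aggregate(shuffled, how))
```

Permutation invariance is what lets the same pooled summary be computed for sets of any size or ordering, which is why the choice of aggregation rule matters for set and graph models.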

aggregation, artificial intelligence, machine learning, (18 more...)

2310.03812

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Renaud, Marien, Liu, Jiaming, de Bortoli, Valentin, Almansa, Andrés, Kamilov, Ulugbek S.

Plug-and-Play Posterior Sampling under Mismatched Measurement and Prior Models

Many imaging problems can be formulated as inverse problems seeking to recover high-quality images from their low-quality observations. Such problems arise across the fields of biomedical imaging (McCann et al., 2017a), computer vision (Pizlo, 2001), and computational imaging (Ongie et al., 2020). Since imaging inverse problems are generally ill-posed, it is common to apply prior models on the desired images. There has been significant progress in developing Deep Learning (DL) based image priors, where a deep model is trained to directly map degraded observations to images (McCann et al., 2017b; Jin et al., 2017; Li et al., 2020). Model-based DL (MBDL) is an alternative to traditional DL that explicitly uses knowledge of the forward model by integrating DL denoisers as implicit priors into model-based optimization algorithms (Venkatakrishnan et al., 2013; Romano et al., 2017). It has been generally observed that learned denoisers are essential for achieving state-of-the-art results in many imaging contexts (Metzler et al., 2018; Ulondu-Mendes et al., 2023; Ryu et al., 2019; Hurault et al., 2022; Wu et al., 2020). However, most prior work in the area has focused on methods that can only produce point estimates without any quantification of the reconstruction uncertainty (Belhasin et al., 2023), which can be essential in critical applications such as healthcare or security (Liu et al., 2023). In recent years, strategies for sampling from the posterior distribution have become a focal point in the field of imaging inverse problems (Pereyra et al., 2015; Bouman & Buzzard, 2023; Chung et al., 2023; Song et al., 2022). This pursuit has given rise to a plethora of techniques, encompassing well-established methods such as Gibbs sampling (Coeurdoux et al., 2023), the Unadjusted Langevin Algorithm

artificial intelligence, bayesian inference, machine learning, (21 more...)

2310.03546

Country:

North America > United States (0.04)
Europe > France (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Magris, Martin, Iosifidis, Alexandros

Variational Inference for GARCH-family Models

The Bayesian estimation of GARCH-family models has been typically addressed through Monte Carlo sampling. Variational Inference is gaining popularity and attention as a robust approach for Bayesian inference in complex machine learning models; however, its adoption in econometrics and finance is limited. This paper discusses the extent to which Variational Inference constitutes a reliable and feasible alternative to Monte Carlo sampling for Bayesian inference in GARCH-like models. Through a large-scale experiment involving the constituents of the S&P 500 index, several Variational Inference optimizers, a variety of volatility models, and a case study, we show that Variational Inference is an attractive, remarkably well-calibrated, and competitive method for Bayesian learning.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2310.03435

Country:

Europe > Iceland > Capital Region > Reykjavik (0.04)
Europe > Denmark > Central Jutland > Aarhus (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Maximum Likelihood Estimation of Latent Variable Structural Equation Models: A Neural Network Approach

Saremi, Mehrzad

artificial intelligence, latent variable structural equation model, machine learning, (3 more...)

We propose a graphical structure for structural equation models that is stable under marginalization, given linearity and Gaussianity assumptions. We show that computing the maximum likelihood estimate of this model is equivalent to training a neural network. We implement a GPU-based algorithm that computes the maximum likelihood estimates of these models.

2309.14073

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hinrich, Jesper Løve, Mørup, Morten

Probabilistic Block Term Decomposition for the Modelling of Higher-Order Arrays

arXiv.org Machine Learning, Oct-4-2023

Tensors, or multi-way arrays, naturally occur in practically all areas of science, including psychology (e.g., human responses to questionnaire data according to scoring criteria of different objects), chemometrics (e.g., excitation and emission spectra across samples), biology (e.g., genetic expression of cell profiles across time and experimental conditions), and knowledge representations (e.g., entity-entity relationships across predicates); see also [1] and references therein. To analyze these multi-way arrays while accounting for their higher-order structure, tensor decompositions have become important tools for characterizing and discovering structure in these data; see [2, 1] for details. Tensor decompositions have historically focused on maximum likelihood estimation methods to obtain a point estimate of the decomposition, predominantly based on a Gaussian likelihood (least squares estimation). Recently, there has been a rise in the development of Bayesian inference for tensor data, initially focusing on binary or count data, but now applied more broadly to various types of data; for an overview see [3, 4]. The benefits of a Bayesian approach are that it characterizes the decomposition solution as a distribution, the so-called posterior distribution, which allows characterization of the uncertainty, whereas priors act as regularizers, adding robustness and preventing issues of degeneracy. Additionally, it provides a principled way to incorporate a priori information. For a review on maximum likelihood based and Bayesian tensor decomposition, see [2] and [3], respectively. The two most common tensor decomposition methods are the Canonical Polyadic Decomposition/PARAFAC (CPD) and the Tucker model. The CPD model represents the data through a sum of outer product rank-1 terms (i.e., separate multi-linear structures), whereas Tucker uses a multi-linear rank decomposition (i.e., with "connected" multi-linear structures).
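The difference between the CPD and Tucker models described above can be sketched for a three-way array with NumPy. This is an illustrative reconstruction only, not the probabilistic block term decomposition of the paper; the dimensions, ranks, and variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 4, 5, 6   # assumed tensor dimensions
R = 3               # assumed CPD rank

# CPD: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))
X_cpd = np.einsum("ir,jr,kr->ijk", A, B, C)

# The same tensor written explicitly as a sum of R rank-1 outer products.
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(R)
)

# Tucker: a core tensor G couples every combination of factor columns.
P, Q, S = 2, 3, 2   # assumed multi-linear ranks
G = rng.normal(size=(P, Q, S))
U = rng.normal(size=(I, P))
V = rng.normal(size=(J, Q))
W = rng.normal(size=(K, S))
X_tucker = np.einsum("pqs,ip,jq,ks->ijk", G, U, V, W)

# CPD is the special case of Tucker whose core is superdiagonal,
# so its components do not interact with one another.
G_diag = np.zeros((R, R, R))
G_diag[np.arange(R), np.arange(R), np.arange(R)] = 1.0
X_cpd_as_tucker = np.einsum("pqs,ip,jq,ks->ijk", G_diag, A, B, C)
```

The off-diagonal entries of the Tucker core `G` are what make its multi-linear structures "connected"; setting them to zero recovers the separate rank-1 terms of CPD.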

artificial intelligence, bayesian inference, machine learning, (16 more...)

2310.02694

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Núñez-Molina, Carlos, Mesejo, Pablo, Fernández-Olivares, Juan

Towards a Unified Framework for Sequential Decision Making

In recent years, the integration of Automated Planning (AP) and Reinforcement Learning (RL) has seen a surge of interest. To perform this integration, a general framework for Sequential Decision Making (SDM) would prove immensely useful, as it would help us understand how AP and RL fit together. In this preliminary work, we attempt to provide such a framework, suitable for any method ranging from Classical Planning to Deep RL, by drawing on concepts from Probability Theory and Bayesian inference. We formulate an SDM task as a set of training and test Markov Decision Processes (MDPs), to account for generalization. We provide a general algorithm for SDM which we hypothesize every SDM method is based on. According to it, every SDM algorithm can be seen as a procedure that iteratively improves its solution estimate by leveraging the task knowledge available. Finally, we derive a set of formulas and algorithms for calculating interesting properties of SDM tasks and methods, which make possible their empirical evaluation and comparison.

algorithm, knowledge, sdm algorithm, (15 more...)

2310.02167

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems

Deb, Aniruddha, Oza, Neeva, Singla, Sarthak, Khandelwal, Dinesh, Garg, Dinesh, Singla, Parag

While forward reasoning (i.e., finding the answer given the question) has been explored extensively in the recent literature, backward reasoning is relatively unexplored. We examine the backward reasoning capabilities of LLMs on Math Word Problems (MWPs): given a mathematical question and its answer, with some details omitted from the question, can LLMs effectively retrieve the missing information? In this paper, we formally define the backward reasoning task on math word problems and modify three datasets to evaluate this task: GSM8k, SVAMP and MultiArith. Our findings show a significant drop in the accuracy of models on backward reasoning compared to forward reasoning across four SOTA LLMs (GPT4, GPT3.5, PaLM-2, and LLaMa-2). Utilizing the specific format of this task, we propose three novel techniques that improve performance: Rephrase reformulates the given problem into a forward reasoning problem, PAL-Tools combines the idea of Program-Aided LLMs to produce a set of equations that can be solved by an external solver, and Check your Work exploits the availability of a natural, high-accuracy verifier in the forward direction, interleaving solving and verification steps. Finally, realizing that each of our base methods correctly solves a different set of problems, we propose a novel Bayesian formulation for creating an ensemble over these base methods aided by a verifier to further boost the accuracy by a significant margin. Extensive experimentation demonstrates that our techniques successively improve the performance of LLMs on the backward reasoning task, with the final ensemble-based method resulting in a substantial performance gain compared to the raw LLMs with standard prompting techniques such as chain-of-thought.

accuracy, equation, reasoning, (16 more...)

2310.01991

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Wicker, Matthew, Laurenti, Luca, Patane, Andrea, Paoletti, Nicola, Abate, Alessandro, Kwiatkowska, Marta

Probabilistic Reach-Avoid for Bayesian Neural Networks

Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by a Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a "target" state, while avoiding a set of "unsafe" states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximizing said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterized by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies, the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability.
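To make the backward recursion idea concrete, the sketch below computes exact finite-horizon reach-avoid probabilities for a small Markov chain with a known transition matrix. This is a deliberately simplified tabular analogue, not the paper's method: it involves no BNN and no interval propagation, and the function name, the transition matrix, and the state labels are all assumptions made for illustration.

```python
import numpy as np

def reach_avoid_probability(P, goal, unsafe, horizon):
    """Probability, per start state, of reaching `goal` within `horizon`
    steps without first entering `unsafe`, for a chain with matrix P."""
    n = P.shape[0]
    goal_mask = np.zeros(n, dtype=bool)
    goal_mask[list(goal)] = True
    unsafe_mask = np.zeros(n, dtype=bool)
    unsafe_mask[list(unsafe)] = True

    # With zero steps remaining, only goal states satisfy the specification.
    V = goal_mask.astype(float)
    for _ in range(horizon):
        # One backward step: expected value of the next-step probabilities.
        V = P @ V
        V[goal_mask] = 1.0     # already at the target
        V[unsafe_mask] = 0.0   # specification already violated
    return V

# Hypothetical 4-state chain: 0 = start, 1 = intermediate, 2 = goal, 3 = unsafe.
P = np.array([
    [0.0, 0.8, 0.0, 0.2],
    [0.0, 0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
probs = reach_avoid_probability(P, goal={2}, unsafe={3}, horizon=2)
```

In this toy chain the start state reaches the goal within two steps with probability 0.8 × 0.9 = 0.72. When the dynamics are only known through a learned model, the exact values computed here would be replaced by bounds, which is the setting the abstract addresses.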

international conference, neural network, probability, (17 more...)

2310.01951

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Austria > Vienna (0.14)
Europe > Netherlands > South Holland > Delft (0.04)

Genre:

Workflow (0.67)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bayesian Personalized Federated Learning with Shared and Personalized Uncertainty Representations

Chen, Hui, Liu, Hengyu, Cao, Longbing, Zhang, Tiancheng

Bayesian personalized federated learning (BPFL) addresses challenges in existing personalized FL (PFL). BPFL aims to quantify the uncertainty and heterogeneity within and across clients, working towards uncertainty representations that address the statistical heterogeneity of client data. In PFL, some recent preliminary work proposes to decompose hidden neural representations into shared and local components and demonstrates interesting results. However, most of these works do not address client uncertainty and heterogeneity in FL systems, while appropriately decoupling neural representations is challenging and often ad hoc. In this paper, we make the first attempt to introduce a general BPFL framework to decompose and jointly learn shared and personalized uncertainty representations on statistically heterogeneous client data over time. A Bayesian federated neural network, BPFed, instantiates BPFL by jointly learning cross-client shared uncertainty and client-specific personalized uncertainty over statistically heterogeneous and randomly participating clients. We further incorporate continual updating of the prior distribution in BPFed to speed up convergence and avoid catastrophic forgetting. Theoretical analysis and guarantees are provided in addition to the experimental evaluation of BPFed against diverse baselines.

bpfed, federated learning, representation, (15 more...)

2309.15499

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)