AITopics

2406.19524

Country:

North America > United States > Illinois > Cook County > Chicago (0.25)
Europe > Austria > Vienna (0.14)
North America > United States > California > Alameda County > Livermore (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Epidemiology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine > Therapeutic Area (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bhargav, Jayanth, Ghasemi, Mahsa, Sundaram, Shreyas

Submodular Information Selection for Hypothesis Testing with Misclassification Penalties

arXiv.org Machine LearningJun-27-2024

We consider the problem of selecting an optimal subset of information sources for a hypothesis testing/classification task where the goal is to identify the true state of the world from a finite set of hypotheses, based on finite observation samples from the sources. In order to characterize the learning performance, we propose a misclassification penalty framework, which enables nonuniform treatment of different misclassification errors. In a centralized Bayesian learning setting, we study two variants of the subset selection problem: (i) selecting a minimum cost information set to ensure that the maximum penalty of misclassifying the true hypothesis is below a desired bound and (ii) selecting an optimal information set under a limited budget to minimize the maximum penalty of misclassifying the true hypothesis. Under certain assumptions, we prove that the objective (or constraints) of these combinatorial optimization problems are weak (or approximate) submodular, and establish high-probability performance guarantees for greedy algorithms. Further, we propose an alternate metric for information set selection which is based on the total penalty of misclassification. We prove that this metric is submodular and establish near-optimal guarantees for the greedy algorithms for both the information set selection problems. Finally, we present numerical simulations to validate our theoretical results over several randomly generated instances.

information, information source, penalty, (15 more...)

2405.1093

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry:

Transportation (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.76)

Yuan, Wei, Wang, Guanyang

Markov chain Monte Carlo without evaluating the target: an auxiliary variable approach

arXiv.org Machine LearningJun-27-2024

In sampling tasks, it is common for target distributions to be known up to a normalising constant. However, in many situations, evaluating even the unnormalised distribution can be costly or infeasible. This issue arises in scenarios such as sampling from the Bayesian posterior for tall datasets and the 'doubly-intractable' distributions. In this paper, we begin by observing that seemingly different Markov chain Monte Carlo (MCMC) algorithms, such as the exchange algorithm, PoissonMH, and TunaMH, can be unified under a simple common procedure. We then extend this procedure into a novel framework that allows the use of auxiliary variables in both the proposal and acceptance-rejection steps. We develop the theory of the new framework, applying it to existing algorithms to simplify and extend their results. Several new algorithms emerge from this framework, with improved performance demonstrated on both synthetic and real datasets.

acceptance rate, algorithm, poissonmh, (16 more...)

2406.05242

Country:

North America > United States > Florida > Hillsborough County > University (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Popović, Mladen, Dhali, Maruf A., Schomaker, Lambert, van der Plicht, Johannes, Rasmussen, Kaare Lund, La Nasa, Jacopo, Degano, Ilaria, Colombini, Maria Perla, Tigchelaar, Eibert

Dating ancient manuscripts using radiocarbon and AI-based writing style analysis

arXiv.org Artificial IntelligenceJun-26-2024

Determining the chronology of ancient handwritten manuscripts is essential for reconstructing the evolution of ideas. For the Dead Sea Scrolls, this is particularly important. However, there is an almost complete lack of date-bearing manuscripts evenly distributed across the timeline and written in similar scripts available for palaeographic comparison. Here, we present Enoch, a state-of-the-art AI-based date-prediction model, trained on the basis of new radiocarbon-dated samples of the scrolls. Enoch uses established handwriting-style descriptors and applies Bayesian ridge regression. The challenge of this study is that the number of radiocarbon-dated manuscripts is small, while current machine learning requires an abundance of training data. We show that by using combined angular and allographic writing style feature vectors and applying Bayesian ridge regression, Enoch could predict the radiocarbon-based dates from style, supported by leave-one-out validation, with varied MAEs of 27.9 to 30.7 years relative to the radiocarbon dating. Enoch was then used to estimate the dates of 135 unseen manuscripts, revealing that 79 per cent of the samples were considered 'realistic' upon palaeographic post-hoc evaluation. We present a new chronology of the scrolls. The radiocarbon ranges and Enoch's style-based predictions are often older than the traditionally assumed palaeographic estimates. In the range of 300-50 BCE, Enoch's date prediction provides an improved granularity. The study is in line with current developments in multimodal machine-learning techniques, and the methods can be used for date prediction in other partially-dated manuscript collections. This research shows how Enoch's quantitative, probability-based approach can be a tool for palaeographers and historians, re-dating ancient Jewish key texts and contributing to current debates on Jewish and Christian origins.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2407.12013

Country:

North America > United States (1.00)
Europe (1.00)
Asia > Middle East (0.67)

Genre: Research Report > New Finding (1.00)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (1.00)
Energy > Oil & Gas (0.92)
Health & Medicine > Diagnostic Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Zammit-Mangion, Andrew, Sainsbury-Dale, Matthew, Huser, Raphaël

Neural Methods for Amortised Inference

arXiv.org Machine LearningJun-26-2024

Simulation-based methods for statistical inference have evolved dramatically over the past 50 years, keeping pace with technological advancements. The field is undergoing a new revolution as it embraces the representational capacity of neural networks, optimisation libraries and graphics processing units for learning complex mappings between data and inferential targets. The resulting tools are amortised, in the sense that they allow rapid inference through fast feedforward operations. In this article we review recent progress in the context of point estimation, approximate Bayesian inference, summary-statistic construction, and likelihood approximation. We also cover software, and include a simple illustration to showcase the wide array of tools available for amortised inference and the benefits they offer over Markov chain Monte Carlo methods. The article concludes with an overview of relevant topics and an outlook on future research directions.

inference, neural network, posterior distribution, (15 more...)

2404.12484

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(11 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Enhancing Explainability of Knowledge Learning Paths: Causal Knowledge Networks

Wei, Yuang, Zhou, Yizhou, Jiang, Yuan-Hao, Jiang, Bo

A reliable knowledge structure is a prerequisite for building effective adaptive learning systems and intelligent tutoring systems. Pursuing an explainable and trustworthy knowledge structure, we propose a method for constructing causal knowledge networks. This approach leverages Bayesian networks as a foundation and incorporates causal relationship analysis to derive a causal network. Additionally, we introduce a dependable knowledge-learning path recommendation technique built upon this framework, improving teaching and learning quality while maintaining transparency in the decision-making process.

causal relationship, experiment, knowledge component, (15 more...)

2406.17518

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > Singapore (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre:

Research Report (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Towards Compositional Interpretability for XAI

Tull, Sean, Lorenz, Robin, Clark, Stephen, Khan, Ilyas, Coecke, Bob

Artificial intelligence (AI) is currently based largely on black-box machine learning models which lack interpretability. The field of eXplainable AI (XAI) strives to address this major concern, being critical in high-stakes areas such as the finance, legal and health sectors. We present an approach to defining AI models and their interpretability based on category theory. For this we employ the notion of a compositional model, which sees a model in terms of formal string diagrams which capture its abstract structure together with its concrete implementation. This comprehensive view incorporates deterministic, probabilistic and quantum models. We compare a wide range of AI models as compositional models, including linear and rule-based models, (recurrent) neural networks, transformers, VAEs, and causal and DisCoCirc models. Next we give a definition of interpretation of a model in terms of its compositional structure, demonstrating how to analyse the interpretability of a model, and using this to clarify common themes in XAI. We find that what makes the standard 'intrinsically interpretable' models so transparent is brought out most clearly diagrammatically. This leads us to the more general notion of compositionally-interpretable (CI) models, which additionally include, for instance, causal, conceptual space, and DisCoCirc models. We next demonstrate the explainability benefits of CI models. Firstly, their compositional structure may allow the computation of other quantities of interest, and may facilitate inference from the model to the modelled phenomenon by matching its structure. Secondly, they allow for diagrammatic explanations for their behaviour, based on influence constraints, diagram surgery and rewrite explanations. Finally, we discuss many future directions for the approach, raising the question of how to learn such meaningfully structured models in practice.

abstract interpretation, compositionality, intrinsically interpretable model, (13 more...)

2406.17583

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
Europe > Switzerland > Zürich > Zürich (0.13)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(8 more...)

Genre:

Overview (1.00)
Research Report (0.81)

Industry:

Health & Medicine (1.00)
Media > Film (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.87)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Sherkatghanad, Zeinab, Abdar, Moloud, Bakhtyari, Mohammadreza, Makarenkov, Vladimir

BayTTA: Uncertainty-aware medical image classification with optimized test-time augmentation using Bayesian model averaging

Test-time augmentation (TTA) is a well-known technique employed during the testing phase of computer vision tasks. It involves aggregating multiple augmented versions of input data. Combining predictions using a simple average formulation is a common and straightforward approach after performing TTA. This paper introduces a novel framework for optimizing TTA, called BayTTA (Bayesian-based TTA), which is based on Bayesian Model Averaging (BMA). First, we generate a model list associated with different variations of the input data created through TTA. Then, we use BMA to combine model predictions weighted by their respective posterior probabilities. Such an approach allows one to take into account model uncertainty, and thus to enhance the predictive performance of the related machine learning or deep learning model. We evaluate the performance of BayTTA on various public data, including three medical image datasets comprising skin cancer, breast cancer, and chest X-ray images and two well-known gene editing datasets, CRISPOR and GUIDE-seq. Our experimental results indicate that BayTTA can be effectively integrated into state-of-the-art deep learning models used in medical image analysis as well as into some popular pre-trained CNN models such as VGG-16, MobileNetV2, DenseNet201, ResNet152V2, and InceptionRes-NetV2, leading to the enhancement in their accuracy and robustness performance.

augmentation, dataset, prediction, (15 more...)

2406.1764

Country:

North America > Canada > Quebec > Montreal (0.04)
Oceania > Australia (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > France (0.04)

Genre:

Research Report > New Finding (0.94)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(2 more...)

Constructing structured tensor priors for Bayesian inverse problems

Batselier, Kim

Specifying a prior distribution is an essential part of solving Bayesian inverse problems. The prior encodes a belief on the nature of the solution and this regularizes the problem. In this article we completely characterize a Gaussian prior that encodes the belief that the solution is a structured tensor. We first define the notion of (A,b)-constrained tensors and show that they describe a large variety of different structures such as Hankel, circulant, triangular, symmetric, and so on. Then we completely characterize the Gaussian probability distribution of such tensors by specifying its mean vector and covariance matrix. Furthermore, explicit expressions are proved for the covariance matrix of tensors whose entries are invariant under a permutation. These results unlock a whole new class of priors for Bayesian inverse problems. We illustrate how new kernel functions can be designed and efficiently computed and apply our results on two particular Bayesian inverse problems: completing a Hankel matrix from a few noisy measurements and learning an image classifier of handwritten digits. The effectiveness of the proposed priors is demonstrated for both problems. All applications have been implemented as reactive Pluto notebooks in Julia.

matrix, permutation matrix, tensor, (16 more...)

2406.17597

Country:

Europe > Netherlands > South Holland > Delft (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Chen, Yifan, Huang, Daniel Zhengyu, Huang, Jiaoyang, Reich, Sebastian, Stuart, Andrew M.

Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows

In this paper, we study efficient approximate sampling for probability distributions known up to normalization constants. We specifically focus on a problem class arising in Bayesian inference for large-scale inverse problems in science and engineering applications. The computational challenges we address with the proposed methodology are: (i) the need for repeated evaluations of expensive forward models; (ii) the potential existence of multiple modes; and (iii) the fact that gradient of, or adjoint solver for, the forward model might not be feasible. While existing Bayesian inference methods meet some of these challenges individually, we propose a framework that tackles all three systematically. Our approach builds upon the Fisher-Rao gradient flow in probability space, yielding a dynamical system for probability densities that converges towards the target distribution at a uniform exponential rate. This rapid convergence is advantageous for the computational burden outlined in (i). We apply Gaussian mixture approximations with operator splitting techniques to simulate the flow numerically; the resulting approximation can capture multiple modes thus addressing (ii). Furthermore, we employ the Kalman methodology to facilitate a derivative-free update of these Gaussian components and their respective weights, addressing the issue in (iii). The proposed methodology results in an efficient derivative-free sampler flexible enough to handle multi-modal distributions: Gaussian Mixture Kalman Inversion (GMKI). The effectiveness of GMKI is demonstrated both theoretically and numerically in several experiments with multimodal target distributions, including proof-of-concept and two-dimensional examples, as well as a large-scale application: recovering the Navier-Stokes initial condition from solution data at positive times.

approximation, gmki, gradient flow, (14 more...)

2406.17263

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > Germany > Brandenburg > Potsdam (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)