AITopics

2401.15771

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Lemaire, Vincent, Boudec, Nathan Le, Guyomard, Victor, Fessant, Françoise

Viewing the process of generating counterfactuals as a source of knowledge

arXiv.org Artificial IntelligenceJan-30-2024

There are now many explainable AI methods for understanding the decisions of a machine learning model. Among these are those based on counterfactual reasoning, which involve simulating features changes and observing the impact on the prediction. This article proposes to view this simulation process as a source of creating a certain amount of knowledge that can be stored to be used, later, in different ways. This process is illustrated in the additive model and, more specifically, in the case of the naive Bayes classifier, whose interesting properties for this purpose are shown.

classifier, counterfactual, explanation, (13 more...)

2309.04284

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > France (0.05)

Genre: Research Report (0.40)

Industry: Telecommunications (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.90)

Schär, Philip, Habeck, Michael, Rudolf, Daniel

Parallel Affine Transformation Tuning of Markov Chain Monte Carlo

arXiv.org Machine LearningJan-29-2024

The performance of Markov chain Monte Carlo samplers strongly depends on the properties of the target distribution such as its covariance structure, the location of its probability mass and its tail behavior. We explore the use of bijective affine transformations of the sample space to improve the properties of the target distribution and thereby the performance of samplers running in the transformed space. In particular, we propose a flexible and user-friendly scheme for adaptively learning the affine transformation during sampling. Moreover, the combination of our scheme with Gibbsian polar slice sampling is shown to produce samples of high quality at comparatively low computational cost in several settings based on real-world data.

experiment, iteration, sampler, (12 more...)

arXiv.org Machine Learning

2401.16567

Country:

North America > United States > Wisconsin (0.04)
Europe > Germany (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

The Reasoning Under Uncertainty Trap: A Structural AI Risk

Pilditch, Toby D.

This report examines a novel risk associated with current (and projected) AI tools. Making effective decisions about future actions requires us to reason under uncertainty (RUU), and doing so is essential to many critical real world problems. Overfaced by this challenge, there is growing demand for AI tools like LLMs to assist decision-makers. Having evidenced this demand and the incentives behind it, we expose a growing risk: we 1) do not currently sufficiently understand LLM capabilities in this regard, and 2) have no guarantees of performance given fundamental computational explosiveness and deep uncertainty constraints on accuracy. This report provides an exposition of what makes RUU so challenging for both humans and machines, and relates these difficulties to prospective AI timelines and capabilities. Having established this current potential misuse risk, we go on to expose how this seemingly additive risk (more misuse additively contributed to potential harm) in fact has multiplicative properties. Specifically, we detail how this misuse risk connects to a wider network of underlying structural risks (e.g., shifting incentives, limited transparency, and feedback loops) to produce non-linear harms. We go on to provide a solutions roadmap that targets multiple leverage points in the structure of the problem. This includes recommendations for all involved actors (prospective users, developers, and policy-makers) and enfolds insights from areas including Decision-making Under Deep Uncertainty and complex systems theory. We argue this report serves not only to raise awareness (and subsequently mitigate/correct) of a current, novel AI risk, but also awareness of the underlying class of structural risks by illustrating how their interconnected nature poses twin-dangers of camouflaging their presence, whilst amplifying their potential effects.

ai tool, incentive, reasoning, (14 more...)

2402.01743

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
South America (0.04)
(12 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Ziaei, Navid, Nazari, Behzad, Yousefi, Ali

A Discriminative Bayesian Gaussian Process Latent Variable Model for High-Dimensional Data

Extracting meaningful information from high-dimensional data poses a formidable modeling challenge, particularly when the data is obscured by noise or represented through different modalities. In this research, we propose a novel non-parametric modeling approach, leveraging the Gaussian Process (GP), to characterize high-dimensional data by mapping it to a latent low-dimensional manifold. This model, named the Latent Discriminative Generative Decoder (LDGD), utilizes both the data (or its features) and associated labels (such as category or stimulus) in the manifold discovery process. To infer the latent variables, we derive a Bayesian solution, allowing LDGD to effectively capture inherent uncertainties in the data while enhancing the model's predictive accuracy and robustness. We demonstrate the application of LDGD on both synthetic and benchmark datasets. Not only does LDGD infer the manifold accurately, but its prediction accuracy in anticipating labels surpasses state-of-the-art approaches. We have introduced inducing points to reduce the computational complexity of Gaussian Processes (GPs) for large datasets. This enhancement facilitates batch training, allowing for more efficient processing and scalability in handling extensive data collections. Additionally, we illustrate that LDGD achieves higher accuracy in predicting labels and operates effectively with a limited training dataset, underscoring its efficiency and effectiveness in scenarios where data availability is constrained. These attributes set the stage for the development of non-parametric modeling approaches in the analysis of high-dimensional data; especially in fields where data are both high-dimensional and complex.

dataset, dimension, logp, (9 more...)

2401.16497

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Weng, Yidou, Doshi-Velez, Finale

Semi-parametric Expert Bayesian Network Learning with Gaussian Processes and Horseshoe Priors

This paper proposes a model learning Semi-parametric relationships in an Expert Bayesian Network (SEBN) with linear parameter and structure constraints. We use Gaussian Processes and a Horseshoe prior to introduce minimal nonlinear components. To prioritize modifying the expert graph over adding new edges, we optimize differential Horseshoe scales. In real-world datasets with unknown truth, we generate diverse graphs to accommodate user input, addressing identifiability issues and enhancing interpretability. Evaluation on synthetic and UCI Liver Disorders datasets, using metrics like structural Hamming Distance and test likelihood, demonstrates our models outperform state-of-the-art semi-parametric Bayesian Network model.

likelihood, node, test likelihood, (13 more...)

2401.16419

Country: Asia > Singapore (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Croisfelt, Victor, Pandey, Shashi Raj, Simeone, Osvaldo, Popovski, Petar

ARQ for Active Learning at the Edge

Conventional retransmission (ARQ) protocols are designed with the goal of ensuring the correct reception of all the individual transmitter's packets at the receiver. When the transmitter is a learner communicating with a teacher, this goal is at odds with the actual aim of the learner, which is that of eliciting the most relevant label information from the teacher. Taking an active learning perspective, this paper addresses the following key protocol design questions: (i) Active batch selection: Which batch of inputs should be sent to the teacher to acquire the most useful information and thus reduce the number of required communication rounds? (ii) Batch encoding: Can batches of data points be combined to reduce the communication resources required at each communication round? Specifically, this work introduces Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD), a novel protocol that integrates Bayesian active learning with compression via a linear mix-up mechanism. Comparisons with existing active learning protocols demonstrate the advantages of the proposed approach.

batch, communication round, learner, (13 more...)

2311.08053

Country:

Europe > Denmark > North Jutland > Aalborg (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Tzikas, Alexandros E., Romao, Licio, Pilanci, Mert, Abate, Alessandro, Kochenderfer, Mykel J.

Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers

arXiv.org Artificial IntelligenceJan-28-2024

Many machine learning applications require operating on a spatially distributed dataset. Despite technological advances, privacy considerations and communication constraints may prevent gathering the entire dataset in a central unit. In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers, which is commonly used in the optimization literature due to its fast convergence. In contrast to distributed optimization, distributed sampling allows for uncertainty quantification in Bayesian inference tasks. We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art. For our theoretical results, we use convex optimization tools to establish a fundamental inequality on the generated local sample iterates. This inequality enables us to show convergence of the distribution associated with these iterates to the underlying target distribution in Wasserstein distance. In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.

algorithm, d-admm, d-sghmc, (16 more...)

2401.15838

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Dossou, Bonaventure F. P.

A Study of Acquisition Functions for Medical Imaging Deep Active Learning

arXiv.org Artificial IntelligenceJan-28-2024

The Deep Learning revolution has enabled groundbreaking achievements in recent years. From breast cancer detection to protein folding, deep learning algorithms have been at the core of very important advancements. However, these modern advancements are becoming more and more data-hungry, especially on labeled data whose availability is scarce: this is even more prevalent in the medical context. In this work, we show how active learning could be very effective in data scarcity situations, where obtaining labeled data (or annotation budget is very limited). We compare several selection criteria (BALD, MeanSTD, and MaxEntropy) on the ISIC 2016 dataset. We also explored the effect of acquired pool size on the model's performance. Our results suggest that uncertainty is useful to the Melanoma detection task, and confirms the hypotheses of the author of the paper of interest, that \textit{bald} performs on average better than other acquisition functions. Our extended analyses however revealed that all acquisition functions perform badly on the positive (cancerous) samples, suggesting exploitation of class unbalance, which could be crucial in real-world settings. We finish by suggesting future work directions that would be useful to improve this current work. The code of our implementation is open-sourced at \url{https://github.com/bonaventuredossou/ece526_course_project}

acquisition function, learning, query size, (13 more...)

2401.15721

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > UAE (0.04)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.90)
Health & Medicine > Diagnostic Medicine > Imaging (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Artificial IntelligenceJan-28-2024

Deep Learning for Gamma-Ray Bursts: A data driven event framework for X/Gamma-Ray analysis in space telescopes

Crupi, Riccardo

The HERMES (High Energy Rapid Modular Ensemble of Satellites) Pathfinder mission serves as an in-orbit demonstration of a constellation of nanosatellites whose primary scientific purpose is to discover intense high-energy transients, such as gamma-ray bursts, across a broad energy range (few keV to few MeV) with unparalleled temporal precision and exact localisation. By 2024, the first constellation of six nanosatellites is expected to be launched. To fully exploit satellite data and allow faint astronomical events to emerge, a precise estimation of satellite background count rates is required to determine whether the event is statistically valid or not. The dynamics of the background are related to the satellite's orbital information, which varies in the order of minutes, potentially hiding long transient events. This work introduces two main contributions I have brought ahead; first a novel background estimator is presented that could potentially be fitted to any type of X/Gamma-ray satellite space telescope, capable of capturing long-term dynamics and accurate enough to detect faint transients. This estimator is built using a Neural Network and tested on data from the Fermi Gamma-ray Space Telescope's Gamma Burst Monitor (GBM). As a second objective, it is employed a trigger algorithm, called FOCuS (Functional Online CUSUM), to extract events from the background using the background estimator. The resulting framework, DeepGRB, can identify astronomical events that are both present and absent from the Fermi-GBM catalog. The analysis of the discovered events reveals the strengths and weaknesses of the framework.

neural network architecture, nuclear instrument and method, right ascension and declination, (17 more...)

2401.15632

Country:

Oceania > Australia (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
(15 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Energy (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)