AITopics

2408.05058

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Washington > King County > Seattle (0.14)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Zhu, Rong J. B., Qiu, Yanqi

UCB Exploration for Fixed-Budget Bayesian Best Arm Identification

arXiv.org Machine LearningAug-9-2024

We study best-arm identification (BAI) in the fixed-budget setting. Adaptive allocations based on upper confidence bounds (UCBs), such as UCBE, are known to work well in BAI. However, it is well-known that its optimal regret is theoretically dependent on instances, which we show to be an artifact in many fixed-budget BAI problems. In this paper we propose an UCB exploration algorithm that is both theoretically and empirically efficient for the fixed budget BAI problem under a Bayesian setting. The key idea is to learn prior information, which can enhance the performance of UCB-based BAI algorithm as it has done in the cumulative regret minimization problem. We establish bounds on the failure probability and the simple regret for the Bayesian BAI problem, providing upper bounds of order $\tilde{O}(\sqrt{K/n})$, up to logarithmic factors, where $n$ represents the budget and $K$ denotes the number of arms. Furthermore, we demonstrate through empirical results that our approach consistently outperforms state-of-the-art baselines.

algorithm, failure probability, probability, (14 more...)

2408.04869

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Che, Ethan, Jiang, Daniel R., Namkoong, Hongseok, Wang, Jimmy

Mathematical Programming For Adaptive Experiments

arXiv.org Artificial IntelligenceAug-8-2024

Adaptive experimentation can significantly improve statistical power, but standard algorithms overlook important practical issues including batched and delayed feedback, personalization, non-stationarity, multiple objectives, and constraints. To address these issues, the current algorithm design paradigm crafts tailored methods for each problem instance. Since it is infeasible to devise novel algorithms for every real-world instance, practitioners often have to resort to suboptimal approximations that do not address all of their challenges. Moving away from developing bespoke algorithms for each setting, we present a mathematical programming view of adaptive experimentation that can flexibly incorporate a wide range of objectives, constraints, and statistical procedures. By formulating a dynamic program in the batched limit, our modeling framework enables the use of scalable optimization methods (e.g., SGD and auto-differentiation) to solve for treatment allocations. We evaluate our framework on benchmarks modeled after practical challenges such as non-stationarity, personalization, multi-objectives, and constraints. Unlike bespoke algorithms such as modified variants of Thomson sampling, our mathematical programming approach provides remarkably robust performance across instances.

allocation, experiment, objective, (17 more...)

2408.0457

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

arXiv.org Artificial IntelligenceAug-8-2024

Knowledge-Aided Semantic Communication Leveraging Probabilistic Graphical Modeling

Wan, Haowen, Yang, Qianqian, Tang, Jiancheng, shi, Zhiguo

In this paper, we propose a semantic communication approach based on probabilistic graphical model (PGM). The proposed approach involves constructing a PGM from a training dataset, which is then shared as common knowledge between the transmitter and receiver. We evaluate the importance of various semantic features and present a PGM-based compression algorithm designed to eliminate predictable portions of semantic information. Furthermore, we introduce a technique to reconstruct the discarded semantic information at the receiver end, generating approximate results based on the PGM. Simulation results indicate a significant improvement in transmission efficiency over existing methods, while maintaining the quality of the transmitted images.

pgm, semantic information, transmission, (16 more...)

2408.04499

Country:

North America > United States > Washington > King County > Seattle (0.05)
Asia > China > Zhejiang Province > Hangzhou (0.05)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Schneider, Felix, Papaioannou, Iason, Sudret, Bruno, Müller, Gerhard

Maximum a Posteriori Estimation for Linear Structural Dynamics Models Using Bayesian Optimization with Rational Polynomial Chaos Expansions

arXiv.org Machine LearningAug-7-2024

Bayesian analysis enables combining prior knowledge with measurement data to learn model parameters. Commonly, one resorts to computing the maximum a posteriori (MAP) estimate, when only a point estimate of the parameters is of interest. We apply MAP estimation in the context of structural dynamic models, where the system response can be described by the frequency response function. To alleviate high computational demands from repeated expensive model calls, we utilize a rational polynomial chaos expansion (RPCE) surrogate model that expresses the system frequency response as a rational of two polynomials with complex coefficients. We propose an extension to an existing sparse Bayesian learning approach for RPCE based on Laplace's approximation for the posterior distribution of the denominator coefficients. Furthermore, we introduce a Bayesian optimization approach, which allows to adaptively enrich the experimental design throughout the optimization process of MAP estimation. Thereby, we utilize the expected improvement acquisition function as a means to identify sample points in the input space that are possibly associated with large objective function values. The acquisition function is estimated through Monte Carlo sampling based on the posterior distribution of the expansion coefficients identified in the sparse Bayesian learning process. By combining the sparsity-inducing learning procedure with the sequential experimental design, we effectively reduce the number of model evaluations in the MAP estimation problem. We demonstrate the applicability of the presented methods on the parameter updating problem of an algebraic two-degree-of-freedom system and the finite element model of a cross-laminated timber plate.

coefficient, experimental design, surrogate model, (16 more...)

2408.03569

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningAug-7-2024

Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling

Xu, Jian, Lin, Zhiqi, Li, Shigui, Chen, Min, Yang, Junmei, Zeng, Delu, Paisley, John

Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in Bayesian Last Layer (BLL) models limits their expressive capacity when faced with non-Gaussian, outlier-rich, or high-dimensional datasets. To address this shortfall, we introduce a novel approach that combines diffusion techniques and implicit priors for variational learning of Bayesian last layer weights. This method leverages implicit distributions for modeling weight priors in BLL, coupled with diffusion samplers for approximating true posterior predictions, thereby establishing a comprehensive Bayesian prior and posterior estimation strategy. By delivering an explicit and computationally efficient variational lower bound, our method aims to augment the expressive abilities of BLL models, enhancing model accuracy, calibration, and out-of-distribution detection proficiency. Through detailed exploration and experimental validation, We showcase the method's potential for improving predictive accuracy and uncertainty quantification while ensuring computational efficiency.

dataset, learning, neural network, (14 more...)

2408.03746

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Ye, Naimeng, Yang, Hanming, Siah, Andrew, Namkoong, Hongseok

Pre-training and in-context learning IS Bayesian inference a la De Finetti

arXiv.org Machine LearningAug-6-2024

Accurately gauging uncertainty on the underlying environment is a longstanding goal of intelligent systems. We characterize which latent concepts pre-trained sequence models are naturally able to reason with. We go back to De Finetti's predictive view of Bayesian reasoning: instead of modeling latent parameters through priors and likelihoods like topic models do, De Finetti has long advocated for modeling exchangeable (permutation invariant) sequences of observables. According to this view, pre-training autoregressive models formulates informed beliefs based on prior observations ("empirical Bayes"), and forward generation is a simulated instantiation of an environment ("posterior inference"). This connection allows extending in-context learning (ICL) beyond predictive settings, highlighting sequence models' ability to perform explicit statistical inference. In particular, we show the sequence prediction loss over exchangeable documents controls performance on downstream tasks where uncertainty quantification is key. Empirically, we propose and demonstrate several approaches for encoding exchangeability in sequence model architectures: data augmentation, regularization, and causal masking.

inference, sequence, sequence model, (15 more...)

2408.03307

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

arXiv.org Artificial IntelligenceAug-5-2024

Bayesian Kolmogorov Arnold Networks (Bayesian_KANs): A Probabilistic Approach to Enhance Accuracy and Interpretability

Hassan, Masoud Muhammed

Because of its strong predictive skills, deep learning has emerged as an essential tool in many industries, including healthcare. Traditional deep learning models, on the other hand, frequently lack interpretability and omit to take prediction uncertainty into account two crucial components of clinical decision making. In order to produce explainable and uncertainty aware predictions, this study presents a novel framework called Bayesian Kolmogorov Arnold Networks (BKANs), which combines the expressive capacity of Kolmogorov Arnold Networks with Bayesian inference. We employ BKANs on two medical datasets, which are widely used benchmarks for assessing machine learning models in medical diagnostics: the Pima Indians Diabetes dataset and the Cleveland Heart Disease dataset. Our method provides useful insights into prediction confidence and decision boundaries and outperforms traditional deep learning models in terms of prediction accuracy. Moreover, BKANs' capacity to represent aleatoric and epistemic uncertainty guarantees doctors receive more solid and trustworthy decision support. Our Bayesian strategy improves the interpretability of the model and considerably minimises overfitting, which is important for tiny and imbalanced medical datasets, according to experimental results. We present possible expansions to further use BKANs in more complicated multimodal datasets and address the significance of these discoveries for future research in building reliable AI systems for healthcare. This work paves the way for a new paradigm in deep learning model deployment in vital sectors where transparency and reliability are crucial.

bayesian-kan, kolmogorov-arnold network, prediction, (15 more...)

2408.02706

Country: Asia > Middle East > Iraq > Kurdistan Region > Duhok Governorate > Zakho (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.73)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Ferrer, Luciana, Ramos, Daniel

Evaluating Posterior Probabilities: Decision Theory, Proper Scoring Rules, and Calibration

arXiv.org Machine LearningAug-5-2024

Most machine learning classifiers are designed to output posterior probabilities for the classes given the input sample. These probabilities may be used to make the categorical decision on the class of the sample; provided as input to a downstream system; or provided to a human for interpretation. Evaluating the quality of the posteriors generated by these system is an essential problem which was addressed decades ago with the invention of proper scoring rules (PSRs). Unfortunately, much of the recent machine learning literature uses calibration metrics -- most commonly, the expected calibration error (ECE) -- as a proxy to assess posterior performance. The problem with this approach is that calibration metrics reflect only one aspect of the quality of the posteriors, ignoring the discrimination performance. For this reason, we argue that calibration metrics should play no role in the assessment of posterior quality. Expected PSRs should instead be used for this job, preferably normalized for ease of interpretation. In this work, we first give a brief review of PSRs from a practical perspective, motivating their definition using Bayes decision theory. We discuss why expected PSRs provide a principled measure of the quality of a system's posteriors and why calibration metrics are not the right tool for this job. We argue that calibration metrics, while not useful for performance assessment, may be used as diagnostic tools during system development. With this purpose in mind, we discuss a simple and practical calibration metric, called calibration loss, derived from a decomposition of expected PSRs. We compare this metric with the ECE and with the expected score divergence calibration metric from the PSR literature and argue, using theoretical and empirical evidence, that calibration loss is superior to these two metrics.

calibration, calibration metric, posterior, (15 more...)

2408.02841

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(8 more...)

Genre:

Overview (0.87)
Research Report (0.82)

Industry:

Education (0.86)
Leisure & Entertainment > Games (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

arXiv.org Artificial IntelligenceAug-4-2024

Generalized Maximum Likelihood Estimation for Perspective-n-Point Problem

Zhan, Tian, Xu, Chunfeng, Zhang, Cheng, Zhu, Ke

The Perspective-n-Point (PnP) problem has been widely studied in the literature and applied in various vision-based pose estimation scenarios. However, existing methods ignore the anisotropy uncertainty of observations, as demonstrated in several real-world datasets in this paper. This oversight may lead to suboptimal and inaccurate estimation, particularly in the presence of noisy observations. To this end, we propose a generalized maximum likelihood PnP solver, named GMLPnP, that minimizes the determinant criterion by iterating the GLS procedure to estimate the pose and uncertainty simultaneously. Further, the proposed method is decoupled from the camera model. Results of synthetic and real experiments show that our method achieves better accuracy in common pose estimation scenarios, GMLPnP improves rotation/translation accuracy by 4.7%/2.0% on TUM-RGBD and 18.6%/18.4% on KITTI-360 dataset compared to the best baseline. It is more accurate under very noisy observations in a vision-based UAV localization task, outperforming the best baseline by 34.4% in translation estimation accuracy.

covariance, estimation, localization, (13 more...)

2408.01945

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)