AITopics

2412.14477

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
South America (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Artificial Intelligence Dec-17-2024

RDPI: A Refine Diffusion Probability Generation Method for Spatiotemporal Data Imputation

Liu, Zijin, Zhao, Xiang, Song, You

Spatiotemporal data imputation plays a crucial role in various fields such as traffic flow monitoring, air quality assessment, and climate prediction. However, spatiotemporal data collected by sensors often suffer from temporal incompleteness, and the sparse and uneven distribution of sensors leads to missing data in the spatial dimension. Among existing methods, autoregressive approaches are prone to error accumulation, while simple conditional diffusion models fail to adequately capture the spatiotemporal relationships between observed and missing data. To address these issues, we propose a novel two-stage Refined Diffusion Probability Imputation (RDPI) framework based on an initial network and a conditional diffusion model. In the initial stage, deterministic imputation methods are used to generate preliminary estimates of the missing data. In the refinement stage, residuals are treated as the diffusion target, and observed values are innovatively incorporated into the forward process. This results in a conditional diffusion model better suited for spatiotemporal data imputation, bridging the gap between the preliminary estimates and the true values. Experiments on multiple datasets demonstrate that RDPI not only achieves state-of-the-art imputation accuracy but also significantly reduces the computational cost of sampling.

data quality, imputation, machine learning, (20 more...)

2412.12642

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Ndaoud, Mohamed, Radchenko, Peter, Rava, Bradley

Ask for More Than Bayes Optimal: A Theory of Indecisions for Classification

arXiv.org Machine Learning Dec-17-2024

In this work, we address the problem of controlling a classifier's accuracy at any user-specified level through selective classification, regardless of the problem's inherent difficulty. Traditional classification frameworks are designed to approximate the Bayes optimal error rate as closely as possible. However, with the growing deployment of artificial intelligence (AI) systems in automated, high-stakes decision-making, it has become critical to ensure reliable control over a classifier's accuracy and to guarantee accurate predictions for all individuals. When the underlying problem is truly difficult, as indicated by the distance between the true distributions for each decision class, achieving control over the error rate of an automated decision-making system may be impossible. This is particularly true when the number of potential classes is large or when the distributions of these classes are close to one another, which significantly increases the difficulty of the problem. This phenomenon is illustrated in Figure 1, where the task is to classify various observations as High-Risk or Low-Risk, while maintaining an error rate below 5%. In this example, the High-Risk and Low-Risk classes are modeled as mixtures of two normal distributions with means of 2 and 1, respectively, and a shared variance of 1. The Bayes classifier is represented by the dotted line in the leftmost plot of Figure 1. In this scenario, the Bayes optimal error rate is 15.9%, significantly exceeding our target classification error of 5%.
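The idea of trading decisions for accuracy can be illustrated with a generic reject-option rule. The sketch below is not the paper's procedure; it assumes a single normal density per class, equal priors, and fully known parameters, and the function names are hypothetical. It only decides when the posterior probability of the chosen class is at least 1 minus the target error, so that, under the assumed model, the error rate among decided cases cannot exceed the target.

```python
import math


def normal_pdf(x, mean, std=1.0):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_high_risk(x, mean_high, mean_low, std=1.0, prior_high=0.5):
    """Posterior probability of the High-Risk class given observation x.

    Assumption: each class is a single normal with a shared standard
    deviation, and the class prior is known.
    """
    weighted_high = prior_high * normal_pdf(x, mean_high, std)
    weighted_low = (1.0 - prior_high) * normal_pdf(x, mean_low, std)
    return weighted_high / (weighted_high + weighted_low)


def classify_with_indecision(x, mean_high, mean_low, target_error=0.05,
                             std=1.0, prior_high=0.5):
    """Return a label only when the posterior is confident enough.

    A decision is made when the winning class has posterior probability
    of at least 1 - target_error; otherwise the rule abstains.
    """
    p_high = posterior_high_risk(x, mean_high, mean_low, std, prior_high)
    if p_high >= 1.0 - target_error:
        return "High-Risk"
    if p_high <= target_error:
        return "Low-Risk"
    return "Indecision"


if __name__ == "__main__":
    # Observations near the midpoint of the two class means are ambiguous.
    for x in (-7.0, 1.5, 10.0):
        print(x, classify_with_indecision(x, mean_high=2.0, mean_low=1.0))
```

The price of this guarantee is the fraction of observations left undecided, which grows as the class distributions move closer together.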

artificial intelligence, indecision, machine learning, (18 more...)

2412.12807

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

arXiv.org Machine Learning Dec-17-2024

Adaptive Nonparametric Perturbations of Parametric Bayesian Models

Wu, Bohan, Weinstein, Eli N., Salehi, Sohrab, Wang, Yixin, Blei, David M.

Parametric Bayesian modeling offers a powerful and flexible toolbox for scientific data analysis. Yet the model, however detailed, may still be wrong, and this can make inferences untrustworthy. In this paper we study nonparametrically perturbed parametric (NPP) Bayesian models, in which a parametric Bayesian model is relaxed via a distortion of its likelihood. We analyze the properties of NPP models when the target of inference is the true data distribution or some functional of it, such as in causal inference. We show that NPP models can offer the robustness of nonparametric models while retaining the data efficiency of parametric models, achieving fast convergence when the parametric model is close to true. To efficiently analyze data with an NPP model, we develop a generalized Bayes procedure to approximate its posterior. We demonstrate our method by estimating causal effects of gene expression from single cell RNA sequencing data. NPP modeling offers an efficient approach to robust Bayesian inference and can be used to robustify any parametric Bayesian model.

artificial intelligence, machine learning, parametric model, (17 more...)

2412.10683

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Patel, Zeeshan, DeLoye, James, Mathias, Lance

Exploring Diffusion and Flow Matching Under Generator Matching

arXiv.org Artificial Intelligence Dec-17-2024

Recent techniques in deep generative modeling have leveraged Markov generative processes to learn complex, high-dimensional probability distributions in a more structured and flexible manner [17]. By integrating Markov chain methods with deep neural architectures, these approaches aim to exploit the representational power of deep networks while maintaining a tractable and theoretically grounded training procedure. In contrast to early generative models that relied heavily on direct maximum likelihood estimation or adversarial objectives, this class of methods employs iterative stochastic transformations--often expressed as Markovian updates--to gradually refine initial noise samples into samples drawn from the desired target distribution. Diffusion and flow matching models represent two prominent classes of generative approaches that construct data samples through a sequence of continuous transformations. Diffusion models [6, 13] introduce a forward-noising and reverse-denoising process, progressively refining a simple noise distribution into a complex target distribution by learning to undo incremental noise corruption at each step.

artificial intelligence, generator, machine learning, (15 more...)

2412.11024

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Díaz-Pachón, Daniel Andrés, Gallegos, H. Renata, Hössjer, Ola, Rao, J. Sunil

Statistical learning does not always entail knowledge

arXiv.org Machine Learning Dec-17-2024

In this paper, we study learning and knowledge acquisition (LKA) of an agent about a proposition that is either true or false. We use a Bayesian approach, where the agent receives data to update his beliefs about the proposition according to a posterior distribution. The LKA is formulated in terms of active information, with data representing external or exogenous information that modifies the agent's beliefs. It is assumed that data provide details about a number of features that are relevant to the proposition. We show that this leads to a Gibbs distribution posterior, which is in maximum entropy relative to the prior, conditioned on the side constraints that the data provide in terms of the features. We demonstrate that full learning is sometimes not possible and full knowledge acquisition is never possible when the number of extracted features is too small. We also distinguish between primary learning (receiving data about features of relevance for the proposition) and secondary learning (receiving data about the learning of another agent). We argue that this type of secondary learning does not represent true knowledge acquisition. Our results have implications for statistical learning algorithms, and we claim that such algorithms do not always generate true knowledge. The theory is illustrated with several examples.

agent, knowledge, knowledge acquisition, (14 more...)

2501.01963

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > United States > Minnesota (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Roy, Shreya Sinha, Everitt, Richard G., Robert, Christian P., Dutta, Ritabrata

Generalized Bayesian deep reinforcement learning

arXiv.org Machine Learning Dec-16-2024

Bayesian reinforcement learning (BRL) is a method that merges principles from Bayesian statistics and reinforcement learning to make optimal decisions in uncertain environments. Similar to other model-based RL approaches, it involves two key components: (1) inferring the posterior distribution of the data generating process (DGP) modeling the true environment and (2) policy learning using the learned posterior. We propose to model the dynamics of the unknown environment through deep generative models assuming Markov dependence. In the absence of likelihood functions for these models, we train them by learning a generalized predictive-sequential (or prequential) scoring rule (SR) posterior. We use sequential Monte Carlo (SMC) samplers to draw samples from this generalized Bayesian posterior distribution. In conjunction, to achieve scalability in the high-dimensional parameter space of the neural networks, we use gradient-based Markov chain Monte Carlo (MCMC) kernels within SMC. To justify the use of the prequential scoring rule posterior, we prove a Bernstein-von Mises type theorem. For policy learning, we propose expected Thompson sampling (ETS) to learn the optimal policy by maximizing the expected value function with respect to the posterior distribution. This improves upon traditional Thompson sampling (TS) and its extensions, which utilize only one sample drawn from the posterior distribution. This improvement is studied both theoretically and using simulation studies assuming discrete action and state spaces. Finally, we successfully extend our setup to a challenging problem with a continuous action space, without theoretical guarantees.
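The contrast between TS and ETS described above can be sketched in a few lines. This is a toy illustration rather than the paper's algorithm: it assumes the action values under each posterior draw are already available as a NumPy array (in the paper they would come from the learned environment model), and the function names are hypothetical.

```python
import numpy as np


def thompson_action(value_samples, rng):
    """Classic Thompson sampling: act greedily under ONE posterior draw.

    value_samples has shape (n_posterior_draws, n_actions); each row holds
    the action values implied by one draw from the posterior.
    """
    draw = rng.integers(value_samples.shape[0])
    return int(np.argmax(value_samples[draw]))


def expected_thompson_action(value_samples):
    """Expected Thompson sampling (sketch): average the action values over
    all posterior draws, then act greedily on the average."""
    return int(np.argmax(value_samples.mean(axis=0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three posterior draws over two actions (illustrative numbers only).
    value_samples = np.array([
        [1.0, 0.0],
        [0.0, 0.9],
        [0.0, 0.9],
    ])
    print("TS action:", thompson_action(value_samples, rng))
    print("ETS action:", expected_thompson_action(value_samples))
```

Because TS commits to a single draw, its choice varies from call to call, while the averaged choice depends on the whole set of draws.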

machine learning, posterior, reinforcement learning, (18 more...)

2412.11743

Country: Europe > United Kingdom (0.46)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment > Games (0.71)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Jamoussi, Nour, Serra, Giuseppe, Stavrou, Photios A., Kountouris, Marios

BA-BFL: Barycentric Aggregation for Bayesian Federated Learning

arXiv.org Artificial Intelligence Dec-16-2024

In this work, we study the problem of aggregation in the context of Bayesian Federated Learning (BFL). Using an information geometric perspective, we interpret the BFL aggregation step as finding the barycenter of the trained posteriors for a pre-specified divergence metric. We study the barycenter problem for the parametric family of $\alpha$-divergences and, focusing on the standard case of independent and Gaussian distributed parameters, we recover the closed-form solution of the reverse Kullback-Leibler barycenter and develop the analytical form of the squared Wasserstein-2 barycenter. Considering a non-IID setup, where clients possess heterogeneous data, we analyze the performance of the developed algorithms against state-of-the-art (SOTA) Bayesian aggregation methods in terms of accuracy, uncertainty quantification (UQ), model calibration (MC), and fairness. Finally, we extend our analysis to the framework of Hybrid Bayesian Deep Learning (HBDL), where we study how the number of Bayesian layers in the architecture impacts the considered performance metrics. Our experimental results show that the proposed methodology presents comparable performance with the SOTA while offering a geometric interpretation of the aggregation phase.
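For independent Gaussian parameters, barycenters of this kind have simple per-parameter closed forms. The sketch below uses standard textbook results rather than the paper's derivation, and the function names are hypothetical. The first function computes the Wasserstein-2 barycenter of diagonal Gaussians (weighted average of means and of standard deviations). The second computes the Gaussian q minimizing the weighted sum of KL(q || p_k), which is a precision-weighted combination; whether this is the direction the paper calls "reverse" is an assumption here.

```python
import numpy as np


def _normalize(weights):
    """Convert client weights to a probability vector."""
    w = np.asarray(weights, dtype=float)
    return w / w.sum()


def wasserstein2_barycenter(means, stds, weights):
    """Squared Wasserstein-2 barycenter of diagonal Gaussians.

    means, stds: arrays of shape (n_clients, n_params).
    Returns the per-parameter mean and standard deviation.
    """
    w = _normalize(weights)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    return w @ means, w @ stds


def kl_barycenter_q_to_p(means, stds, weights):
    """Gaussian q minimizing sum_k w_k * KL(q || p_k) for diagonal Gaussians.

    The solution is the normalized weighted geometric mean of the client
    posteriors: precisions are averaged, and the mean is precision-weighted.
    """
    w = _normalize(weights)
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(stds, dtype=float) ** 2
    precision = w @ precisions
    mean = (w @ (precisions * means)) / precision
    return mean, 1.0 / np.sqrt(precision)


if __name__ == "__main__":
    # Two clients, two parameters (illustrative numbers only).
    client_means = [[0.0, 0.0], [2.0, 4.0]]
    client_stds = [[1.0, 1.0], [3.0, 1.0]]
    print(wasserstein2_barycenter(client_means, client_stds, [1, 1]))
    print(kl_barycenter_q_to_p(client_means, client_stds, [1, 1]))
```

Note how the two aggregates differ when clients disagree on uncertainty: the Wasserstein-2 mean weights clients equally, while the KL-based mean is pulled toward the more confident client.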

artificial intelligence, bayesian inference, machine learning, (14 more...)

2412.11646

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)
Europe > France (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Butter, Anja, Charton, François, Villadamigo, Javier Mariño, Ore, Ayodele, Plehn, Tilman, Spinner, Jonas

Extrapolating Jet Radiation with Autoregressive Transformers

arXiv.org Artificial Intelligence Dec-16-2024

Generative networks are an exciting tool for fast LHC event generation. Usually, they are used to generate configurations with a fixed number of particles. Autoregressive transformers allow us to generate events with variable numbers of particles, very much in line with the physics of QCD jet radiation. We show how they can learn a factorized likelihood for jet radiation and extrapolate in terms of the number of generated jets. For this extrapolation, bootstrapping training data and training with modifications of the likelihood loss can be used.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2412.12074

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Bills, Joseph, Archibald, Christopher, Blaylock, Diego

Improving Cooperation in Language Games with Bayesian Inference and the Cognitive Hierarchy

arXiv.org Artificial Intelligence Dec-16-2024

In two-player cooperative games, agents can play together effectively when they have accurate assumptions about how their teammate will behave, but may perform poorly when these assumptions are inaccurate. In language games, failure may be due to disagreement in the understanding of either the semantics or pragmatics of an utterance. We model coarse uncertainty in semantics using a prior distribution of language models and uncertainty in pragmatics using the cognitive hierarchy, combining the two aspects into a single prior distribution over possible partner types. Fine-grained uncertainty in semantics is modeled using noise that is added to the embeddings of words in the language. To handle all forms of uncertainty, we construct agents that learn the behavior of their partner using Bayesian inference and use this information to maximize the expected value of a heuristic function. We test this approach by constructing Bayesian agents for the game of Codenames, and show that they perform better in experiments where semantics is uncertain.

guesser, machine learning, natural language, (21 more...)

2412.12409

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)