Bayesian Layers: A Module for Neural Network Uncertainty
Dustin Tran, Mike Dusenberry, Mark van der Wilk, Danijar Hafner
We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacements for common layers. This enables composition via a unified abstraction over deterministic and stochastic functions and allows for scalability via the underlying system. These layers capture uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations ("stochastic output layers"), or the function itself (Gaussian processes). They can also be reversible to propagate uncertainty from input to output. We include code examples for common architectures such as Bayesian LSTMs, deep GPs, and flow-based models. As demonstration, we fit a 5-billion parameter "Bayesian Transformer" on 512 TPUv2 cores for uncertainty in machine translation and a Bayesian dynamics model for model-based planning. Finally, we show how Bayesian Layers can be used within the Edward2 language for probabilistic programming with stochastic processes.
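The drop-in idea is easiest to see in code. The sketch below uses TensorFlow Probability's variational Keras layers rather than the paper's own Edward2 module, and the layer sizes, training-set size, and KL scaling are illustrative assumptions:

```python
# Sketch of the drop-in idea using TensorFlow Probability's variational
# Keras layers (the paper's own module lives in Edward2; the layer sizes,
# training-set size, and KL scaling below are illustrative assumptions).
import tensorflow as tf
import tensorflow_probability as tfp

NUM_TRAIN = 50000  # hypothetical number of training examples

def scaled_kl(q, p, _):
  # KL(q || p) between weight posterior and prior, scaled so that a
  # per-example ELBO is optimized during training.
  return tfp.distributions.kl_divergence(q, p) / NUM_TRAIN

model = tf.keras.Sequential([
    # A deterministic baseline would use tf.keras.layers.Dense here.
    tfp.layers.DenseReparameterization(
        512, activation="relu", kernel_divergence_fn=scaled_kl),
    tfp.layers.DenseReparameterization(
        10, kernel_divergence_fn=scaled_kl),
])

# Keras folds each layer's KL penalty (model.losses) into the compiled loss,
# so training with the negative log-likelihood minimizes the negative ELBO.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```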
Reviews: Bayesian Layers: A Module for Neural Network Uncertainty
I am still voting for acceptance of this paper. This paper is about a software component, called Bayesian Layers, that allows for consistent creation of deep layers that are associated with some form of uncertainty or stochasticity. The paper outlines the design philosophy and principles, shows many examples, and concludes with new demonstrations of Bayesian neural network applications. I find that this work is on a significant topic, since software for Bayesian (deep) learning significantly lags behind. Integration with and drop-in replacement of traditional architectures seems like the right avenue to pursue, and is a strong motivation for this approach. I also think that this work is sufficiently original, relative to what one could expect from a software contribution.
This work was the subject of considerable debate among the reviewers. They all agreed that the work is presented well, and that both the idea of the paper and how it is realised as a software interface are novel (or at least a clear improvement over existing frameworks). Software packages generally struggle to get accepted at major conferences. I would thus like to throw my own vote in for this paper. It is true that the community has not yet developed a good and consistent way to evaluate software contributions, in particular vis-à-vis theoretical and empirical papers. But it is high time that our community becomes more professional in software development.
BA-BFL: Barycentric Aggregation for Bayesian Federated Learning
Jamoussi, Nour, Serra, Giuseppe, Stavrou, Photios A., Kountouris, Marios
In this work, we study the problem of aggregation in the context of Bayesian Federated Learning (BFL). Using an information-geometric perspective, we interpret the BFL aggregation step as finding the barycenter of the trained posteriors for a pre-specified divergence. We study the barycenter problem for the parametric family of $\alpha$-divergences and, focusing on the standard case of independent and Gaussian distributed parameters, we recover the closed-form solution of the reverse Kullback-Leibler barycenter and develop the analytical form of the squared Wasserstein-2 barycenter. Considering a non-IID setup, where clients possess heterogeneous data, we analyze the performance of the developed algorithms against state-of-the-art (SOTA) Bayesian aggregation methods in terms of accuracy, uncertainty quantification (UQ), model calibration (MC), and fairness. Finally, we extend our analysis to the framework of Hybrid Bayesian Deep Learning (HBDL), where we study how the number of Bayesian layers in the architecture impacts the considered performance metrics. Our experimental results show that the proposed methodology achieves performance comparable to the SOTA while offering a geometric interpretation of the aggregation phase.
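For independent Gaussian parameters, both barycenters mentioned in the abstract admit simple closed forms: the reverse-KL barycenter is a precision-weighted average, while the Wasserstein-2 barycenter averages means and standard deviations directly. The NumPy sketch below illustrates these formulas under the assumption that "reverse KL" means minimizing $\sum_i w_i \mathrm{KL}(q \| q_i)$ over Gaussian $q$; the paper's exact conventions and estimators may differ:

```python
# Closed-form barycenters for diagonal-Gaussian client posteriors.
# Assumes weights sum to 1 (e.g., proportional to client data sizes).
import numpy as np

def reverse_kl_barycenter(means, stds, weights):
  """Precision-weighted aggregation: argmin_q sum_i w_i KL(q || q_i)."""
  precisions = weights[:, None] / stds**2        # w_i / sigma_i^2
  var = 1.0 / precisions.sum(axis=0)             # 1/sigma^2 = sum_i w_i/sigma_i^2
  mean = var * (precisions * means).sum(axis=0)  # precision-weighted mean
  return mean, np.sqrt(var)

def w2_barycenter(means, stds, weights):
  """Squared Wasserstein-2 barycenter: average means and std deviations."""
  mean = (weights[:, None] * means).sum(axis=0)
  std = (weights[:, None] * stds).sum(axis=0)
  return mean, std

# Three hypothetical clients with two-dimensional parameter vectors.
means = np.array([[0.0, 1.0], [0.5, 0.8], [1.0, 1.2]])
stds = np.array([[1.0, 0.5], [0.2, 0.4], [0.5, 0.3]])
weights = np.array([0.5, 0.3, 0.2])
print(reverse_kl_barycenter(means, stds, weights))
print(w2_barycenter(means, stds, weights))
```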
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Domain Generalization with Small Data
Chen, Kecheng, Gal, Elena, Yan, Hong, Li, Haoliang
In this work, we propose to tackle the problem of domain generalization in the context of \textit{insufficient samples}. Instead of extracting latent feature embeddings based on deterministic models, we propose to learn a domain-invariant representation within a probabilistic framework by mapping each data point to a probabilistic embedding. Specifically, we first extend the empirical maximum mean discrepancy (MMD) to a novel probabilistic MMD that can measure the discrepancy between mixture distributions (i.e., source domains) consisting of a series of latent distributions rather than latent points. Moreover, instead of imposing the contrastive semantic alignment (CSA) loss on pairs of latent points, a novel probabilistic CSA loss encourages positive probabilistic embedding pairs to be closer while pulling negative ones apart. Benefiting from the representation captured by the probabilistic models, our proposed method can marry measurement of the \textit{distribution over distributions} (i.e., the global perspective alignment) with distribution-based contrastive semantic alignment (i.e., the local perspective alignment). Extensive experimental results on three challenging medical datasets show the effectiveness of our proposed method in the context of insufficient data compared with state-of-the-art methods.
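One concrete way to build such a probabilistic MMD is to replace the RBF kernel on latent points with its closed-form expectation under two diagonal Gaussian embeddings, and then form the usual MMD$^2$ between the two mixtures. The sketch below follows that construction; the paper's actual estimator and bandwidth choice may differ:

```python
# Sketch of a "probabilistic MMD" between two sets of diagonal-Gaussian
# embeddings (e.g., two source domains), using the closed-form expected
# RBF kernel between Gaussians. Kernel choice and bandwidth `gamma` are
# assumptions for illustration.
import numpy as np

def expected_rbf(mu1, var1, mu2, var2, gamma=1.0):
  """E_{x~N(mu1,var1), y~N(mu2,var2)}[exp(-||x-y||^2 / (2 gamma^2))]."""
  s = gamma**2 + var1 + var2  # per-dimension scale
  return np.prod(np.sqrt(gamma**2 / s) * np.exp(-(mu1 - mu2)**2 / (2 * s)))

def gram(mus_a, vars_a, mus_b, vars_b, gamma=1.0):
  return np.array([[expected_rbf(ma, va, mb, vb, gamma)
                    for mb, vb in zip(mus_b, vars_b)]
                   for ma, va in zip(mus_a, vars_a)])

def prob_mmd2(mus_p, vars_p, mus_q, vars_q, gamma=1.0):
  """MMD^2 between the uniform mixtures of the two Gaussian sets."""
  kpp = gram(mus_p, vars_p, mus_p, vars_p, gamma).mean()
  kqq = gram(mus_q, vars_q, mus_q, vars_q, gamma).mean()
  kpq = gram(mus_p, vars_p, mus_q, vars_q, gamma).mean()
  return kpp + kqq - 2.0 * kpq

# Two hypothetical domains with three 8-dimensional Gaussian embeddings each.
mus_p = np.random.randn(3, 8); vars_p = np.full((3, 8), 0.1)
mus_q = np.random.randn(3, 8) + 1.0; vars_q = np.full((3, 8), 0.2)
print(prob_mmd2(mus_p, vars_p, mus_q, vars_q))
```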
An Empirical Bayes Framework for Open-Domain Dialogue Generation
Lee, Jing Yang, Lee, Kong Aik, Gan, Woon-Seng
To engage human users in meaningful conversation, open-domain dialogue agents are required to generate diverse and contextually coherent dialogue. Despite recent advancements, attributable to the use of pretrained language models, the generation of diverse and coherent dialogue remains an open research problem. A popular approach to this issue adapts variational frameworks. However, while these approaches successfully improve diversity, they tend to compromise contextual coherence. Hence, we propose Bayesian Open-domain Dialogue with Empirical Bayes (BODEB), an empirical Bayes framework for constructing a Bayesian open-domain dialogue agent by leveraging pretrained parameters to inform the prior and posterior parameter distributions. Empirical results show that BODEB achieves better results in terms of both diversity and coherence compared to variational frameworks.
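The empirical-Bayes ingredient is that the weight prior (and the initial posterior) is centered at the pretrained parameters rather than at zero. A minimal NumPy sketch of the resulting KL term, with a hypothetical checkpoint path and an assumed fixed prior scale; BODEB's exact construction may differ:

```python
# Sketch of the empirical-Bayes idea: center the weight prior (and
# initialize the posterior) at pretrained parameters theta_hat. The
# checkpoint path and the fixed prior scale are assumptions.
import numpy as np

def kl_mean_field_gaussians(post_mu, post_sigma, prior_mu, prior_sigma):
  """KL(q || p) between factorized Gaussians, summed over all weights."""
  return np.sum(
      np.log(prior_sigma / post_sigma)
      + (post_sigma**2 + (post_mu - prior_mu)**2) / (2.0 * prior_sigma**2)
      - 0.5)

theta_hat = np.load("pretrained_weights.npy")  # hypothetical checkpoint
prior_mu, prior_sigma = theta_hat, 0.1         # prior informed by pretraining
post_mu = theta_hat.copy()                     # posterior initialized there
post_sigma = np.full_like(theta_hat, 0.05)     # learned in practice
kl = kl_mean_field_gaussians(post_mu, post_sigma, prior_mu, prior_sigma)
```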
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer
Lei, Shiye, Tu, Zhuozhuo, Rutkowski, Leszek, Zhou, Feng, Shen, Li, He, Fengxiang, Tao, Dacheng
Bayesian neural networks (BNNs) have become a principal approach to alleviating overconfident predictions in deep learning, but they often suffer from scaling issues due to the large number of distribution parameters. In this paper, we discover that the first layer of a deep network possesses multiple disparate optima when retrained in isolation. This indicates a large posterior variance when the first layer is replaced with a Bayesian layer, which motivates us to design a spatial-temporal-fusion BNN (STF-BNN) for efficiently scaling BNNs to large models: (1) first train a neural network from scratch in the usual way to realize fast training; and (2) then convert the first layer to a Bayesian layer and infer it via stochastic variational inference, while keeping the other layers fixed. Compared to vanilla BNNs, our approach greatly reduces the training time and the number of parameters, which helps scale BNNs efficiently. We further provide theoretical guarantees on the generalizability of STF-BNN and its ability to mitigate overconfidence. Comprehensive experiments demonstrate that STF-BNN (1) achieves state-of-the-art performance in prediction and uncertainty quantification; (2) significantly improves adversarial robustness and privacy preservation; and (3) considerably reduces training time and memory costs.
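A rough Keras/TensorFlow Probability sketch of the two-stage recipe, with placeholder layer sizes and commented-out training calls (the paper's implementation details are not reproduced here):

```python
# Two-stage STF-BNN sketch: (1) train a deterministic network, then
# (2) swap only the first layer for a variational one and infer it with
# SVI while freezing the rest. Sizes and dataset are placeholders, and
# KL scaling by dataset size is omitted for brevity.
import tensorflow as tf
import tensorflow_probability as tfp

# Stage 1: ordinary deterministic training.
first = tf.keras.layers.Dense(256, activation="relu")
rest = [tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(10)]
det_model = tf.keras.Sequential([first] + rest)
det_model.compile("adam", tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True))
# det_model.fit(x_train, y_train, epochs=10)

# Stage 2: Bayesian first layer; the remaining (shared) layers are frozen.
bayes_first = tfp.layers.DenseReparameterization(256, activation="relu")
for layer in rest:
  layer.trainable = False
stf_model = tf.keras.Sequential([bayes_first] + rest)
stf_model.compile("adam", tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True))
# stf_model.fit(x_train, y_train, epochs=5)  # SVI over the first layer only
```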
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)