AITopics

This paper addresses the challenge of representing complex human action (HA) in a nuclear power plant (NPP) digital twin (DT) and minimizing latency in partial computation offloading (PCO) in sixth-generation-enabled computing in the network (COIN) assisted multiaccess edge computing (MEC). Accurate HA representation in the DT-HA model is vital for modeling human interventions that are crucial for the safe and efficient operation of NPPs. In this context, DT-enabled COIN-assisted MEC harnesses DT (known as a cybertwin) capabilities to optimize resource allocation and reduce latency effectively. A two-stage approach is employed to address system complexity. First, a probabilistic graphical model (PGM) is introduced to capture HAs in the DT abstraction. In the PGM, HA and NPP asset-twin abstractions form coupled systems that evolve and interact through observable data and control input. Next, the underlying PCO problem is formulated as a multiuser game, where NPP assets can partially offload tasks to COIN and MEC. We propose a decentralized algorithm to optimize offloading decisions, offloading ratios, and resource allocation. The simulation results demonstrate the effectiveness of the proposed method in capturing complex HAs and optimal resource allocation in DT-enabled NPPs.

opération, probability, subsystem, (16 more...)

2407.12011

Country:

Asia > South Korea > Daejeon > Daejeon (0.04)
Europe > Norway (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(6 more...)

Genre: Research Report (0.84)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Power Industry > Utilities > Nuclear (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Peirce in the Machine: How Mixture of Experts Models Perform Hypothesis Construction

Rushing, Bruce

Mixture of experts is a prediction aggregation method in machine learning that aggregates the predictions of specialized experts. This method often outperforms Bayesian methods despite the Bayesian having stronger inductive guarantees. We argue that this is due to the greater functional capacity of mixture of experts. We prove that in a limiting case of mixture of experts will have greater capacity than equivalent Bayesian methods, which we vouchsafe through experiments on non-limiting cases. Finally, we conclude that mixture of experts is a type of abductive reasoning in the Peircian sense of hypothesis construction.

hypothesis, moe, vc dimension, (15 more...)

2406.1715

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA

Chen, Hao Mark, Castelli, Liam, Ferianc, Martin, Zhou, Hongyu, Liu, Shuanglong, Luk, Wayne, Fan, Hongxiang

Reliable uncertainty estimation plays a crucial role in various safety-critical applications such as medical diagnosis and autonomous driving. In recent years, Bayesian neural networks (BayesNNs) have gained substantial research and industrial interests due to their capability to make accurate predictions with reliable uncertainty estimation. However, the algorithmic complexity and the resulting hardware performance of BayesNNs hinder their adoption in real-life applications. To bridge this gap, this paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs. At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads while achieving high accuracy and quality of uncertainty estimation. At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient multi-exit BayesNNs. Several optimization techniques such as the mix of spatial and temporal mappings are introduced to reduce resource consumption and improve the overall hardware performance. Comprehensive experiments demonstrate that our approach can achieve higher energy efficiency compared to CPU, GPU, and other state-of-the-art hardware implementations. To support the future development of this research, we have open-sourced our code at: https://github.com/os-hxfan/MCME_FPGA_Acc.git

accelerator, bayesnn, implementation, (14 more...)

2406.14593

Country:

Europe > United Kingdom > England > Greater London > London (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry: Information Technology (0.88)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Zilberstein, Nicolas, Mardani, Morteza, Segarra, Santiago

Repulsive Score Distillation for Diverse Sampling of Diffusion Models

Score distillation sampling has been pivotal for integrating diffusion models into generation of complex visuals. Despite impressive results it suffers from mode collapse and lack of diversity. To cope with this challenge, we leverage the gradient flow interpretation of score distillation to propose Repulsive Score Distillation (RSD). In particular, we propose a variational framework based on repulsion of an ensemble of particles that promotes diversity. Using a variational approximation that incorporates a coupling among particles, the repulsion appears as a simple regularization that allows interaction of particles based on their relative pairwise similarity, measured e.g., via radial basis kernels. We design RSD for both unconstrained and constrained sampling scenarios. For constrained sampling we focus on inverse problems in the latent space that leads to an augmented variational formulation, that strikes a good balance between compute, quality and diversity. Our extensive experiments for text-to-image generation, and inverse problems demonstrate that RSD achieves a superior trade-off between diversity and quality compared with state-of-the-art alternatives.

diffusion model, inverse problem, repulsion, (14 more...)

2406.16683

Country:

North America > United States (0.28)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.50)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Sun, Yuyang, Lei, Jingyu, Kosmas, Panagiotis

Exploring Biomarker Relationships in Both Type 1 and Type 2 Diabetes Mellitus Through a Bayesian Network Analysis Approach

Understanding the complex relationships of biomarkers in diabetes is pivotal for advancing treatment strategies, a pressing need in diabetes research. This study applies Bayesian network structure learning to analyze the Shanghai Type 1 and Type 2 diabetes mellitus datasets, revealing complex relationships among key diabetes-related biomarkers. The constructed Bayesian network presented notable predictive accuracy, particularly for Type 2 diabetes mellitus, with root mean squared error (RMSE) of 18.23 mg/dL, as validated through leave-one-domain experiments and Clarke error grid analysis. This study not only elucidates the intricate dynamics of diabetes through a deeper understanding of biomarker interplay but also underscores the significant potential of integrating data-driven and knowledge-driven methodologies in the realm of personalized diabetes management. Such an approach paves the way for more custom and effective treatment strategies, marking a notable advancement in the field.

bayesian network, biomarker, dataset, (16 more...)

2406.1709

Country:

Asia > China > Shanghai > Shanghai (0.25)
North America > United States > California > Orange County > Irvine (0.14)
Europe > United Kingdom > England > Greater London > London (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Datta, Jyotishka, Polson, Nicholas G.

Bayesian Deep ICE

Deep Independent Component Estimation (DICE) has many applications in modern day machine learning as a feature engineering extraction method. We provide a novel latent variable representation of independent component analysis that enables both point estimates via expectation-maximization (EM) and full posterior sampling via Markov Chain Monte Carlo (MCMC) algorithms. Our methodology also applies to flow-based methods for nonlinear feature extraction. We discuss how to implement conditional posteriors and envelope-based methods for optimization. Through this representation hierarchy, we unify a number of hitherto disjoint estimation procedures. We illustrate our methodology and algorithms on a numerical example. Finally, we conclude with directions for future research.

algorithm, mackay, source distribution, (13 more...)

2406.17058

Country:

North America > United States > Virginia (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Ricci, Federica Zoe, Sudderth, Erik B., Lee, Jaylen, Peters, Megan A. K., Vannucci, Marina, Guindani, Michele

Bayesian temporal biclustering with applications to multi-subject neuroscience studies

We consider the problem of analyzing multivariate time series collected on multiple subjects, with the goal of identifying groups of subjects exhibiting similar trends in their recorded measurements over time as well as time-varying groups of associated measurements. To this end, we propose a Bayesian model for temporal biclustering featuring nested partitions, where a time-invariant partition of subjects induces a time-varying partition of measurements. Our approach allows for data-driven determination of the number of subject and measurement clusters as well as estimation of the number and location of changepoints in measurement partitions. To efficiently perform model fitting and posterior estimation with Markov Chain Monte Carlo, we derive a blocked update of measurements' cluster-assignment sequences. We illustrate the performance of our model in two applications to functional magnetic resonance imaging data and to an electroencephalogram dataset. The results indicate that the proposed model can combine information from potentially many subjects to discover a set of interpretable, dynamic patterns. Experiments on simulated data compare the estimation performance of the proposed model against ground-truth values and other statistical methods, showing that it performs well at identifying ground-truth subject and measurement clusters even when no subject or time dependence is present.

partition, probability, sequence, (17 more...)

2406.17131

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Riverside County > Riverside (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

van Meegen, Alexander, Sompolinsky, Haim

Coding schemes in neural networks learning classification tasks

arXiv.org Machine LearningJun-24-2024

Neural networks posses the crucial ability to generate meaningful representations of task-dependent features. Indeed, with appropriate scaling, supervised learning in neural networks can result in strong, task-dependent feature learning. However, the nature of the emergent representations, which we call the `coding scheme', is still unclear. To understand the emergent coding scheme, we investigate fully-connected, wide neural networks learning classification tasks using the Bayesian framework where learning shapes the posterior distribution of the network weights. Consistent with previous findings, our analysis of the feature learning regime (also known as `non-lazy', `rich', or `mean-field' regime) shows that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity. In linear networks, an analog coding scheme of the task emerges. Despite the strong representations, the mean predictor is identical to the lazy case. In nonlinear networks, spontaneous symmetry breaking leads to either redundant or sparse coding schemes. Our findings highlight how network properties such as scaling of weights and neuronal nonlinearity can profoundly influence the emergent representations.

neuron, posterior, readout weight, (16 more...)

arXiv.org Machine Learning

2406.16689

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningJun-24-2024

Greedy equivalence search for nonparametric graphical models

Aragam, Bryon

One of the hallmark achievements of the theory of graphical models and Bayesian model selection is the celebrated greedy equivalence search (GES) algorithm due to Chickering and Meek. GES is known to consistently estimate the structure of directed acyclic graph (DAG) models in various special cases including Gaussian and discrete models, which are in particular curved exponential families. A general theory that covers general nonparametric DAG models, however, is missing. Here, we establish the consistency of greedy equivalence search for general families of DAG models that satisfy smoothness conditions on the Markov factorization, and hence may not be curved exponential families, or even parametric. The proof leverages recent advances in nonparametric Bayes to construct a test for comparing misspecified DAG models that avoids arguments based on the Laplace approximation. Nonetheless, when the Laplace approximation is valid and a consistent scoring function exists, we recover the classical result. As a result, we obtain a general consistency theorem for GES applied to general DAG models.

algorithm, consistency, graphical model, (13 more...)

arXiv.org Machine Learning

2406.17228

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.91)

arXiv.org Machine LearningJun-24-2024

Towards Bayesian Data Selection

Rodemann, Julian

A wide range of machine learning algorithms iteratively add data to the training sample. Examples include semi-supervised learning, active learning, multi-armed bandits, and Bayesian optimization. We embed this kind of data addition into decision theory by framing data selection as a decision problem. This paves the way for finding Bayes-optimal selections of data. For the illustrative case of self-training in semi-supervised learning, we derive the respective Bayes criterion. We further show that deploying this criterion mitigates the issue of confirmation bias by empirically assessing our method for generalized linear models, semi-parametric generalized additive models, and Bayesian neural networks on simulated and real-world data.

learning, selection, semi-supervised learning, (15 more...)

arXiv.org Machine Learning

2406.1256

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)