AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Multi-Agent Causal Discovery Using Large Language Models

Le, Hao Duong, Xia, Xin, Chen, Zhang

arXiv.org Artificial IntelligenceJul-21-2024

Large Language Models (LLMs) have demonstrated significant potential in causal discovery tasks by utilizing their vast expert knowledge from extensive text corpora. However, the multi-agent capabilities of LLMs in causal discovery remain underexplored. This paper introduces a general framework to investigate this potential. The first is the Meta Agents Model, which relies exclusively on reasoning and discussions among LLM agents to conduct causal discovery. The second is the Coding Agents Model, which leverages the agents' ability to plan, write, and execute code, utilizing advanced statistical libraries for causal discovery. The third is the Hybrid Model, which integrates both the Meta Agents Model and Coding Agents Model approaches, combining the statistical analysis and reasoning skills of multiple agents. Our proposed framework shows promising results by effectively utilizing LLMs' expert knowledge, reasoning capabilities, multi-agent cooperation, and statistical causal methods. By exploring the multi-agent potential of LLMs, we aim to establish a foundation for further research in utilizing LLMs multi-agent for solving causal-related problems.

causal relationship, direct causal relationship, displacement, (15 more...)

arXiv.org Artificial Intelligence

2407.15073

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

Enhancing Microgrid Performance Prediction with Attention-based Deep Learning Models

Maddineni, Vinod Kumar, Koganti, Naga Babu, Damacharla, Praveen

arXiv.org Artificial IntelligenceJul-20-2024

In this research, an effort is made to address microgrid systems' operational challenges, characterized by power oscillations that eventually contribute to grid instability. An integrated strategy is proposed, leveraging the strengths of convolutional and Gated Recurrent Unit (GRU) layers. This approach is aimed at effectively extracting temporal data from energy datasets to improve the precision of microgrid behavior forecasts. Additionally, an attention layer is employed to underscore significant features within the time-series data, optimizing the forecasting process. The framework is anchored by a Multi-Layer Perceptron (MLP) model, which is tasked with comprehensive load forecasting and the identification of abnormal grid behaviors. Our methodology underwent rigorous evaluation using the Micro-grid Tariff Assessment Tool dataset, with Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (r2-score) serving as the primary metrics. The approach demonstrated exemplary performance, evidenced by a MAE of 0.39, RMSE of 0.28, and an r2-score of 98.89\% in load forecasting, along with near-perfect zero state prediction accuracy (approximately 99.9\%). Significantly outperforming conventional machine learning models such as support vector regression and random forest regression, our model's streamlined architecture is particularly suitable for real-time applications, thereby facilitating more effective and reliable microgrid management.

accuracy, load forecasting, prediction, (12 more...)

arXiv.org Artificial Intelligence

2407.14984

Country:

Africa > Sub-Saharan Africa (0.04)
Africa > Southern Africa (0.04)
North America > United States > Texas > Montgomery County > The Woodlands (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry:

Energy > Renewable > Solar (0.94)
Energy > Power Industry > Utilities (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning

Foster, Dylan J., Block, Adam, Misra, Dipendra

arXiv.org Machine LearningJul-20-2024

Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive text generation. The simplest approach to IL, behavior cloning (BC), is thought to incur sample complexity with unfavorable quadratic dependence on the problem horizon, motivating a variety of different online algorithms that attain improved linear horizon dependence under stronger assumptions on the data and the learner's access to the expert. We revisit the apparent gap between offline and online IL from a learning-theoretic perspective, with a focus on general policy classes up to and including deep neural networks. Through a new analysis of behavior cloning with the logarithmic loss, we show that it is possible to achieve horizon-independent sample complexity in offline IL whenever (i) the range of the cumulative payoffs is controlled, and (ii) an appropriate notion of supervised learning complexity for the policy class is controlled. Specializing our results to deterministic, stationary policies, we show that the gap between offline and online IL is not fundamental: (i) it is possible to achieve linear dependence on horizon in offline IL under dense rewards (matching what was previously only known to be achievable in online IL); and (ii) without further assumptions on the policy class, online IL cannot improve over offline IL with the logarithmic loss, even in benign MDPs. We complement our theoretical results with experiments on standard RL tasks and autoregressive language generation to validate the practical relevance of our findings.

algorithm, imitation, sample complexity, (14 more...)

arXiv.org Machine Learning

2407.15007

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Educational Setting > Online (0.54)
Transportation > Ground > Road (0.47)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Hyperspectral Unmixing Under Endmember Variability: A Variational Inference Framework

Li, Yuening, Fu, Xiao, Liu, Junbin, Ma, Wing-Kin

arXiv.org Artificial IntelligenceJul-20-2024

This work proposes a variational inference (VI) framework for hyperspectral unmixing in the presence of endmember variability (HU-EV). An EV-accounted noisy linear mixture model (LMM) is considered, and the presence of outliers is also incorporated into the model. Following the marginalized maximum likelihood (MML) principle, a VI algorithmic structure is designed for probabilistic inference for HU-EV. Specifically, a patch-wise static endmember assumption is employed to exploit spatial smoothness and to try to overcome the ill-posed nature of the HU-EV problem. The design facilitates lightweight, continuous optimization-based updates under a variety of endmember priors. Some of the priors, such as the Beta prior, were previously used under computationally heavy, sampling-based probabilistic HU-EV methods. The effectiveness of the proposed framework is demonstrated through synthetic, semi-real, and real-data experiments.

endmember, helen, pixel, (15 more...)

arXiv.org Artificial Intelligence

2407.14899

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Oregon (0.04)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Genre: Research Report (0.50)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Causal Inference with Complex Treatments: A Survey

Wang, Yingrong, Li, Haoxuan, Zhu, Minqin, Wu, Anpeng, Xiong, Ruoxuan, Wu, Fei, Kuang, Kun

arXiv.org Artificial IntelligenceJul-19-2024

Causal inference plays an important role in explanatory analysis and decision making across various fields like statistics, marketing, health care, and education. Its main task is to estimate treatment effects and make intervention policies. Traditionally, most of the previous works typically focus on the binary treatment setting that there is only one treatment for a unit to adopt or not. However, in practice, the treatment can be much more complex, encompassing multi-valued, continuous, or bundle options. In this paper, we refer to these as complex treatments and systematically and comprehensively review the causal inference methods for addressing them. First, we formally revisit the problem definition, the basic assumptions, and their possible variations under specific conditions. Second, we sequentially review the related methods for multi-valued, continuous, and bundled treatment settings. In each situation, we tentatively divide the methods into two categories: those conforming to the unconfoundedness assumption and those violating it. Subsequently, we discuss the available datasets and open-source codes. Finally, we provide a brief summary of these works and suggest potential directions for future research.

confounder, publication date, treatment effect, (14 more...)

arXiv.org Artificial Intelligence

2407.14022

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(20 more...)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Strength High (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
(2 more...)

Add feedback

Quantifying the value of positive transfer: An experimental case study

Hughes, Aidan J., Delo, Giulia, Poole, Jack, Dervilis, Nikolaos, Worden, Keith

arXiv.org Artificial IntelligenceJul-19-2024

In traditional approaches to structural health monitoring, challenges often arise associated with the availability of labelled data. Population-based structural health monitoring seeks to overcomes these challenges by leveraging data/information from similar structures via technologies such as transfer learning. The current paper demonstrate a methodology for quantifying the value of information transfer in the context of operation and maintenance decision-making. This demonstration, based on a population of laboratory-scale aircraft models, highlights the steps required to evaluate the expected value of information transfer including similarity assessment and prediction of transfer efficacy. Once evaluated for a given population, the value of information transfer can be used to optimise transfer-learning strategies for newly-acquired target domains.

case study, correspond, target domain, (13 more...)

arXiv.org Artificial Intelligence

2407.14342

Country:

Europe > United Kingdom (0.14)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Consumer Health (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Downstream-Pretext Domain Knowledge Traceback for Active Learning

Zhang, Beichen, Li, Liang, Zha, Zheng-Jun, Luo, Jiebo, Huang, Qingming

arXiv.org Artificial IntelligenceJul-19-2024

Active learning (AL) is designed to construct a high-quality labeled dataset by iteratively selecting the most informative samples. Such sampling heavily relies on data representation, while recently pre-training is popular for robust feature learning. However, as pre-training utilizes low-level pretext tasks that lack annotation, directly using pre-trained representation in AL is inadequate for determining the sampling score. To address this problem, we propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance for selecting diverse and instructive samples near the decision boundary. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. The diversity indicator constructs two feature spaces based on the pre-training pretext model and the downstream knowledge from annotation, by which it locates the neighbors of unlabeled data from the downstream space in the pretext space to explore the interaction of samples. With this mechanism, DOKT unifies the data relations of low-level and high-level representations to estimate traceback diversity. Next, in the uncertainty estimator, domain mixing is designed to enforce perceptual perturbing to unlabeled samples with similar visual patches in the pretext space. Then the divergence of perturbed samples is measured to estimate the domain uncertainty. As a result, DOKT selects the most diverse and important samples based on these two modules. The experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods and generalizes well to various application scenarios such as semantic segmentation and image captioning.

active learning, dokt, learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TMM.2024.3391897

2407.1472

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

A deep latent variable model for semi-supervised multi-unit soft sensing in industrial processes

Grimstad, Bjarne, Løvland, Kristian, Imsland, Lars S., Gunnerud, Vidar

arXiv.org Artificial IntelligenceJul-18-2024

In many industrial processes, an apparent lack of data limits the development of data-driven soft sensors. There are, however, often opportunities to learn stronger models by being more data-efficient. To achieve this, one can leverage knowledge about the data from which the soft sensor is learned. Taking advantage of properties frequently possessed by industrial data, we introduce a deep latent variable model for semi-supervised multi-unit soft sensing. This hierarchical, generative model is able to jointly model different units, as well as learning from both labeled and unlabeled data. An empirical study of multi-unit soft sensing is conducted using two datasets: a synthetic dataset of single-phase fluid flow, and a large, real dataset of multi-phase flow in oil and gas wells. We show that by combining semi-supervised and multi-task learning, the proposed model achieves superior results, outperforming current leading methods for this soft sensing problem. We also show that when a model has been trained on a multi-unit dataset, it may be finetuned to previously unseen units using only a handful of data points. In this finetuning procedure, unlabeled data improve soft sensor performance; remarkably, this is true even when no labeled data are available.

artificial intelligence, machine learning, unlabeled data, (15 more...)

arXiv.org Artificial Intelligence

2407.1331

Country: Europe > Norway (0.28)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Add feedback

Dimension-reduced Reconstruction Map Learning for Parameter Estimation in Likelihood-Free Inference Problems

Zhang, Rui, Chkrebtii, Oksana A., Xiu, Dongbin

arXiv.org Machine LearningJul-18-2024

Many application areas rely on models that can be readily simulated but lack a closed-form likelihood, or an accurate approximation under arbitrary parameter values. Existing parameter estimation approaches in this setting are generally approximate. Recent work on using neural network models to reconstruct the mapping from the data space to the parameters from a set of synthetic parameter-data pairs suffers from the curse of dimensionality, resulting in inaccurate estimation as the data size grows. We propose a dimension-reduced approach to likelihood-free estimation which combines the ideas of reconstruction map estimation with dimension-reduction approaches based on subject-specific knowledge. We examine the properties of reconstruction map estimation with and without dimension reduction and explore the trade-off between approximation error due to information loss from reducing the data dimension and approximation error. Numerical examples show that the proposed approach compares favorably with reconstruction map estimation, approximate Bayesian computation, and synthetic likelihood estimation.

estimation, estimator, likelihood, (16 more...)

arXiv.org Machine Learning

2407.13971

Country:

North America > United States > Ohio (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.46)
Research Report (0.40)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

Byzantine-tolerant distributed learning of finite mixture models

Zhang, Qiong, Chen, Jiahua

arXiv.org Machine LearningJul-18-2024

This paper proposes two split-and-conquer (SC) learning estimators for finite mixture models that are tolerant to Byzantine failures. In SC learning, individual machines obtain local estimates, which are then transmitted to a central server for aggregation. During this communication, the server may receive malicious or incorrect information from some local machines, a scenario known as Byzantine failures. While SC learning approaches have been devised to mitigate Byzantine failures in statistical models with Euclidean parameters, developing Byzantine-tolerant methods for finite mixture models with non-Euclidean parameters requires a distinct strategy. Our proposed distance-based methods are hyperparameter tuning free, unlike existing methods, and are resilient to Byzantine failures while achieving high statistical efficiency. We validate the effectiveness of our methods both theoretically and empirically via experiments on simulated and real data from machine learning applications for digit recognition. The code for the experiment can be found at https://github.com/SarahQiong/RobustSCGMM.

byzantine failure, estimator, mixture model, (16 more...)

arXiv.org Machine Learning

2407.1398

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Add feedback