AITopics | Bayesian Learning


Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)


A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning

Dewolf, Nicolas

arXiv.org Machine Learning, May-3-2024

In the past decades, most work in the area of data analysis and machine learning was focused on optimizing predictive models and getting better results than what was possible with existing models. To what extent the metrics with which such improvements were measured were accurately capturing the intended goal, whether the numerical differences in the resulting values were significant, or whether uncertainty played a role in this study and if it should have been taken into account, was of secondary importance. Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly replaced by black box models and sheer computing power because of their ability to handle large data sets. This evolution sadly happened at the expense of interpretability and trustworthiness. However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that is of importance, but rather the variability or uncertainty. The work in this dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is and how to embrace it instead of fearing it. A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed. Certain aspects and applications of the framework, dubbed 'conformal prediction', are studied in detail. Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title 'distribution-free'. No parametric assumptions have to be made and the nonparametric results also hold without having to resort to the law of large numbers in the asymptotic regime.
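To make the distribution-free claim concrete, here is a minimal sketch of split (inductive) conformal prediction for regression. It is not taken from the dissertation: the point predictor `predict`, the synthetic data generator `sample`, and the choice of absolute residuals as nonconformity scores are all assumptions made for illustration.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def predict(x):
    # Hypothetical pre-trained point predictor (an assumption for this sketch).
    return 2.0 * x


def sample(n):
    # Synthetic data: y = 2x + standard Gaussian noise (illustrative only).
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 2.0 * x + random.gauss(0.0, 1.0)) for x in xs]


random.seed(0)
calibration = sample(500)
test_points = sample(2000)
alpha = 0.1  # target miscoverage, i.e. roughly 90% coverage

# Nonconformity score: absolute residual on held-out calibration data.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha)

# Prediction interval for a new x is [predict(x) - q, predict(x) + q].
covered = sum(abs(y - predict(x)) <= q for x, y in test_points)
coverage = covered / len(test_points)
print(f"interval half-width {q:.3f}, empirical coverage {coverage:.3f}")
```

The only requirement on the data is exchangeability between calibration and test points; nothing in `conformal_quantile` depends on the noise being Gaussian, which is used here only to generate a toy dataset.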

clusterwise average prediction, clusterwise representation complexity, conditional nonconformity distribution, (16 more...)

arXiv.org Machine Learning

2405.02082

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Asia > Middle East > Jordan (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Health & Medicine (1.00)
Education > Educational Setting (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(6 more...)


Modelling Sampling Distributions of Test Statistics with Autograd

Kadhim, Ali Al, Prosper, Harrison B.

arXiv.org Machine Learning, May-3-2024

Automatic differentiation (see, for example, Ref. [1]) has revolutionized machine learning, permitting the routine application of gradient descent algorithms to fit models of essentially unlimited complexity to data. The same technology can be used to take the derivative of these models with respect to their inputs without the need to explicitly calculate the derivatives [2]. A potentially useful application of this capability is approximating the probability density function (pdf), f(x | θ), given an accurate neural network model of the associated conditional cumulative distribution function (cdf), F(x | θ), using the fact that $f(x \mid \theta) = \partial F(x \mid \theta) / \partial x$ (1), where θ are the parameters of the data-generation mechanism, which we distinguish from the parameters w of the neural network model. This paper explores this possibility in the context of simulation-based frequentist inference [3-7]. Equation (1) furnishes an approximation of the pdf f(x | θ) whether x is a function of the underlying observations D only or if x = λ(D; θ) is a test statistic that depends on D as well as on the parameters θ. Moreover, computing the derivative of the cdf using autograd to obtain the pdf is exact; autograd does not use finite difference approximations.
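The idea that the pdf is the derivative of the cdf with respect to x, computed exactly by automatic differentiation, can be sketched without any deep learning library. The example below is an illustration only and makes two assumptions: a logistic cdf with location θ stands in for the trained neural network model of F(x | θ), and a small forward-mode dual-number class (`Dual`, hypothetical) stands in for a full autograd engine.

```python
import math


class Dual:
    """Number value + deriv * eps with eps**2 == 0; deriv carries d/dx."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.value, -self.deriv)

    def __sub__(self, other):
        return self + (-Dual.lift(other))

    def __rtruediv__(self, other):
        # Computes other / self for a plain number `other`.
        return Dual(other / self.value,
                    -other * self.deriv / self.value ** 2)


def exp(z):
    e = math.exp(z.value)
    return Dual(e, e * z.deriv)


def cdf(x, theta):
    # Stand-in for the neural network model of F(x | theta): logistic cdf.
    return 1.0 / (1.0 + exp(-(x - theta)))


def pdf_via_autodiff(x, theta):
    # Seed dx/dx = 1 and read the derivative off the cdf output.
    return cdf(Dual(x, 1.0), theta).deriv


def pdf_closed_form(x, theta):
    e = math.exp(-(x - theta))
    return e / (1.0 + e) ** 2


def pdf_finite_difference(x, theta, h=1e-5):
    # Central difference, shown only for contrast with the exact derivative.
    return (cdf(Dual(x + h), theta).value
            - cdf(Dual(x - h), theta).value) / (2.0 * h)
```

Calling `pdf_via_autodiff(0.3, 1.0)` agrees with `pdf_closed_form(0.3, 1.0)` up to floating-point rounding, whereas `pdf_finite_difference` carries a truncation error that depends on the step size `h`. Frameworks used in practice typically rely on reverse-mode differentiation, but the exactness argument is the same.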

approximation, cdf, neural network, (15 more...)

arXiv.org Machine Learning

2405.02488

Country:

North America > United States > Florida > Leon County > Tallahassee (0.04)
Europe > Italy > Sardinia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Wildfire Risk Prediction: A Review

Xu, Zhengsen, Li, Jonathan, Xu, Linlin

arXiv.org Artificial Intelligence, May-2-2024

Wildfires have significant impacts on global vegetation, wildlife, and humans. They destroy plant communities and wildlife habitats and contribute to increased emissions of carbon dioxide, nitrogen oxides, methane, and other pollutants. The prediction of wildfires relies on various independent variables combined with regression or machine learning methods. In this technical review, we describe the options for independent variables, data processing techniques, models, independent variables collinearity and importance estimation methods, and model performance evaluation metrics. First, we divide the independent variables into 4 aspects, including climate and meteorology conditions, socio-economical factors, terrain and hydrological features, and wildfire historical records. Second, preprocessing methods are described for different magnitudes, different spatial-temporal resolutions, and different formats of data. Third, the collinearity and importance evaluation methods of independent variables are also considered. Fourth, we discuss the application of statistical models, traditional machine learning models, and deep learning models in wildfire risk prediction. In this subsection, compared with other reviews, this manuscript particularly discusses the evaluation metrics and recent advancements in deep learning methods. Lastly, addressing the limitations of current research, this paper emphasizes the need for more effective deep learning time series forecasting algorithms, the utilization of three-dimensional data including ground and trunk fuel, extraction of more accurate historical fire point data, and improved model evaluation metrics.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

2405.01607

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)


TextAge: A Curated and Diverse Text Dataset for Age Classification

Cheekati, Shravan, Gupta, Mridul, Raghu, Vibha, Raj, Pranav

arXiv.org Artificial Intelligence, May-2-2024

Age-related language patterns play a crucial role in understanding linguistic differences and developing age-appropriate communication strategies. However, the lack of comprehensive and diverse datasets has hindered the progress of research in this area. To address this issue, we present TextAge, a curated text dataset that maps sentences to the age and age group of the producer, as well as an underage (under 13) label. TextAge covers a wide range of ages and includes both spoken and written data from various sources such as CHILDES, Meta, Poki Poems-by-kids, JUSThink, and the TV show "Survivor." The dataset undergoes extensive cleaning and preprocessing to ensure data quality and consistency. We demonstrate the utility of TextAge through two applications: Underage Detection and Generational Classification. For Underage Detection, we train a Naive Bayes classifier, fine-tuned RoBERTa, and XLNet models to differentiate between language patterns of minors and young-adults and over. For Generational Classification, the models classify language patterns into different age groups (kids, teens, twenties, etc.). The models excel at classifying the "kids" group but struggle with older age groups, particularly "fifties," "sixties," and "seventies," likely due to limited data samples and less pronounced linguistic differences. TextAge offers a valuable resource for studying age-related language patterns and developing age-sensitive language models. The dataset's diverse composition and the promising results of the classification tasks highlight its potential for various applications, such as content moderation, targeted advertising, and age-appropriate communication. Future work aims to expand the dataset further and explore advanced modeling techniques to improve performance on older age groups.

age group, dataset, language pattern, (11 more...)

arXiv.org Artificial Intelligence

2406.1689

Genre: Research Report (0.64)

Industry:

Education (0.48)
Media > Television (0.35)
Leisure & Entertainment (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.58)


A probabilistic estimation of remaining useful life from censored time-to-event data

Lillelund, Christian Marius, Pannullo, Fernando, Jakobsen, Morten Opprud, Morante, Manuel, Pedersen, Christian Fischer

arXiv.org Artificial Intelligence, May-2-2024

Predicting the remaining useful life (RUL) of ball bearings plays an important role in predictive maintenance. A common definition of the RUL is the time until a bearing is no longer functional, which we denote as an event, and many data-driven methods have been proposed to predict the RUL. However, few studies have addressed the problem of censored data, where this event of interest is not observed, and simply ignoring these observations can lead to an overestimation of the failure risk. In this paper, we propose a probabilistic estimation of RUL using survival analysis that supports censored data. First, we analyze sensor readings from ball bearings in the frequency domain and annotate when a bearing starts to deteriorate by calculating the Kullback-Leibler (KL) divergence between the probability density function (PDF) of the current process and a reference PDF. Second, we train several survival models on the annotated bearing dataset, capable of predicting the RUL over a finite time horizon using the survival function. This function is guaranteed to be strictly monotonically decreasing and is an intuitive estimation of the remaining lifetime. We demonstrate our approach on the XJTU-SY dataset using cross-validation and find that Random Survival Forests consistently outperform both non-neural networks and neural networks in terms of the mean absolute error (MAE). Our work encourages the inclusion of censored data in predictive maintenance models and highlights the unique advantages that survival analysis offers when it comes to probabilistic RUL estimation and early fault detection.
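The effect of discarding censored observations can be illustrated with the Kaplan-Meier estimator of the survival function. This is a textbook estimator used here as a stand-in, not one of the survival models trained in the paper, and the six toy durations below are invented for the example.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time with an observed event.

    durations: time at which each unit failed or was censored.
    observed:  True if the failure was seen, False if the unit was censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival, curve = 1.0, []
    for t in event_times:
        # Censored units still count as at risk until their censoring time.
        at_risk = sum(1 for d in durations if d >= t)
        failures = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve


# Toy data (hypothetical): three failures, three censored units.
durations = [2, 3, 3, 5, 6, 8]
observed = [True, False, True, True, False, False]

curve_with_censoring = kaplan_meier(durations, observed)

# Naive alternative: throw the censored units away before estimating.
kept = [(d, e) for d, e in zip(durations, observed) if e]
curve_dropping_censored = kaplan_meier([d for d, _ in kept],
                                       [e for _, e in kept])

print(curve_with_censoring)     # survival at t = 5 is 4/9
print(curve_dropping_censored)  # survival at t = 5 is 0
```

With the censored units kept in the risk set, the estimated survival probability at t = 5 is 4/9; dropping them drives it to 0, which is the overestimation of failure risk that the abstract warns about.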

bearing, dataset, prediction, (17 more...)

arXiv.org Artificial Intelligence

2405.01614

Country:

Europe > Denmark (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine (1.00)
Energy (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)


Uncertainty for Active Learning on Graphs

Fuchsgruber, Dominik, Wollschläger, Tom, Charpentier, Bertrand, Oroz, Antonio, Günnemann, Stephan

arXiv.org Artificial Intelligence, May-2-2024

Uncertainty Sampling is an Active Learning strategy that aims to improve the data efficiency of machine learning models by iteratively acquiring labels of data points with the highest uncertainty. While it has proven effective for independent data, its applicability to graphs remains under-explored. We propose the first extensive study of Uncertainty Sampling for node classification: (1) We benchmark Uncertainty Sampling beyond predictive uncertainty and highlight a significant performance gap to other Active Learning strategies. (2) We develop ground-truth Bayesian uncertainty estimates in terms of the data generating process and prove their effectiveness in guiding Uncertainty Sampling toward optimal queries. We confirm our results on synthetic data and design an approximate approach that consistently outperforms other uncertainty estimators on real datasets. (3) Based on this analysis, we relate pitfalls in modeling uncertainty to existing methods. Our analysis enables and informs the development of principled uncertainty estimation on graphs.
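For readers unfamiliar with the baseline strategy, the acquisition step of Uncertainty Sampling can be sketched as picking the unlabeled point with the highest predictive entropy. This is only the generic form with independent predictions; it ignores graph structure and is not the estimator proposed in the paper. The node names and class probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_query(predictions, labeled):
    """Return the unlabeled node whose prediction has the highest entropy.

    predictions: dict mapping node id -> list of class probabilities.
    labeled:     set of node ids whose labels were already acquired.
    """
    candidates = [node for node in predictions if node not in labeled]
    return max(candidates, key=lambda node: entropy(predictions[node]))


# Hypothetical classifier outputs for three nodes and two classes.
predictions = {
    "a": [0.9, 0.1],
    "b": [0.5, 0.5],
    "c": [0.6, 0.4],
}

first_query = select_query(predictions, labeled=set())
second_query = select_query(predictions, labeled={first_query})
print(first_query, second_query)
```

In a full active learning loop the model would be retrained after each acquired label, so the probabilities in `predictions` would change between the two queries; they are held fixed here to keep the sketch short.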

active learning, classifier, graph, (13 more...)

arXiv.org Artificial Intelligence

2405.01462

Country:

North America > United States (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


ALCM: Autonomous LLM-Augmented Causal Discovery Framework

Khatibi, Elahe, Abbasian, Mahyar, Yang, Zhongqi, Azimi, Iman, Rahmani, Amir M.

arXiv.org Artificial Intelligence, May-2-2024

To perform effective causal inference in high-dimensional datasets, initiating the process with causal discovery is imperative, wherein a causal graph is generated based on observational data. However, obtaining a complete and accurate causal graph poses a formidable challenge, recognized as an NP-hard problem. Recently, the advent of Large Language Models (LLMs) has ushered in a new era, indicating their emergent capabilities and widespread applicability in facilitating causal reasoning across diverse domains, such as medicine, finance, and science. The expansive knowledge base of LLMs holds the potential to elevate the field of causal reasoning by offering interpretability, making inferences, generalizability, and uncovering novel causal structures. In this paper, we introduce a new framework, named Autonomous LLM-Augmented Causal Discovery Framework (ALCM), to synergize data-driven causal discovery algorithms and LLMs, automating the generation of a more resilient, accurate, and explicable causal graph. The ALCM consists of three integral components: causal structure learning, causal wrapper, and LLM-driven causal refiner. These components autonomously collaborate within a dynamic environment to address causal discovery questions and deliver plausible causal graphs. We evaluate the ALCM framework by implementing two demonstrations on seven well-known datasets. Experimental results demonstrate that ALCM outperforms existing LLM methods and conventional data-driven causal reasoning mechanisms. This study not only shows the effectiveness of the ALCM but also underscores new research directions in leveraging the causal reasoning capabilities of LLMs.

algorithm, causal discovery algorithm, discovery algorithm, (12 more...)

arXiv.org Artificial Intelligence

2405.01744

Country:

Asia (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


Error-Driven Uncertainty Aware Training

Mendes, Pedro, Romano, Paolo, Garlan, David

arXiv.org Artificial Intelligence, May-2-2024

Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this work, we present a novel technique, named Error-Driven Uncertainty Aware Training (EUAT), which aims to enhance the ability of neural models to estimate their uncertainty correctly, namely to be highly uncertain when they output inaccurate predictions and to have low uncertainty when their output is accurate. The EUAT approach operates during the model's training phase by selectively employing two loss functions depending on whether the training examples are correctly or incorrectly predicted by the model. This allows for pursuing the twofold goal of i) minimizing model uncertainty for correctly predicted inputs and ii) maximizing uncertainty for mispredicted inputs, while preserving the model's misprediction rate. We evaluate EUAT using diverse neural models and datasets in the image recognition domains considering both non-adversarial and adversarial settings. The results show that EUAT outperforms existing approaches for uncertainty estimation (including other uncertainty-aware training techniques, calibration, ensembles, and DEUP) by providing uncertainty estimates that not only have higher quality when evaluated via statistical metrics (e.g., correlation with residuals) but also when employed to build binary classifiers that decide whether the model's output can be trusted or not and under distributional data shifts.

baseline, euat, prediction, (14 more...)

arXiv.org Artificial Intelligence

2405.01205

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Heterogeneous network and graph attention auto-encoder for LncRNA-disease association prediction

Liu, Jin-Xing, Xi, Wen-Yu, Dai, Ling-Yun, Zheng, Chun-Hou, Gao, Ying-Lian

arXiv.org Artificial Intelligence, May-2-2024

Emerging research shows that lncRNAs are associated with a series of complex human diseases. However, most of the existing methods have limitations in identifying nonlinear lncRNA-disease associations (LDAs), and it remains a huge challenge to predict new LDAs. Therefore, the accurate identification of LDAs is very important for the warning and treatment of diseases. In this work, multiple sources of biomedical data are fully utilized to construct characteristics of lncRNAs and diseases, and linear and nonlinear characteristics are effectively integrated. Furthermore, a novel deep learning model based on a graph attention auto-encoder, called HGATELDA, is proposed. To begin with, the linear characteristics of lncRNAs and diseases are created by the miRNA-lncRNA interaction matrix and miRNA-disease interaction matrix. Following this, the nonlinear features of diseases and lncRNAs are extracted using a graph attention auto-encoder, which largely retains the critical information and effectively aggregates the neighborhood information of nodes. In the end, LDAs can be predicted by fusing the linear and nonlinear characteristics of diseases and lncRNAs. The HGATELDA model achieves an impressive AUC value of 0.9692 when evaluated using 5-fold cross-validation, indicating its superior performance in comparison to several recent prediction models. Meanwhile, the effectiveness of HGATELDA in identifying novel LDAs is further demonstrated by case studies. The HGATELDA model appears to be a viable computational model for predicting LDAs.

bioinformatics, lncrna, prediction, (15 more...)

arXiv.org Artificial Intelligence

2405.02354

Country:

Asia > China > Shandong Province > Qingdao (0.04)
Europe > United Kingdom (0.04)
Europe > Ireland (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Multivariate Bayesian Last Layer for Regression: Uncertainty Quantification and Disentanglement

Wang, Han, Kawasaki, Eiji, Damblin, Guillaume, Daniel, Geoffrey

arXiv.org Machine Learning, May-2-2024

We present new Bayesian Last Layer models in the setting of multivariate regression under heteroscedastic noise, and propose an optimization algorithm for parameter learning. Bayesian Last Layer combines Bayesian modelling of the predictive distribution with neural networks for parameterization of the prior, and has the attractive property of uncertainty quantification with a single forward pass. The proposed framework is capable of disentangling the aleatoric and epistemic uncertainty, and can be used to transfer a canonically trained deep neural network to new data domains with uncertainty-aware capability.

epistemic uncertainty, matrix, xx 1, (13 more...)

arXiv.org Machine Learning

2405.01761

Country:

Europe > France (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
