AITopics

2405.20918

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > India > Tamil Nadu (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

van Leeuwen, Peter Jan, Chiu, J. Christine, Yang, C. Kevin

Uncertainty Quantification for Deep Learning

arXiv.org Machine LearningMay-30-2024

A complete and statistically consistent uncertainty quantification for deep learning is provided, including the sources of uncertainty arising from (1) the new input data, (2) the training and testing data (3) the weight vectors of the neural network, and (4) the neural network because it is not a perfect predictor. Using Bayes Theorem and conditional probability densities, we demonstrate how each uncertainty source can be systematically quantified. We also introduce a fast and practical way to incorporate and combine all sources of errors for the first time. For illustration, the new method is applied to quantify errors in cloud autoconversion rates, predicted from an artificial neural network that was trained by aircraft cloud probe measurements in the Azores and the stochastic collection equation formulated as a two-moment bin model. For this specific example, the output uncertainty arising from uncertainty in the training and testing data is dominant, followed by uncertainty in the input data, in the trained neural network, and uncertainty in the weights. We discuss the usefulness of the methodology for machine learning practice, and how, through inclusion of uncertainty in the training data, the new methodology is less sensitive to input data that falls outside of the training data set.

training data, vector, weight vector, (15 more...)

2405.2055

Country:

Europe > Portugal > Azores (0.24)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Colorado > Larimer County > Fort Collins (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Wei, Chengwei, Koo, Kee Kiat, Tavanaei, Amir, Bouyarmane, Karim

Confidence-Aware Sub-Structure Beam Search (CABS): Mitigating Hallucination in Structured Data Generation with Large Language Models

Large Language Models (LLMs) have facilitated structured data generation, with applications in domains like tabular data, document databases, product catalogs, etc. However, concerns persist about generation veracity due to incorrect references or hallucinations, necessitating the incorporation of some form of model confidence for mitigation. Existing confidence estimation methods on LLM generations primarily focus on the confidence at the individual token level or the entire output sequence level, limiting their applicability to structured data generation, which consists of an intricate mix of both independent and correlated entries at the sub-structure level. In this paper, we first investigate confidence estimation methods for generated sub-structure-level data. We introduce the concept of Confidence Network that applies on the hidden state of the LLM transformer, as a more targeted estimate than the traditional token conditional probability. We further propose Confidence-Aware sub-structure Beam Search (CABS), a novel decoding method operating at the sub-structure level in structured data generation. CABS enhances the faithfulness of structured data generation by considering confidence scores from the Confidence Network for each sub-structure-level data and iteratively refining the prompts. Results show that CABS outperforms traditional token-level beam search for structured data generation by 16.7% Recall at 90% precision averagely on the problem of product attribute generation.

confidence network, data generation, sub-structure sequence, (11 more...)

2406.00069

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.37)

Deep Modeling of Non-Gaussian Aleatoric Uncertainty

Acharya, Aastha, Lee, Caleb, D'Alonzo, Marissa, Shamwell, Jared, Ahmed, Nisar R., Russell, Rebecca

Deep learning offers promising new ways to accurately model aleatoric uncertainty in robotic estimation systems, particularly when the uncertainty distributions do not conform to traditional assumptions of being fixed and Gaussian. In this study, we formulate and evaluate three fundamental deep learning approaches for conditional probability density modeling to quantify non-Gaussian aleatoric uncertainty: parametric, discretized, and generative modeling. We systematically compare the respective strengths and weaknesses of these three methods on simulated non-Gaussian densities as well as on real-world terrain-relative navigation data. Our results show that these deep learning methods can accurately capture complex uncertainty patterns, highlighting their potential for improving the reliability and robustness of estimation systems.

density estimation, estimation, uncertainty model, (16 more...)

2405.20513

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Massari, Hakim El, Gherabi, Noreddine, Mhammedi, Sajida, Ghandi, Hamza, Bahaj, Mohamed, Naqvi, Muhammad Raza

The Impact of Ontology on the Prediction of Cardiovascular Disease Compared to Machine Learning Algorithms

Cardiovascular disease is one of the chronic diseases that is on the rise. The complications occur when cardiovascular disease is not discovered early and correctly diagnosed at the right time. Various machine learning approaches, including ontology-based Machine Learning techniques, have lately played an essential role in medical science by building an automated system that can identify heart illness. This paper compares and reviews the most prominent machine learning algorithms, as well as ontology-based Machine Learning classification. Random Forest, Logistic regression, Decision Tree, Naive Bayes, k-Nearest Neighbours, Artificial Neural Network, and Support Vector Machine were among the classification methods explored. The dataset used consists of 70000 instances and can be downloaded from the Kaggle website. The findings are assessed using performance measures generated from the confusion matrix, such as F-Measure, Accuracy, Recall, and Precision. The results showed that the ontology outperformed all the machine learning algorithms.

algorithm, cardiovascular disease, prediction, (12 more...)

doi: 10.3991/ijoe.v18i11.32647

2405.20414

Country:

Asia > Middle East > Iraq (0.05)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Africa > Middle East > Morocco > Béni Mellal-Khénifra Region > Beni Mellal (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Exploring Key Factors for Long-Term Vessel Incident Risk Prediction

Chen, Tianyi, Wang, Hua, Cai, Yutong, Liang, Maohan, Meng, Qiang

Most previous studies conduct factor analysis within the framework of incident-related label prediction, where the developed models can be categorized into short-term and long-term prediction models. The long-term models offer a more strategic approach, enabling more proactive risk management, compared to the short-term ones. Nevertheless, few studies have devoted to rigorously identifying the key factors for the long-term prediction and undertaking comprehensive factor analysis. Hence, this study aims to delve into the key factors for predicting the incident risk levels in the subsequent year given a specific datestamp. The majority of candidate factors potentially contributing to the incident risk are collected from vessels' historical safety performance data spanning up to five years. An improved embedded feature selection method, which integrates Random Forest classifier with a feature filtering process, is proposed to identify key risk-contributing factors from the candidate pool. A dataset with information of 131,652 vessels collected from 2015 to 2023 is utilized for case study. The results demonstrate superior performances of the proposed method in incident prediction and factor interpretability. Comprehensive analysis is conducted upon the key factors, which could help maritime stakeholders formulate management strategies for incident prevention.

category, deficiency, incident, (12 more...)

2405.19804

Country:

Asia > Singapore > Central Region > Singapore (0.04)
Asia > China > Anhui Province > Hefei (0.04)
South America > Chile (0.04)
(4 more...)

Genre: Research Report > New Finding (0.88)

Industry: Transportation > Marine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Messenger, Daniel, Dwyer, Greg, Dukic, Vanja

Weak-Form Inference for Hybrid Dynamical Systems in Ecology

Species subject to predation and environmental threats commonly exhibit variable periods of population boom and bust over long timescales. Understanding and predicting such behavior, especially given the inherent heterogeneity and stochasticity of exogenous driving factors over short timescales, is an ongoing challenge. A modeling paradigm gaining popularity in the ecological sciences for such multi-scale effects is to couple short-term continuous dynamics to long-term discrete updates. We develop a data-driven method utilizing weak-form equation learning to extract such hybrid governing equations for population dynamics and to estimate the requisite parameters using sparse intermittent measurements of the discrete and continuous variables. The method produces a set of short-term continuous dynamical system equations parametrized by long-term variables, and long-term discrete equations parametrized by short-term variables, allowing direct assessment of interdependencies between the two time scales. We demonstrate the utility of the method on a variety of ecological scenarios and provide extensive tests using models previously derived for epizootics experienced by the North American spongy moth (Lymantria dispar dispar).

equation, identification, noise, (15 more...)

2405.20591

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > Canada > British Columbia (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(3 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
(3 more...)

Estimating before Debiasing: A Bayesian Approach to Detaching Prior Bias in Federated Semi-Supervised Learning

Zhu, Guogang, Liu, Xuefeng, Wu, Xinghao, Tang, Shaojie, Tang, Chao, Niu, Jianwei, Su, Hao

Federated Semi-Supervised Learning (FSSL) leverages both labeled and unlabeled data on clients to collaboratively train a model.In FSSL, the heterogeneous data can introduce prediction bias into the model, causing the model's prediction to skew towards some certain classes. Existing FSSL methods primarily tackle this issue by enhancing consistency in model parameters or outputs. However, as the models themselves are biased, merely constraining their consistency is not sufficient to alleviate prediction bias. In this paper, we explore this bias from a Bayesian perspective and demonstrate that it principally originates from label prior bias within the training data. Building upon this insight, we propose a debiasing method for FSSL named FedDB. FedDB utilizes the Average Prediction Probability of Unlabeled Data (APP-U) to approximate the biased prior.During local training, FedDB employs APP-U to refine pseudo-labeling through Bayes' theorem, thereby significantly reducing the label prior bias. Concurrently, during the model aggregation, FedDB uses APP-U from participating clients to formulate unbiased aggregate weights, thereby effectively diminishing bias in the global model. Experimental results show that FedDB can surpass existing FSSL methods. The code is available at https://github.com/GuogangZhu/FedDB.

app-u, learning, unlabeled data, (14 more...)

2405.19789

Country:

North America > United States > Virginia (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Texas > Dallas County > Richardson (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Qiu, Jiaxing, Lake, Douglas E., Henry, Teague R.

Fast leave-one-cluster-out cross-validation by clustered Network Information Criteria (NICc)

arXiv.org Machine LearningMay-30-2024

This paper introduced a clustered estimator of the Network Information Criterion (NICc) to approximate leave-one-cluster-out cross-validated deviance, which can be used as an alternative to cluster-based cross-validation when modeling clustered data. Stone proved that Akaike Information Criterion (AIC) is an asymptotic equivalence to leave-one-observation-out cross-validation if the parametric model is true. Ripley pointed out that the Network Information Criterion (NIC) derived in Stone's proof, is a better approximation to leave-one-observation-out cross-validation when the model is not true. For clustered data, we derived a clustered estimator of NIC, referred to as NICc, by substituting the Fisher information matrix in NIC with its estimator that adjusts for clustering. This adjustment imposes a larger penalty in NICc than the unclustered estimator of NIC when modeling clustered data, thereby preventing overfitting more effectively. In a simulation study and an empirical example, we used linear and logistic regression to model clustered data with Gaussian or binomial response, respectively. We showed that NICc is a better approximation to leave-one-cluster-out deviance and prevents overfitting more effectively than AIC and Bayesian Information Criterion (BIC). NICc leads to more accurate model selection, as determined by cluster-based cross-validation, compared to AIC and BIC.

information criterion, loodeviance, nicc, (12 more...)

2405.204

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
Europe > Netherlands > South Holland > Dordrecht (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(6 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Agrawal, Abhinav, Domke, Justin

Understanding and mitigating difficulties in posterior predictive evaluation

arXiv.org Machine LearningMay-30-2024

Predictive posterior densities (PPDs) are of interest in approximate Bayesian inference. Typically, these are estimated by simple Monte Carlo (MC) averages using samples from the approximate posterior. We observe that the signal-to-noise ratio (SNR) of such estimators can be extremely low. An analysis for exact inference reveals SNR decays exponentially as there is an increase in (a) the mismatch between training and test data, (b) the dimensionality of the latent space, or (c) the size of the test data relative to the training data. Further analysis extends these results to approximate inference. To remedy the low SNR problem, we propose replacing simple MC sampling with importance sampling using a proposal distribution optimized at test time on a variational proxy for the SNR and demonstrate that this yields greatly improved estimates.

estimator, inference, snr, (15 more...)

2405.19747

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)