AITopics

doi: 10.1109/TAES.2024.3517576

1912.08718

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Norfolk County > Norwood (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Hu, Yunqiao, Sun, Shunqiao, Zhang, Yimin D.

Enhancing Off-Grid One-Bit DOA Estimation with Learning-Based Sparse Bayesian Approach for Non-Uniform Sparse Array

arXiv.org Machine Learning, Dec-14-2024

This paper tackles the challenge of one-bit off-grid direction of arrival (DOA) estimation in a single-snapshot scenario using a learning-based Bayesian approach. First, we formulate the off-grid DOA estimation model, using the first-order off-grid approximation and incorporating one-bit data quantization. We then address this problem within a sparse Bayesian framework and solve it iteratively. However, traditional sparse Bayesian methods often face challenges such as high computational complexity and the need for extensive hyperparameter tuning. To balance estimation accuracy and computational efficiency, we propose a novel learning-based sparse Bayesian framework, which leverages an unrolled neural network architecture. This framework autonomously learns hyperparameters through supervised learning, offering more accurate off-grid DOA estimates and improved computational efficiency compared to some state-of-the-art methods. Furthermore, the proposed approach is applicable to both uniform linear arrays and non-uniform sparse arrays. Simulation results validate the effectiveness of the proposed framework.
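
For orientation, the first-order off-grid model with one-bit quantization is commonly written as below. This is a sketch in assumed notation, not taken from the abstract: \tilde{\theta}_n are grid angles, \beta_n are off-grid offsets, \mathbf{A} and \mathbf{B} stack the steering vectors and their derivatives on the grid, \mathbf{x} is the sparse source vector, and \mathbf{n} is noise. The paper's own formulation may differ in detail.

```latex
% Assumed notation (hypothetical, for illustration only).
\begin{align}
\mathbf{a}(\theta_k) &\approx \mathbf{a}(\tilde{\theta}_{n_k})
  + \mathbf{b}(\tilde{\theta}_{n_k})\,(\theta_k - \tilde{\theta}_{n_k}),
  \qquad \mathbf{b}(\theta) = \frac{\partial \mathbf{a}(\theta)}{\partial \theta}, \\
\mathbf{y} &= \mathcal{Q}\!\big( (\mathbf{A} + \mathbf{B}\,\mathrm{diag}(\boldsymbol{\beta}))\,\mathbf{x} + \mathbf{n} \big), \\
\mathcal{Q}(z) &= \mathrm{sign}(\mathrm{Re}\, z) + j\,\mathrm{sign}(\mathrm{Im}\, z).
\end{align}
```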

artificial intelligence, bayesian inference, machine learning, (13 more...)

2412.10976

Country:

North America > United States > Alabama > Tuscaloosa County > Tuscaloosa (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Li, Shenxiong, Rui, Huaxia

Dual Traits in Probabilistic Reasoning of Large Language Models

arXiv.org Artificial Intelligence, Dec-14-2024

We conducted three experiments to investigate how large language models (LLMs) evaluate posterior probabilities. Our results reveal the coexistence of two modes in posterior judgment among state-of-the-art models: a normative mode, which adheres to Bayes' rule, and a representative-based mode, which relies on similarity -- paralleling human System 1 and System 2 thinking. Additionally, we observed that LLMs struggle to recall base rate information from their memory, and developing prompt engineering strategies to mitigate representative-based judgment may be challenging. We further conjecture that the dual modes of judgment may be a result of the contrastive loss function employed in reinforcement learning from human feedback. Our findings underscore the potential direction for reducing cognitive biases in LLMs and the necessity for cautious deployment of LLMs in critical areas.

The remarkable advancements in large language models (LLMs) have ushered in a new era where these models rival human expertise across domains like academia, law, medicine, and finance [4, 12, 13, 22-24]. In this study, we explore how LLMs judge this posterior probability. A higher similarity corresponds to a higher assessed posterior probability. This study comprises three experiments with progressively stricter conditions, reducing the information available for posterior likelihood assessment. The structured test provides all information needed for normative judgment, the semi-structured test omits the diagnosticity of evidence, and the unstructured test requires LLMs to recall all components of Bayes' rule. Results reveal that LLMs' judgments shift from [...]

Representativeness can be constructed through typicality or prototypicality. Typicality describes the common or average case of the class, whereas prototypicality embodies the most idealized and iconic version of the class. For instance, a typical example of a physicist is a smart man who likes math and physics, while a prototypical example of a physicist is Stephen Hawking. This study moves beyond bias detection to investigate the basis upon which LLMs assess probabilities. This has important practical implications for the integration of LLMs into various critical fields.
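
To make the contrast between the normative and similarity-driven modes concrete, here is a minimal sketch of a Bayes' rule calculation for a binary hypothesis. The function name and all numbers are hypothetical and not taken from the paper; the point is only that a judgment tracking similarity alone (the 0.9 likelihood) can sit far from the posterior once a low base rate is taken into account.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) by Bayes' rule for a binary hypothesis H and evidence E.

    prior               -- base rate P(H)
    likelihood          -- P(E | H), how well the evidence "resembles" H
    false_positive_rate -- P(E | not H)
    """
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a 1% base rate, evidence that is 90% likely under H
# and 10% likely under not-H.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.1)
print(round(p, 4))  # normative posterior, well below the 0.9 likelihood
```

With equal priors (`prior=0.5`) the same evidence yields a posterior equal to the likelihood, which is the one case where similarity and Bayes' rule agree.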

large language model, machine learning, natural language, (20 more...)

2412.11009

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

arXiv.org Artificial Intelligence, Dec-13-2024

Evidential time-to-event prediction with calibrated uncertainty quantification

Huang, Ling, Xing, Yucheng, Mishra, Swapnil, Denoeux, Thierry, Feng, Mengling

Time-to-event analysis provides insights into clinical prognosis and treatment recommendations. However, this task is more challenging than standard regression problems due to the presence of censored observations. Additionally, the lack of confidence assessment, model robustness, and prediction calibration raises concerns about the reliability of predictions. To address these challenges, we propose an evidential regression model specifically designed for time-to-event prediction. The proposed model quantifies both epistemic and aleatory uncertainties using Gaussian Random Fuzzy Numbers and belief functions, providing clinicians with uncertainty-aware survival time predictions. The model is trained by minimizing a generalized negative log-likelihood function accounting for data censoring. Experimental evaluations using simulated datasets with different data distributions and censoring conditions, as well as real-world datasets across diverse clinical applications, demonstrate that our model delivers both accurate and reliable performance, outperforming state-of-the-art methods. These results highlight the potential of our approach for enhancing clinical decision-making in survival analysis.
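
As a rough illustration of what "a negative log-likelihood accounting for data censoring" means, the sketch below uses a plain exponential survival model with right censoring. This is a deliberately simpler stand-in, not the paper's evidential model with Gaussian random fuzzy numbers; the function name and the data are hypothetical. Observed events contribute the log density, while censored cases contribute only the log survival probability.

```python
import math


def censored_nll(rate, times, observed):
    """Negative log-likelihood of an exponential model with right censoring."""
    nll = 0.0
    for t, event_seen in zip(times, observed):
        if event_seen:
            # Event observed at time t: use log f(t) = log(rate) - rate * t.
            nll -= math.log(rate) - rate * t
        else:
            # Censored at time t: we only know survival past t, log S(t) = -rate * t.
            nll -= -rate * t
    return nll


# Hypothetical follow-up times; False marks a censored observation.
times = [2.0, 3.0, 5.0, 4.0]
observed = [True, False, True, False]

# For this model the maximizer has a closed form: events / total follow-up time.
rate_mle = sum(observed) / sum(times)
print(rate_mle, censored_nll(rate_mle, times, observed))
```

Dropping the censored cases instead of using their survival term would overestimate the event rate, which is the bias a censoring-aware loss is meant to avoid.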

artificial intelligence, bayesian inference, machine learning, (17 more...)

2411.07853

Country:

Asia > Singapore > Central Region > Singapore (0.04)
North America > United States (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre:

Research Report > Experimental Study (0.94)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Law > Civil Rights & Constitutional Law (0.78)
Health & Medicine > Therapeutic Area > Hematology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Dec-12-2024

Self-test loss functions for learning weak-form operators and gradient flows

Gao, Yuan, Lang, Quanjun, Lu, Fei

The construction of loss functions presents a major challenge in data-driven modeling involving weak-form operators in PDEs and gradient flows, particularly due to the need to select test functions appropriately. We address this challenge by introducing self-test loss functions, which employ test functions that depend on the unknown parameters, specifically for cases where the operator depends linearly on the unknowns. The proposed self-test loss function conserves energy for gradient flows and coincides with the expected log-likelihood ratio for stochastic differential equations. Importantly, it is quadratic, facilitating theoretical analysis of identifiability and well-posedness of the inverse problem, while also leading to efficient parametric or nonparametric regression algorithms. It is computationally simple, requiring only low-order derivatives or even being entirely derivative-free, and numerical experiments demonstrate its robustness against noisy and discrete data.

artificial intelligence, loss function, machine learning, (19 more...)

2412.03506

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Martinez, Azucena L. Jimenez, Sood, Kanika, Mahto, Rakeshkumar

Early Detection of At-Risk Students Using Machine Learning

arXiv.org Artificial Intelligence, Dec-12-2024

This research presents preliminary work to address the challenge of identifying at-risk students using supervised machine learning and three data categories: engagement, demographics, and performance data collected from Fall 2023 using Canvas and the California State University, Fullerton dashboard. We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students and building a high-risk identification system. By focusing on previously overlooked behavioral factors alongside traditional metrics, this work aims to address educational gaps, enhance student outcomes, and significantly boost student success across disciplines at the University. Pre-processing steps establish a target variable, anonymize student information, manage missing data, and identify the most significant features. Given the mixed data types in the datasets and the binary classification nature of this study, this work considers several machine learning models, including Support Vector Machines (SVM), Naive Bayes, K-nearest neighbors (KNN), Decision Trees, Logistic Regression, and Random Forest. These models predict at-risk students and identify critical periods of the semester when student performance is most vulnerable. We use validation techniques such as train-test split and k-fold cross-validation to ensure the reliability of the models. Our analysis indicates that all algorithms generate an acceptable outcome for at-risk student predictions, while Naive Bayes performs best overall.
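
The k-fold cross-validation mentioned above can be sketched in a few lines of standard-library Python. Everything here is hypothetical: the helper names, the toy labels, and the majority-class baseline, which stands in for the real classifiers (SVM, Naive Bayes, and so on) only to keep the example dependency-free.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def majority_label(labels):
    """Baseline 'model': always predict the most frequent training label."""
    return max(set(labels), key=labels.count)


def cross_val_accuracy(labels, k=5):
    """Mean held-out accuracy of the majority baseline over k folds."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        guess = majority_label([labels[j] for j in train])
        scores.append(sum(labels[j] == guess for j in test) / len(test))
    return sum(scores) / len(scores)


# Hypothetical labels: 1 = at-risk, 0 = not at-risk.
labels = [0] * 14 + [1] * 6
acc = cross_val_accuracy(labels, k=5)
print(acc)
```

A baseline like this is a useful reference point: on imbalanced data, a classifier that does not beat the majority-class accuracy has learned little, whatever its headline score.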

artificial intelligence, machine learning, student, (14 more...)

2412.09483

Country:

North America > United States > California > Orange County > Fullerton (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > South Korea > Incheon > Incheon (0.04)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Industry: Education > Educational Setting > Higher Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Kwisthout, Johan, Schroeder, Andrew

Speeding up approximate MAP by applying domain knowledge about relevant variables

arXiv.org Artificial Intelligence, Dec-12-2024

The MAP problem in Bayesian networks is notoriously intractable, even when approximated. In an earlier paper we introduced the Most Frugal Explanation heuristic approach to solving MAP, by partitioning the set of intermediate variables (neither observed nor part of the MAP variables) into a set of relevant variables, which are marginalized out, and irrelevant variables, which will be assigned a sampled value from their domain. In this study we explore whether knowledge about which variables are relevant for a particular query (i.e., domain knowledge) speeds up computation sufficiently to beat both exact MAP as well as approximate MAP while giving reasonably accurate results. Our results are inconclusive, but also show that this probably depends on the specifics of the MAP query, most prominently the number of MAP variables.
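
The partitioning idea can be illustrated on a toy network. The sketch below is a simplified, single-sample illustration under assumed numbers, not the authors' algorithm: `H` is the MAP variable, `R` an intermediate variable treated as relevant (summed out), and `S` an intermediate variable treated as irrelevant (fixed to one sampled value). All names and probabilities are hypothetical.

```python
import random

# Hypothetical priors for a toy network with binary variables.
P_H = {0: 0.6, 1: 0.4}  # MAP variable
P_R = {0: 0.7, 1: 0.3}  # "relevant" intermediate variable
P_S = {0: 0.5, 1: 0.5}  # "irrelevant" intermediate variable


def p_evidence(h, r, s):
    """P(E = observed | h, r, s); S has only a small influence by construction."""
    return 0.1 + 0.6 * h + 0.2 * r + 0.02 * s


def exact_map():
    """Exact MAP: marginalize over every intermediate variable."""
    def score(h):
        return P_H[h] * sum(
            P_R[r] * P_S[s] * p_evidence(h, r, s) for r in P_R for s in P_S
        )
    return max(P_H, key=score)


def frugal_map(rng):
    """Frugal variant: sum out R, but fix S to a single sampled value."""
    s = rng.choices(list(P_S), weights=list(P_S.values()))[0]

    def score(h):
        return P_H[h] * sum(P_R[r] * p_evidence(h, r, s) for r in P_R)
    return max(P_H, key=score)


best_exact = exact_map()
best_frugal = frugal_map(random.Random(0))
print(best_exact, best_frugal)
```

The saving comes from the inner sum: each variable moved from the marginalized set to the sampled set halves the number of terms for binary variables, at the cost of a result that can depend on the sample when the variable is not truly irrelevant.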

artificial intelligence, bayesian inference, machine learning, (19 more...)

2412.09264

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)

arXiv.org Artificial Intelligence, Dec-12-2024

Temporal Causal Discovery in Dynamic Bayesian Networks Using Federated Learning

Chen, Jianhong, Ma, Ying, Yue, Xubo

Traditionally, learning the structure of a Dynamic Bayesian Network has been centralized, with all data pooled in one location. However, in real-world scenarios, data are often dispersed among multiple parties (e.g., companies, devices) that aim to collaboratively learn a Dynamic Bayesian Network while preserving their data privacy and security. In this study, we introduce a federated learning approach for estimating the structure of a Dynamic Bayesian Network from data distributed horizontally across different parties. We propose a distributed structure learning method that leverages continuous optimization so that only model parameters are exchanged during optimization. Experimental results on synthetic and real datasets reveal that our method outperforms other state-of-the-art techniques, particularly when there are many clients with limited individual sample sizes.

artificial intelligence, bayesian network, machine learning, (15 more...)

2412.09814

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, Dec-12-2024

Stochastic Learning of Non-Conjugate Variational Posterior for Image Classification

Lim, Kart-Leong

Large-scale Bayesian nonparametric (BNP) learners such as stochastic variational inference (SVI) can handle datasets with a large number of classes and a large training size at a fraction of the cost. Like its predecessor, SVI relies on the assumption of a conjugate variational posterior to approximate the true posterior. A more challenging problem is large-scale learning with a non-conjugate posterior. Recent works in this direction mostly use Monte Carlo methods to approximate the learner. However, because of their higher computational complexity, these works are usually demonstrated on non-BNP tasks and less complex models such as logistic regression. To overcome the issue faced by SVI, we develop a novel approach based on the recently proposed variational maximization-maximization (VMM) learner to allow large-scale learning with a non-conjugate posterior. Unlike SVI, our VMM learner does not require closed-form expressions for the variational posterior expectations. Our only requirement is that the variational posterior is differentiable. To ensure convergence in stochastic settings, SVI relies on decaying step-sizes to slow its learning. Inspired by SVI and Adam, we propose the novel use of decaying step-sizes on both the gradient and the ascent direction in our VMM to significantly improve its learning. We show that our proposed method is compatible with ResNet features when applied to datasets with a large number of classes, such as MIT67 and SUN397. Finally, we compare our proposed learner with several recent works, such as deep clustering algorithms, and show that it performs on par with or better than state-of-the-art methods in terms of clustering measures.

dataset, iteration, posterior, (16 more...)

2412.08951

Country: Asia > Middle East > Jordan (0.05)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

arXiv.org Machine Learning, Dec-11-2024

Bayesian optimized deep ensemble for uncertainty quantification of deep neural networks: a system safety case study on sodium fast reactor thermal stratification modeling

Abulawi, Zaid, Hu, Rui, Balaprakash, Prasanna, Liu, Yang

Accurate predictions and uncertainty quantification (UQ) are essential for decision-making in risk-sensitive fields such as system safety modeling. Deep ensembles (DEs) are efficient and scalable methods for UQ in Deep Neural Networks (DNNs); however, their performance is limited when constructed by simply retraining the same DNN multiple times with randomly sampled initializations. To overcome this limitation, we propose a novel method that combines Bayesian optimization (BO) with DE, referred to as BODE, to enhance both predictive accuracy and UQ. We apply BODE to a case study involving a Densely connected Convolutional Neural Network (DCNN) trained on computational fluid dynamics (CFD) data to predict eddy viscosity in sodium fast reactor thermal stratification modeling. Compared to a manually tuned baseline ensemble, BODE estimates total uncertainty approximately four times lower in a noise-free environment, primarily due to the baseline's overestimation of aleatoric uncertainty. Specifically, BODE estimates aleatoric uncertainty close to zero, while aleatoric uncertainty dominates the total uncertainty in the baseline ensemble. We also observe a reduction of more than 30% in epistemic uncertainty. When Gaussian noise with standard deviations of 5% and 10% is introduced into the data, BODE accurately fits the data and estimates uncertainty that aligns with the data noise. These results demonstrate that BODE effectively reduces uncertainty and enhances predictions in data-driven models, making it a flexible approach for various applications requiring accurate predictions and robust UQ.
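
The split into aleatoric and epistemic uncertainty is often computed from a deep ensemble as follows: each member predicts a mean and a variance, the average predicted variance is read as aleatoric uncertainty, and the spread of the member means as epistemic uncertainty. The sketch below assumes this common decomposition; whether BODE uses exactly this form is an assumption, and the function name and numbers are hypothetical.

```python
def decompose_uncertainty(means, variances):
    """Split ensemble predictive variance into aleatoric and epistemic parts.

    means     -- predicted mean from each ensemble member
    variances -- predicted (noise) variance from each ensemble member
    """
    m = len(means)
    mean_pred = sum(means) / m
    aleatoric = sum(variances) / m                            # average predicted noise
    epistemic = sum((mu - mean_pred) ** 2 for mu in means) / m  # disagreement between members
    return mean_pred, aleatoric, epistemic, aleatoric + epistemic


# Hypothetical outputs of a three-member ensemble at one input point.
mean_pred, aleatoric, epistemic, total = decompose_uncertainty(
    means=[1.0, 1.2, 0.8],
    variances=[0.04, 0.05, 0.03],
)
print(mean_pred, aleatoric, epistemic, total)
```

Under this reading, a model trained on noise-free data should report aleatoric uncertainty near zero, so a baseline whose total is dominated by the aleatoric term is overestimating data noise.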

artificial intelligence, machine learning, noise, (17 more...)

2412.08776

Country: North America > United States > Texas (0.14)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.66)

Industry:

Energy > Power Industry > Utilities > Nuclear (1.00)
Health & Medicine (0.93)
Energy > Oil & Gas > Upstream (0.93)
Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)