AITopics

2410.08949

Country:

North America > United States (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

arXiv.org Machine Learning Oct-17-2024

Generalization Error of the Tilted Empirical Risk

Aminian, Gholamali, Asadi, Amir R., Li, Tian, Beirami, Ahmad, Reinert, Gesine, Cohen, Samuel N.

The generalization error (risk) of a supervised statistical learning algorithm quantifies its prediction ability on previously unseen data. Inspired by exponential tilting, Li et al. (2021) proposed the tilted empirical risk as a non-linear risk metric for machine learning applications such as classification and regression problems. In this work, we examine the generalization error of the tilted empirical risk. In particular, we provide uniform and information-theoretic bounds on the tilted generalization error, defined as the difference between the population risk and the tilted empirical risk, with a convergence rate of $O(1/\sqrt{n})$ where $n$ is the number of training samples. Furthermore, we study the solution to the KL-regularized expected tilted empirical risk minimization problem and derive an upper bound on the expected tilted generalization error with a convergence rate of $O(1/n)$.
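To make the quantity in this abstract concrete, the following is a minimal sketch of the tilted empirical risk in the form usually attributed to Li et al. (2021): (1/γ) log((1/n) Σ exp(γ ℓ_i)). The abstract does not state the formula, so treat this form, the function name `tilted_empirical_risk`, and the sample losses as assumptions for illustration only.

```python
import math


def tilted_empirical_risk(losses, gamma):
    """Return (1/gamma) * log(mean(exp(gamma * loss))) for per-sample losses.

    Assumed form of the tilted empirical risk; gamma == 0 is treated as the
    ordinary empirical risk (the plain average), which is its limit.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if gamma == 0.0:
        return sum(losses) / len(losses)
    scaled = [gamma * loss for loss in losses]
    shift = max(scaled)  # subtract the max before exponentiating to avoid overflow
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return (shift + math.log(mean_exp)) / gamma


# Hypothetical per-sample losses with one outlier.
losses = [0.1, 0.2, 0.3, 5.0]
average = sum(losses) / len(losses)

for gamma in (-2.0, 0.0, 2.0, 50.0):
    print(gamma, tilted_empirical_risk(losses, gamma))
```

Under this form, a positive tilt weights large losses more heavily and moves the risk toward the maximum loss, while a negative tilt moves it toward the minimum and so discounts outliers.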

exp, generalization error, tilted generalization error, (16 more...)

2409.19431

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Virginia (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Machine Learning Oct-16-2024

Local transfer learning Gaussian process modeling, with applications to surrogate modeling of expensive computer simulators

Wang, Xinming, Mak, Simon, Miller, John, Wu, Jianguo

A critical bottleneck for scientific progress is the costly nature of computer simulations for complex systems. Surrogate models provide an appealing solution: such models are trained on simulator evaluations, then used to emulate and quantify uncertainty on the expensive simulator at unexplored inputs. In many applications, one often has available data on related systems. For example, in designing a new jet turbine, there may be existing studies on turbines with similar configurations. A key question is how information from such "source" systems can be transferred for effective surrogate training on the "target" system of interest. We thus propose a new LOcal transfer Learning Gaussian Process (LOL-GP) model, which leverages a carefully-designed Gaussian process to transfer such information for surrogate modeling. The key novelty of the LOL-GP is a latent regularization model, which identifies regions where transfer should be performed and regions where it should be avoided. This "local transfer" property is desirable in scientific systems: at certain parameters, such systems may behave similarly and thus transfer is beneficial; at other parameters, they may behave differently and thus transfer is detrimental. By accounting for local transfer, the LOL-GP can rectify a critical limitation of "negative transfer" in existing transfer learning models, where the transfer of information worsens predictive performance. We derive a Gibbs sampling algorithm for efficient posterior predictive sampling on the LOL-GP, for both the multi-source and multi-fidelity transfer settings. We then show, via a suite of numerical experiments and an application for jet turbine design, the improved surrogate performance of the LOL-GP over existing methods.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

2410.1269

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (0.67)
Aerospace & Defense (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.84)
(3 more...)

Ingole, Balaji Shesharao, Ramineni, Vishnu, Bangad, Nikhil, Ganeeb, Koushik Kumar, Patel, Priyankkumar

Advancements In Heart Disease Prediction: A Machine Learning Approach For Early Detection And Risk Assessment

The primary aim of this paper is to comprehend, assess, and analyze the role, relevance, and efficiency of machine learning models in predicting heart disease risks using clinical data. While the importance of heart disease risk prediction cannot be overstated, the application of machine learning (ML) in identifying and evaluating the impact of various features on the classification of patients with and without heart disease, as well as in generating a reliable clinical dataset, is equally significant. This study relies primarily on cross-sectional clinical data. The ML approach is designed to enhance the consideration of various clinical features in the heart disease prognosis process. Some features emerge as strong predictors, adding significant value. The paper evaluates seven ML classifiers: Logistic Regression, Random Forest, Decision Tree, Naive Bayes, k-Nearest Neighbors, Neural Networks, and Support Vector Machine (SVM). The performance of each model is assessed based on accuracy metrics. Notably, the Support Vector Machine (SVM) demonstrates the highest accuracy at 91.51%, confirming its superiority among the evaluated models in terms of predictive capability. The overall findings of this research highlight the advantages of advanced computational methodologies in the evaluation, prediction, improvement, and management of cardiovascular risks. In other words, the strong performance of the SVM model illustrates its applicability and value in clinical settings, paving the way for further advancements in personalized medicine and healthcare.

artificial intelligence, heart disease prediction, machine learning, (10 more...)

2410.14738

Country:

North America > United States > Georgia > Columbia County > Evans (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre:

Research Report > Experimental Study (0.91)
Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Ueda, Kenji, Nishiura, Masaki

Nonlinear Bayesian tomography of ion temperature and velocity for Doppler coherence imaging spectroscopy in RT-1

We present a novel Bayesian tomography approach for Coherence Imaging Spectroscopy (CIS) that simultaneously reconstructs ion temperature and velocity distributions in plasmas. Utilizing nonlinear Gaussian Process Tomography (GPT) with the Laplace approximation, we model prior distributions of log-emissivity, temperature, and velocity as Gaussian processes. This framework rigorously incorporates nonlinear effects and temperature dependencies often neglected in conventional CIS tomography, enabling robust reconstruction even in the region of high temperature and velocity. By applying a log-Gaussian process, we also address issues like velocity divergence in low-emissivity regions. Validated with phantom simulations and experimental data from the RT-1 device, our method reveals detailed spatial structures of ion temperature and toroidal ion flow characteristic of magnetospheric plasma. This work significantly broadens the scope of CIS tomography, offering a robust tool for plasma diagnostics and facilitating integration with complementary measurement techniques.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

2410.12424

Country:

Europe > Netherlands > Zeeland (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Modeling & Simulation (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Advancing Fairness in Natural Language Processing: From Traditional Methods to Explainability

Jourdan, Fanny

The burgeoning field of Natural Language Processing (NLP) stands at a critical juncture where the integration of fairness within its frameworks has become an imperative. This PhD thesis addresses the need for equity and transparency in NLP systems, recognizing that fairness in NLP is not merely a technical challenge but a moral and ethical necessity, requiring a rigorous examination of how these technologies interact with and impact diverse human populations. Through this lens, this thesis undertakes a thorough investigation into the development of equitable NLP methodologies and the evaluation of biases that prevail in current systems. First, it introduces an innovative algorithm to mitigate biases in multi-class classifiers, tailored for high-risk NLP applications, surpassing traditional methods in both bias mitigation and prediction accuracy. Then, an analysis of the Bios dataset reveals the impact of dataset size on discriminatory biases and the limitations of standard fairness metrics. This awareness has led to explorations in the field of explainable AI, aiming for a more complete understanding of biases where traditional metrics are limited. Consequently, the thesis presents COCKATIEL, a model-agnostic explainability method that identifies and ranks concepts in Transformer models, outperforming previous approaches in sentiment analysis tasks. Finally, the thesis contributes to bridging the gap between fairness and explainability by introducing TaCo, a novel method to neutralize bias in Transformer model embeddings. In conclusion, this thesis constitutes a significant interdisciplinary endeavor that intertwines explicability and fairness to challenge and reshape current NLP paradigms. The methodologies and critiques presented contribute to the ongoing discourse on fairness in machine learning, offering actionable solutions for more equitable and responsible AI systems.

machine learning, natural language, singular value decomposition, (22 more...)

2410.12511

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(9 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Double-Bayesian Learning

Jaeger, Stefan

Contemporary machine learning methods will try to approach the Bayes error, as it is the lowest possible error any model can achieve. This paper postulates that any decision is composed of not one but two Bayesian decisions and that decision-making is, therefore, a double-Bayesian process. The paper shows how this duality implies intrinsic uncertainty in decisions and how it incorporates explainability. The proposed approach understands that Bayesian learning is tantamount to finding a base for a logarithmic function measuring uncertainty, with solutions being fixed points. Furthermore, following this approach, the golden ratio describes possible solutions satisfying Bayes' theorem. The double-Bayesian framework suggests using a learning rate and momentum weight with values similar to those used in the literature to train neural networks with stochastic gradient descent.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2410.12984

Country: North America > United States > Maryland > Montgomery County > Bethesda (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kee, Patrick D., Brown, Max J., Rice, Jonathan C., Howell, Christian A.

The Bayesian Confidence (BACON) Estimator for Deep Neural Networks

This paper introduces the Bayesian Confidence Estimator (BACON) for deep neural networks. Current practice of interpreting Softmax values in the output layer as probabilities of outcomes is prone to extreme predictions of class probability. In this work we extend Waagen's method of representing the terminal layers with a geometric model, where the probability associated with an output vector is estimated with Bayes' Rule using validation data to provide likelihood and normalization values. This estimator provides superior ECE and ACE calibration error compared to Softmax for ResNet-18 at 85% network accuracy, and EfficientNet-B0 at 95% network accuracy, on the CIFAR-10 dataset with an imbalanced test set, except for very high accuracy edge cases. In addition, when using the ACE metric, BACON demonstrated improved calibration error when estimating probabilities for the imbalanced test set when using actual class distribution fractions.
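The ECE metric mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common equal-width binning definition of expected calibration error: the weighted average, over confidence bins, of the gap between accuracy and mean confidence. The function name, the bin count, and the sample data are hypothetical, and this is not the BACON estimator itself. ACE, which typically uses equal-mass bins, is not shown.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical predictions: 90% confidence but only 3 of 4 correct.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

A value near zero means the reported confidences track the observed accuracy. Overconfident outputs, such as the extreme Softmax predictions the abstract describes, push the value up.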

accuracy, artificial intelligence, machine learning, (18 more...)

2410.12604

Country:

North America > United States > Utah > Utah County > Provo (0.04)
North America > Canada > Quebec > Montreal (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Chau, Siu Lun, Schrab, Antonin, Gretton, Arthur, Sejdinovic, Dino, Muandet, Krikamol

Credal Two-Sample Tests of Epistemic Ignorance

arXiv.org Machine Learning Oct-16-2024

Science is inherently inductive and thus involves uncertainties. They are commonly categorized as aleatoric uncertainty (AU), which refers to inherent variability, and epistemic uncertainty (EU), arising from limited information such as finite data or model assumptions (Hora, 1996). These uncertainties often overlap, as scientists may be epistemically uncertain about the aleatoric variation in their inquiry. Distinguishing and acknowledging them is crucial for the safe and trustworthy deployment of intelligent systems (Kendall and Gal, 2017; Hüllermeier and Waegeman, 2021), as they lead to different downstream decisions. For example, experimental design aims to reduce EU (Nguyen et al., 2019; Chau et al., 2021b; Adachi et al., 2024), while risk management uses hedging strategies to address AU (Mashrur et al., 2020). While AU is often modelled using probability distributions, modelling EU, particularly in states of epistemic ignorance, also known as partial ignorance or incomplete knowledge (Dubois et al., 1996), poses greater challenges. For instance, a scientist analysing insulin levels in Germany may have data from multiple hospitals, each representing aleatoric variation as a probability distribution. However, these distributions are merely proxies for the population-level insulin distribution, which is difficult to infer due to data collection limitations. A Bayesian approach could aggregate the data based on a prior if the representativeness of each source is known, but in many cases, scientists operate under partial ignorance, lacking such prior information (Bromberger, 1971). Assigning a uniform prior by following the principle of indifference (Keynes, 1921) and the maximum entropy principle (Jaynes, 1957), or applying the Jeffreys prior by following the principle of transformation groups (Jaynes, 1968), only reflects indifference, not epistemic ignorance.

credal two-sample test, epistemic ignorance, extreme point, (12 more...)

2410.12921

Country:

Oceania > Australia > South Australia > Adelaide (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(6 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

He, Jiajun, Chen, Wenlin, Zhang, Mingtian, Barber, David, Hernández-Lobato, José Miguel

Training Neural Samplers with Reverse Diffusive KL Divergence

arXiv.org Machine Learning Oct-16-2024

Training generative models to sample from unnormalized density functions is an important and challenging task in machine learning. Traditional training methods often rely on the reverse Kullback-Leibler (KL) divergence due to its tractability. However, the mode-seeking behavior of reverse KL hinders effective approximation of multi-modal target distributions. To address this, we propose to minimize the reverse KL along diffusion trajectories of both model and target densities. We refer to this objective as the reverse diffusive KL divergence, which allows the model to capture multiple modes. Leveraging this objective, we train neural samplers that can efficiently generate samples from the target distribution in one step. We demonstrate that our method enhances sampling performance across various Boltzmann distributions, including both synthetic multi-modal densities and n-body particle systems.
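The mode-seeking behavior of plain reverse KL mentioned in this abstract can be seen in a small discrete example. This is a minimal sketch with hypothetical distributions chosen for illustration. It shows ordinary reverse KL, KL(q || p), against forward KL, KL(p || q), and does not implement the reverse diffusive KL objective proposed in the paper.

```python
import math


def kl_divergence(a, b):
    """KL(a || b) for discrete distributions given as equal-length probability lists.

    Assumes b[i] > 0 wherever a[i] > 0; terms with a[i] == 0 contribute nothing.
    """
    if len(a) != len(b):
        raise ValueError("distributions must have the same length")
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0.0)


# Hypothetical bimodal target: almost all mass on the first and last states.
target = [0.49, 0.005, 0.01, 0.005, 0.49]

# Two hypothetical model candidates.
single_mode = [0.96, 0.01, 0.01, 0.01, 0.01]  # collapses onto one mode
broad = [0.2, 0.2, 0.2, 0.2, 0.2]             # spreads mass over both modes

for name, model in (("single_mode", single_mode), ("broad", broad)):
    reverse_kl = kl_divergence(model, target)  # KL(q || p)
    forward_kl = kl_divergence(target, model)  # KL(p || q)
    print(name, "reverse:", reverse_kl, "forward:", forward_kl)
```

With these numbers, reverse KL scores the single-mode candidate as the better fit even though it ignores half of the target mass, while forward KL prefers the broad candidate. This asymmetry is the limitation the abstract sets out to address.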

divergence, sampler, training neural sampler, (12 more...)

2410.12456

Country:

North America (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)