AITopics

2407.10383

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.92)

Industry:

Transportation (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(12 more...)

arXiv.org Artificial Intelligence, Jul-14-2024

Improving Graph Out-of-distribution Generalization on Real-world Data

Xu, Can, Cheng, Yao, Yu, Jianxiang, Wang, Haosen, Lv, Jingsong, Li, Xiang

Existing methods for graph out-of-distribution (OOD) generalization primarily rely on empirical studies on synthetic datasets. Such approaches tend to overemphasize the causal relationships between invariant sub-graphs and labels, thereby neglecting the non-negligible role of environment in real-world scenarios. In contrast to previous studies that impose rigid independence assumptions on environments and invariant sub-graphs, this paper presents the theorems of environment-label dependency and mutable rationale invariance, where the former characterizes the usefulness of environments in determining graph labels while the latter refers to the mutable importance of graph rationales. Based on analytic investigations, a novel variational inference based method named "Probability Dependency on Environments and Rationales for OOD Graphs on Real-world Data" (DEROG) is introduced. To alleviate the adverse effect of unknown prior knowledge on environments and rationales, DEROG utilizes generalized Bayesian inference. Further, DEROG employs an EM-based algorithm for optimization. Finally, extensive experiments on real-world datasets under different distribution shifts are conducted to show the superiority of DEROG. Our code is publicly available at https://anonymous.4open.science/r/DEROG-536B.

derog, graph rationale, rationale, (14 more...)

2407.10204

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

arXiv.org Artificial Intelligence, Jul-14-2024

Maximum Likelihood Estimation of the Direction of Sound In A Reverberant Noisy Environment

Mansour, Mohamed F.

We describe a new method for estimating the direction of sound in a reverberant environment from basic principles of sound propagation. The method utilizes SNR-adaptive features from time-delay and energy of the directional components after acoustic wave decomposition of the observed sound field to estimate the line-of-sight direction under noisy and reverberant conditions. The effectiveness of the approach is established with measured data of different microphone array configurations under various usage scenarios.

directional component, line-of-sight component, microphone array, (12 more...)

2406.17103

Country: North America > United States (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)

Mielniczuk, Jan, Wawrzeńczyk, Adam

Augmented prediction of a true class for Positive Unlabeled data under selection bias

arXiv.org Machine Learning, Jul-14-2024

We introduce a new observational setting for Positive Unlabeled (PU) data where the observations at prediction time are also labeled. This occurs commonly in practice -- we argue that the additional information is important for prediction, and call this task "augmented PU prediction". We allow for labeling to be feature dependent. In such a scenario, the Bayes classifier and its risk are established and compared with the risk of a classifier which, for unlabeled data, is based only on predictors. We introduce several variants of the empirical Bayes rule in such a scenario and investigate their performance. We emphasise the dangers (and ease) of applying a classical classification rule in the augmented PU scenario -- due to no preexisting studies, an unaware researcher is prone to skewing the obtained predictions. We conclude that the variant based on a recently proposed variational autoencoder designed for the PU scenario works on par with or better than the other considered variants and yields an advantage over feature-only methods in terms of accuracy for unlabeled samples.

dataset, prediction, scenario, (16 more...)

2407.10309

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Felicioni, Nicolò, Maystre, Lucas, Ghiassian, Sina, Ciosek, Kamil

On the Importance of Uncertainty in Decision-Making with Large Language Models

arXiv.org Artificial Intelligence, Jul-13-2024

We investigate the role of uncertainty in decision-making problems with natural language as input. For such tasks, using Large Language Models as agents has become the norm. However, none of the recent approaches employ any additional phase for estimating the uncertainty the agent has about the world during the decision-making task. We focus on a fundamental decision-making framework with natural language as input, which is the one of contextual bandits, where the context information consists of text. As a representative of the approaches with no uncertainty estimation, we consider an LLM agent with a greedy policy, which picks the action corresponding to the largest predicted reward. We compare this baseline to LLM agents that make active use of uncertainty estimation by integrating the uncertainty in a Thompson Sampling policy. We employ different techniques for uncertainty estimation, such as Laplace Approximation, Dropout, and Epinets. We empirically show on real-world data that the greedy policy performs worse than the Thompson Sampling policies. These findings suggest that, while overlooked in the LLM literature, uncertainty improves performance on bandit tasks with LLM agents.
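
The contrast between a greedy policy and a Thompson Sampling policy described in the abstract above can be illustrated with a small sketch. This is not the paper's setup: it assumes a plain (non-contextual) Bernoulli bandit with a Beta(1, 1) prior per arm in place of an LLM reward model with text contexts, and the names `greedy_action`, `thompson_action`, and `run_bandit` are hypothetical.

```python
import random


def greedy_action(successes, failures):
    """Pick the arm with the highest posterior-mean reward, ignoring uncertainty."""
    # Posterior mean of a Beta(s + 1, f + 1) distribution (assumed Beta(1, 1) prior).
    means = [(s + 1) / (s + f + 2) for s, f in zip(successes, failures)]
    return max(range(len(means)), key=means.__getitem__)


def thompson_action(successes, failures, rng):
    """Draw one reward estimate per arm from its Beta posterior and pick the best draw."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)


def run_bandit(policy, true_probs, steps, seed=0):
    """Simulate a Bernoulli bandit and return the total reward collected."""
    rng = random.Random(seed)
    successes = [0] * len(true_probs)
    failures = [0] * len(true_probs)
    total = 0
    for _ in range(steps):
        if policy == "greedy":
            arm = greedy_action(successes, failures)
        else:
            arm = thompson_action(successes, failures, rng)
        # Observe a 0/1 reward from the chosen arm and update its counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total += reward
    return total
```

Calling `run_bandit("greedy", [0.2, 0.8], 1000)` and `run_bandit("thompson", [0.2, 0.8], 1000)` gives a rough feel for how the two policies explore; the only difference between them is whether the action is chosen from the posterior mean or from a posterior sample.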

epistemic uncertainty, machine learning research, posterior distribution, (15 more...)

2404.02649

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

arXiv.org Artificial Intelligence, Jul-13-2024

Predictive Dynamic Fusion

Cao, Bing, Xia, Yinan, Ding, Yi, Zhang, Changqing, Hu, Qinghua

Multimodal fusion is crucial in joint decision-making systems for rendering holistic judgments. Since multimodal data changes in open environments, dynamic fusion has emerged and achieved remarkable progress in numerous applications. However, most existing dynamic multimodal fusion methods lack theoretical guarantees and easily fall into suboptimal problems, yielding unreliability and instability. To address this issue, we propose a Predictive Dynamic Fusion (PDF) framework for multimodal learning. We proceed to reveal the multimodal fusion from a generalization perspective and theoretically derive the predictable Collaborative Belief (Co-Belief) with Mono- and Holo-Confidence, which provably reduces the upper bound of generalization error. Accordingly, we further propose a relative calibration strategy to calibrate the predicted Co-Belief for potential uncertainty. Extensive experiments on multiple benchmarks confirm our superiority. Our code is available at https://github.com/Yinan-Xia/PDF.

fusion, generalization error, modality, (16 more...)

2406.04802

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Tianjin Province > Tianjin (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Babakov, Nikolay, Reiter, Ehud, Bugarin, Alberto

Scalability of Bayesian Network Structure Elicitation with Large Language Models: a Novel Methodology and Comparative Analysis

In this work, we propose a novel method for Bayesian Networks (BNs) structure elicitation that is based on the initialization of several LLMs with different experiences, independently querying them to create a structure of the BN, and further obtaining the final structure by majority voting. We compare the method with one alternative method on various widely and not widely known BNs of different sizes and study the scalability of both methods on them. We also propose an approach to check the contamination of BNs in LLM, which shows that some widely known BNs are inapplicable for testing the LLM usage for BNs structure elicitation. We also show that some BNs may be inapplicable for such experiments because their node names are indistinguishable. The experiments on the other BNs show that our method performs better than the existing method with one of the three studied LLMs; however, the performance of both methods significantly decreases with the increase in BN size.
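
The final aggregation step mentioned in the abstract above, majority voting over independently elicited structures, can be sketched as follows. The abstract does not specify the voting rule, so this sketch assumes each queried LLM returns a set of directed edges and that an edge is kept when strictly more than half of the proposals contain it; the function name `majority_vote_edges` is hypothetical.

```python
from collections import Counter


def majority_vote_edges(proposals):
    """Keep directed edges that appear in a strict majority of the proposed structures.

    `proposals` is a list with one collection of (parent, child) tuples per
    queried model. The strict-majority threshold is an assumption of this sketch.
    """
    # Count each edge at most once per proposal.
    counts = Counter(edge for edges in proposals for edge in set(edges))
    threshold = len(proposals) / 2
    return {edge for edge, count in counts.items() if count > threshold}


# Three hypothetical elicited structures over nodes A, B, C.
proposals = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "B")},
]
consensus = majority_vote_edges(proposals)  # only ("A", "B") has 3 of 3 votes
```

Note that edge-wise voting does not by itself guarantee that the result is acyclic, so a practical pipeline would need a separate check before treating the output as a Bayesian network.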

large language model, machine learning, node, (18 more...)

2407.09311

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Construction & Engineering (0.67)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Flow-Based Generative Emulation of Grids of Stellar Evolutionary Models

Hon, Marc, Li, Yaguang, Ong, Joel

We present a flow-based generative approach to emulate grids of stellar evolutionary models. By interpreting the input parameters and output properties of these models as multi-dimensional probability distributions, we train conditional normalizing flows to learn and predict the complex relationships between grid inputs and outputs in the form of conditional joint distributions. Leveraging the expressive power and versatility of these flows, we showcase their ability to emulate a variety of evolutionary tracks and isochrones across a continuous range of input parameters. In addition, we describe a simple Bayesian approach for estimating stellar parameters using these flows and demonstrate its application to asteroseismic datasets of red giants observed by the Kepler mission. By applying this approach to red giants in open clusters NGC 6791 and NGC 6819, we illustrate how large age uncertainties can arise when fitting only to global asteroseismic and spectroscopic parameters without prior information on initial helium abundances and mixing length parameter values. We also conduct inference using the flow at a large scale by determining revised estimates of masses and radii for 15,388 field red giants. These estimates show improved agreement with results from existing grid-based modelling, reveal distinct population-level features in the red clump, and suggest that the masses of Kepler red giants previously determined using the corrected asteroseismic scaling relations have been overestimated by 5-10%.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2407.09427

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Real-time gravitational-wave inference for binary neutron stars using machine learning

Dax, Maximilian, Green, Stephen R., Gair, Jonathan, Gupte, Nihar, Pürrer, Michael, Raymond, Vivien, Wildberger, Jonas, Macke, Jakob H., Buonanno, Alessandra, Schölkopf, Bernhard

Mergers of binary neutron stars (BNSs) emit signals in both the gravitational-wave (GW) and electromagnetic (EM) spectra. Famously, the 2017 multi-messenger observation of GW170817 led to scientific discoveries across cosmology, nuclear physics, and gravity. Central to these results were the sky localization and distance obtained from GW data, which, in the case of GW170817, helped to identify the associated EM transient, AT 2017gfo, 11 hours after the GW signal. Fast analysis of GW data is critical for directing time-sensitive EM observations; however, due to challenges arising from the length and complexity of signals, it is often necessary to make approximations that sacrifice accuracy. Here, we develop a machine learning approach that performs complete BNS inference in just one second without making any such approximations. This is enabled by a new method for explicit integration of physical domain knowledge into neural networks. Our approach enhances multi-messenger observations by providing (i) accurate localization even before the merger; (ii) improved localization precision by ~30% compared to approximate low-latency methods; and (iii) detailed information on luminosity distance, inclination, and masses, which can be used to prioritize expensive telescope time. Additionally, the flexibility and reduced cost of our method open new opportunities for equation-of-state and waveform systematics studies. Finally, we demonstrate that our method scales to extremely long signals, up to an hour in length, thus serving as a blueprint for data analysis for next-generation ground- and space-based detectors.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2407.09602

Country:

Europe > Germany (0.69)
North America > United States > Maryland (0.28)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Raichur, Nisha L., Heublein, Lucas, Feigl, Tobias, Rügamer, Alexander, Mutschler, Christopher, Ott, Felix

Bayesian Learning-driven Prototypical Contrastive Loss for Class-Incremental Learning

The primary objective of methods in continual learning is to learn tasks in a sequential manner over time from a stream of data, while mitigating the detrimental phenomenon of catastrophic forgetting. In this paper, we focus on learning an optimal representation between previous class prototypes and newly encountered ones. We propose a prototypical network with a Bayesian learning-driven contrastive loss (BLCL) tailored specifically for class-incremental learning scenarios. To this end, we introduce a contrastive loss that incorporates new classes into the latent representation by reducing the intra-class distance and increasing the inter-class distance. Our approach dynamically adapts the balance between the cross-entropy and contrastive loss functions with a Bayesian learning technique. Empirical evaluations conducted on both the CIFAR-10 and CIFAR-100 datasets for image classification and on images from a GNSS-based dataset for interference classification validate the efficacy of our method, showcasing its superiority over existing state-of-the-art approaches.

blcl 8, dataset, learning, (12 more...)

2405.11067

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(14 more...)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)