AITopics

2504.11353

Country:

Asia > China > Henan Province > Zhengzhou (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

CLEAR-KGQA: Clarification-Enhanced Ambiguity Resolution for Knowledge Graph Question Answering

Wen, Liqiang, Xiong, Guanming, Mo, Tong, Li, Bing, Li, Weiping, Zhao, Wen

This study addresses the challenge of ambiguity in knowledge graph question answering (KGQA). While recent KGQA systems have made significant progress, particularly with the integration of large language models (LLMs), they typically assume user queries are unambiguous, which is an assumption that rarely holds in real-world applications. To address these limitations, we propose a novel framework that dynamically handles both entity ambiguity (e.g., distinguishing between entities with similar names) and intent ambiguity (e.g., clarifying different interpretations of user queries) through interactive clarification. Our approach employs a Bayesian inference mechanism to quantify query ambiguity and guide LLMs in determining when and how to request clarification from users within a multi-turn dialogue framework. We further develop a two-agent interaction framework where an LLM-based user simulator enables iterative refinement of logical forms through simulated user feedback. Experimental results on the WebQSP and CWQ dataset demonstrate that our method significantly improves performance by effectively resolving semantic ambiguities. Additionally, we contribute a refined dataset of disambiguated queries, derived from interaction histories, to facilitate future research in this direction.

ambiguity, large language model, machine learning, (19 more...)

2504.09665

Country:

North America > United States (0.46)
Asia > China (0.30)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Shaposhnyk, Olha, Zahorska, Daria, Yanushkevich, Svetlana

Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling?

Objective: This study investigates the potential of Large Language Models (LLMs) as an alternative to human expert elicitation for extracting structured causal knowledge and facilitating causal modeling in biometric and healthcare applications. Material and Methods: LLM-generated causal structures, specifically Bayesian networks (BNs), were benchmarked against traditional statistical methods (e.g., Bayesian Information Criterion) using healthcare datasets. Validation techniques included structural equation modeling (SEM) to verifying relationships, and measures such as entropy, predictive accuracy, and robustness to compare network structures. Results and Discussion: LLM-generated BNs demonstrated lower entropy than expert-elicited and statistically generated BNs, suggesting higher confidence and precision in predictions. However, limitations such as contextual constraints, hallucinated dependencies, and potential biases inherited from training data require further investigation. Conclusion: LLMs represent a novel frontier in expert elicitation for probabilistic causal modeling, promising to improve transparency and reduce uncertainty in the decision-making using such models.

large language model, machine learning, natural language, (19 more...)

2504.10397

Country: North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Matsui, Yuta, Yamaki, Ryosuke, Ueda, Ryo, Shinagawa, Seitaro, Taniguchi, Tadahiro

Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference

We propose the Metropolis-Hastings Captioning Game (MHCG), a method to fuse knowledge of multiple vision-language models (VLMs) by learning from each other. Although existing methods that combine multiple models suffer from inference costs and architectural constraints, MHCG avoids these problems by performing decentralized Bayesian inference through a process resembling a language game. The knowledge fusion process establishes communication between two VLM agents alternately captioning images and learning from each other. We conduct two image-captioning experiments with two VLMs, each pre-trained on a different dataset. The first experiment demonstrates that MHCG achieves consistent improvement in reference-free evaluation metrics. The second experiment investigates how MHCG contributes to sharing VLMs' category-level vocabulary by observing the occurrence of the vocabulary in the generated captions.

artificial intelligence, machine learning, natural language, (16 more...)

2504.0962

Country:

Asia > Japan (0.28)
North America > United States (0.28)
Europe > Switzerland (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Constantinou, Anthony C., Higgins, Nicholas, Kitson, Neville K.

Decoding the mechanisms of the Hattrick football manager game using Bayesian network structure learning for optimal decision-making

Hattrick is a free web-based probabilistic football manager game with over 200,000 users competing for titles at national and international levels. Launched in Sweden in 1997 as part of an MSc project, the game's slow-paced design has fostered a loyal community, with many users remaining active for decades. Hattrick's game-engine mechanics are partially hidden, and users have attempted to decode them with incremental success over the years. Rule-based, statistical and machine learning models have been developed to aid this effort and are widely used by the community. However, these models or tools have not been formally described or evaluated in the scientific literature. This study is the first to explore Hattrick using structure learning techniques and Bayesian networks, integrating both data and domain knowledge to develop models capable of explaining and simulating the game engine. We present a comprehensive analysis assessing the effectiveness of structure learning algorithms in relation to knowledge-based structures, and show that while structure learning may achieve a higher overall network fit, it does not result in more accurate predictions for selected variables of interest, when compared to knowledge-based networks that produce a lower overall network fit. Additionally, we introduce and publicly share a fully specified Bayesian network model that matches the performance of top models used by the Hattrick community. We further demonstrate how analysis extends beyond prediction by providing a visual representation of conditional dependencies, and using the best performing Bayesian network model for in-game decision-making. To support future research, we make all data, graphical structures, and models publicly available online.

artificial intelligence, bayesian inference, machine learning, (20 more...)

2504.09499

Country:

Europe > United Kingdom > England (0.45)
North America > United States (0.45)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Padmanabhan, Sriram, Misra, Kanishka, Mahowald, Kyle, Choi, Eunsol

On Language Models' Sensitivity to Suspicious Coincidences

Humans are sensitive to suspicious coincidences when generalizing inductively over data, as they make assumptions as to how the data was sampled. This results in smaller, more specific hypotheses being favored over more general ones. For instance, when provided the set {Austin, Dallas, Houston}, one is more likely to think that this is sampled from "Texas Cities" over "US Cities" even though both are compatible. Suspicious coincidence is strongly connected to pragmatic reasoning, and can serve as a testbed to analyze systems on their sensitivity towards the communicative goals of the task (i.e., figuring out the true category underlying the data). In this paper, we analyze whether suspicious coincidence effects are reflected in language models' (LMs) behavior. We do so in the context of two domains: 1) the number game, where humans made judgments of whether a number (e.g., 4) fits a list of given numbers (e.g., 16, 32, 2); and 2) by extending the number game setup to prominent cities. For both domains, the data is compatible with multiple hypotheses and we study which hypothesis is most consistent with the models' behavior. On analyzing five models, we do not find strong evidence for suspicious coincidences in LMs' zero-shot behavior. However, when provided access to the hypotheses space via chain-of-thought or explicit prompting, LMs start to show an effect resembling suspicious coincidences, sometimes even showing effects consistent with humans. Our study suggests that inductive reasoning behavior in LMs can be enhanced with explicit access to the hypothesis landscape.

large language model, machine learning, natural language, (20 more...)

2504.09387

Country:

Asia (1.00)
Europe (0.94)
Africa (0.94)
North America > United States > Texas (0.24)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

arXiv.org Machine LearningApr-14-2025

Kullback-Leibler excess risk bounds for exponential weighted aggregation in Generalized linear models

Mai, The Tien

Aggregation methods have emerged as a powerful and flexible framework in statistical learning, providing unified solutions across diverse problems such as regression, classification, and density estimation. In the context of generalized linear models (GLMs), where responses follow exponential family distributions, aggregation offers an attractive alternative to classical parametric modeling. This paper investigates the problem of sparse aggregation in GLMs, aiming to approximate the true parameter vector by a sparse linear combination of predictors. We prove that an exponential weighted aggregation scheme yields a sharp oracle inequality for the Kullback-Leibler risk with leading constant equal to one, while also attaining the minimax-optimal rate of aggregation. These results are further enhanced by establishing high-probability bounds on the excess risk.

aggregation, artificial intelligence, machine learning, (16 more...)

2504.10171

Country:

Europe > France (0.04)
North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Murfet, Daniel, Troiani, Will

Programs as Singularities

arXiv.org Artificial IntelligenceApr-14-2025

We develop a correspondence between the structure of Turing machines and the structure of singularities of real analytic functions, based on connecting the Ehrhard-Regnier derivative from linear logic with the role of geometry in Watanabe's singular learning theory. The correspondence works by embedding ordinary (discrete) Turing machine codes into a family of noisy codes which form a smooth parameter space. On this parameter space we consider a potential function which has Turing machines as critical points. By relating the Taylor series expansion of this potential at such a critical point to combinatorics of error syndromes, we relate the local geometry to internal structure of the Turing machine. The potential in question is the negative log-likelihood for a statistical model, so that the structure of the Turing machine and its associated singularity is further related to Bayesian inference. Two algorithms that produce the same predictive function can nonetheless correspond to singularities with different geometries, which implies that the Bayesian posterior can discriminate between distinct algorithmic implementations, contrary to a purely functional view of inference. In the context of singular learning theory our results point to a more nuanced understanding of Occam's razor and the meaning of simplicity in inductive inference.

artificial intelligence, description tape, machine learning, (18 more...)

2504.08075

Country: North America > United States (0.27)

Genre:

Workflow (0.67)
Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

arXiv.org Machine LearningApr-13-2025

Optimal sparse phase retrieval via a quasi-Bayesian approach

Mai, The Tien

This paper addresses the problem of sparse phase retrieval, a fundamental inverse problem in applied mathematics, physics, and engineering, where a signal need to be reconstructed using only the magnitude of its transformation while phase information remains inaccessible. Leveraging the inherent sparsity of many real-world signals, we introduce a novel sparse quasi-Bayesian approach and provide the first theoretical guarantees for such an approach. Specifically, we employ a scaled Student distribution as a continuous shrinkage prior to enforce sparsity and analyze the method using the PAC-Bayesian inequality framework. Our results establish that the proposed Bayesian estimator achieves minimax-optimal convergence rates under sub-exponential noise, matching those of state-of-the-art frequentist methods. To ensure computational feasibility, we develop an efficient Langevin Monte Carlo sampling algorithm. Through numerical experiments, we demonstrate that our method performs comparably to existing frequentist techniques, highlighting its potential as a principled alternative for sparse phase retrieval in noisy settings.

bayesian inference, machine learning, optimal sparse phase retrieval, (2 more...)

2504.09509

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)

Fan, Yefeng, White, Simon

Neural Posterior Estimation on Exponential Random Graph Models: Evaluating Bias and Implementation Challenges

arXiv.org Machine LearningApr-12-2025

Exponential random graph models (ERGMs) are flexible probabilistic frameworks to model statistical networks through a variety of network summary statistics. Conventional Bayesian estimation for ERGMs involves iteratively exchanging with an auxiliary variable due to the intractability of ERGMs, however, this approach lacks scalability to large-scale implementations. Neural posterior estimation (NPE) is a recent advancement in simulation-based inference, using a neural network based density estimator to infer the posterior for models with doubly intractable likelihoods for which simulations can be generated. While NPE has been successfully adopted in various fields such as cosmology, little research has investigated its use for ERGMs. Performing NPE on ERGM not only provides a differing angle of resolving estimation for the intractable ERGM likelihoods but also allows more efficient and scalable inference using the amortisation properties of NPE, and therefore, we investigate how NPE can be effectively implemented in ERGMs. In this study, we present the first systematic implementation of NPE for ERGMs, rigorously evaluating potential biases, interpreting the biases magnitudes, and comparing NPE fittings against conventional Bayesian ERGM fittings. More importantly, our work highlights ERGM-specific areas that may impose particular challenges for the adoption of NPE.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2504.09349

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)