AITopics

2504.06372

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Huang, Jundi, Zhan, Dawei

An Adaptive Dropout Approach for High-Dimensional Bayesian Optimization

arXiv.org Machine Learning, Apr-15-2025

Bayesian optimization (BO) is a widely used algorithm for solving expensive black-box optimization problems. However, its performance degrades significantly on high-dimensional problems because the acquisition function itself becomes high-dimensional and hard to optimize. In the proposed algorithm, we adaptively drop out variables of the acquisition function over the iterations. By gradually reducing the dimension of the acquisition function, the proposed approach makes the acquisition function progressively easier to optimize. Numerical experiments demonstrate that AdaDropout effectively tackles high-dimensional challenges and improves solution quality where standard Bayesian optimization methods often struggle. Moreover, it achieves superior results when compared with state-of-the-art high-dimensional Bayesian optimization approaches. This work provides a simple yet efficient solution for high-dimensional expensive optimization.

artificial intelligence, machine learning, optimization, (19 more...)

2504.11353

Country:

Asia > China > Henan Province > Zhengzhou (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

CLEAR-KGQA: Clarification-Enhanced Ambiguity Resolution for Knowledge Graph Question Answering

Wen, Liqiang, Xiong, Guanming, Mo, Tong, Li, Bing, Li, Weiping, Zhao, Wen

This study addresses the challenge of ambiguity in knowledge graph question answering (KGQA). While recent KGQA systems have made significant progress, particularly with the integration of large language models (LLMs), they typically assume user queries are unambiguous, an assumption that rarely holds in real-world applications. To address this limitation, we propose a novel framework that dynamically handles both entity ambiguity (e.g., distinguishing between entities with similar names) and intent ambiguity (e.g., clarifying different interpretations of user queries) through interactive clarification. Our approach employs a Bayesian inference mechanism to quantify query ambiguity and guide LLMs in determining when and how to request clarification from users within a multi-turn dialogue framework. We further develop a two-agent interaction framework where an LLM-based user simulator enables iterative refinement of logical forms through simulated user feedback. Experimental results on the WebQSP and CWQ datasets demonstrate that our method significantly improves performance by effectively resolving semantic ambiguities. Additionally, we contribute a refined dataset of disambiguated queries, derived from interaction histories, to facilitate future research in this direction.

ambiguity, large language model, machine learning, (19 more...)

2504.09665

Country:

North America > United States (0.46)
Asia > China (0.30)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Shaposhnyk, Olha, Zahorska, Daria, Yanushkevich, Svetlana

Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling?

Objective: This study investigates the potential of Large Language Models (LLMs) as an alternative to human expert elicitation for extracting structured causal knowledge and facilitating causal modeling in biometric and healthcare applications. Material and Methods: LLM-generated causal structures, specifically Bayesian networks (BNs), were benchmarked against traditional statistical methods (e.g., Bayesian Information Criterion) using healthcare datasets. Validation techniques included structural equation modeling (SEM) to verify relationships, and measures such as entropy, predictive accuracy, and robustness to compare network structures. Results and Discussion: LLM-generated BNs demonstrated lower entropy than expert-elicited and statistically generated BNs, suggesting higher confidence and precision in predictions. However, limitations such as contextual constraints, hallucinated dependencies, and potential biases inherited from training data require further investigation. Conclusion: LLMs represent a novel frontier in expert elicitation for probabilistic causal modeling, promising to improve transparency and reduce uncertainty in decision-making with such models.

large language model, machine learning, natural language, (19 more...)

2504.10397

Country: North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Constantinou, Anthony C., Higgins, Nicholas, Kitson, Neville K.

Decoding the mechanisms of the Hattrick football manager game using Bayesian network structure learning for optimal decision-making

Hattrick is a free web-based probabilistic football manager game with over 200,000 users competing for titles at national and international levels. Launched in Sweden in 1997 as part of an MSc project, the game's slow-paced design has fostered a loyal community, with many users remaining active for decades. Hattrick's game-engine mechanics are partially hidden, and users have attempted to decode them with incremental success over the years. Rule-based, statistical and machine learning models have been developed to aid this effort and are widely used by the community. However, these models or tools have not been formally described or evaluated in the scientific literature. This study is the first to explore Hattrick using structure learning techniques and Bayesian networks, integrating both data and domain knowledge to develop models capable of explaining and simulating the game engine. We present a comprehensive analysis assessing the effectiveness of structure learning algorithms in relation to knowledge-based structures, and show that while structure learning may achieve a higher overall network fit, it does not result in more accurate predictions for selected variables of interest, when compared to knowledge-based networks that produce a lower overall network fit. Additionally, we introduce and publicly share a fully specified Bayesian network model that matches the performance of top models used by the Hattrick community. We further demonstrate how analysis extends beyond prediction by providing a visual representation of conditional dependencies, and using the best performing Bayesian network model for in-game decision-making. To support future research, we make all data, graphical structures, and models publicly available online.

artificial intelligence, bayesian inference, machine learning, (20 more...)

2504.09499

Country:

Europe > United Kingdom > England (0.45)
North America > United States (0.45)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Padmanabhan, Sriram, Misra, Kanishka, Mahowald, Kyle, Choi, Eunsol

On Language Models' Sensitivity to Suspicious Coincidences

Humans are sensitive to suspicious coincidences when generalizing inductively over data, as they make assumptions as to how the data was sampled. This results in smaller, more specific hypotheses being favored over more general ones. For instance, when provided the set {Austin, Dallas, Houston}, one is more likely to think that this is sampled from "Texas Cities" than from "US Cities" even though both are compatible. Suspicious coincidence is strongly connected to pragmatic reasoning, and can serve as a testbed to analyze systems on their sensitivity towards the communicative goals of the task (i.e., figuring out the true category underlying the data). In this paper, we analyze whether suspicious coincidence effects are reflected in language models' (LMs) behavior. We do so in the context of two domains: 1) the number game, where humans made judgments of whether a number (e.g., 4) fits a list of given numbers (e.g., 16, 32, 2); and 2) an extension of the number game setup to prominent cities. For both domains, the data is compatible with multiple hypotheses and we study which hypothesis is most consistent with the models' behavior. On analyzing five models, we do not find strong evidence for suspicious coincidences in LMs' zero-shot behavior. However, when provided access to the hypothesis space via chain-of-thought or explicit prompting, LMs start to show an effect resembling suspicious coincidences, sometimes even showing effects consistent with humans. Our study suggests that inductive reasoning behavior in LMs can be enhanced with explicit access to the hypothesis landscape.
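
The preference for smaller hypotheses described in this abstract is commonly formalized as the "size principle": if examples are assumed to be drawn uniformly from the true category, a hypothesis h that contains all n examples has likelihood (1/|h|)^n. The sketch below illustrates that idea only; it is not taken from the paper. The function name, the uniform prior, and the category contents (4 Texas cities inside 8 US cities) are made-up assumptions chosen to keep the arithmetic easy to follow.

```python
from fractions import Fraction


def size_principle_posterior(data, hypotheses, prior=None):
    """Posterior over hypotheses, assuming examples are sampled
    uniformly from the true category (the size principle)."""
    names = list(hypotheses)
    if prior is None:
        # Assumption: uniform prior over the candidate hypotheses.
        prior = {h: Fraction(1, len(names)) for h in names}
    scores = {}
    for h in names:
        members = hypotheses[h]
        if all(x in members for x in data):
            # Each example has probability 1/|h| under hypothesis h.
            scores[h] = prior[h] * Fraction(1, len(members)) ** len(data)
        else:
            # A hypothesis that excludes any example is ruled out.
            scores[h] = Fraction(0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the data")
    return {h: s / total for h, s in scores.items()}


# Toy category contents, invented for illustration only.
texas = {"Austin", "Dallas", "Houston", "San Antonio"}
us = texas | {"Boston", "Chicago", "Denver", "Seattle"}
hypotheses = {"Texas Cities": texas, "US Cities": us}

posterior = size_principle_posterior(["Austin", "Dallas", "Houston"], hypotheses)
for name, p in posterior.items():
    print(name, p)
```

With these toy sets, both hypotheses are compatible with the three examples, yet the smaller one receives 8/9 of the posterior mass, and the gap widens with every additional consistent example.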

large language model, machine learning, natural language, (20 more...)

2504.09387

Country:

Asia (1.00)
Europe (0.94)
Africa (0.94)
North America > United States > Texas (0.24)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

Shabar, Nour M., Saber, Ahmad Mohammad, Kundur, Deepa

Machine Learning-Based Cyberattack Detection and Identification for Automatic Generation Control Systems Considering Nonlinearities

Automatic generation control (AGC) systems play a crucial role in maintaining system frequency across power grids. However, AGC systems' reliance on communicated measurements exposes them to false data injection attacks (FDIAs), which can compromise the overall system stability. This paper proposes a machine learning (ML)-based detection framework that identifies FDIAs and determines the compromised measurements. The approach utilizes an ML model trained offline to accurately detect attacks and classify the manipulated signals based on a comprehensive set of statistical and time-series features extracted from AGC measurements before and after disturbances. For the proposed approach, we compare the performance of several powerful ML algorithms. Our results demonstrate the efficacy of the proposed method in detecting FDIAs while maintaining a low false alarm rate, with an F1-score of up to 99.98%, outperforming existing approaches.

agc system, artificial intelligence, machine learning, (16 more...)

2504.09363

Country:

Asia > Middle East > UAE (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.69)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Power Industry (1.00)
Government > Military > Cyberwarfare (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Machine Learning, Apr-14-2025

Kullback-Leibler excess risk bounds for exponential weighted aggregation in Generalized linear models

Mai, The Tien

Aggregation methods have emerged as a powerful and flexible framework in statistical learning, providing unified solutions across diverse problems such as regression, classification, and density estimation. In the context of generalized linear models (GLMs), where responses follow exponential family distributions, aggregation offers an attractive alternative to classical parametric modeling. This paper investigates the problem of sparse aggregation in GLMs, aiming to approximate the true parameter vector by a sparse linear combination of predictors. We prove that an exponential weighted aggregation scheme yields a sharp oracle inequality for the Kullback-Leibler risk with leading constant equal to one, while also attaining the minimax-optimal rate of aggregation. These results are further enhanced by establishing high-probability bounds on the excess risk.
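
As a rough illustration of exponential weighting in general, the sketch below assigns each candidate predictor a weight proportional to its prior mass times exp(-loss / temperature) and then forms the weighted combination of predictions. It is a generic textbook-style sketch, not the paper's estimator: the function names, the `temperature` parameter, the uniform default prior, and the use of arbitrary numeric losses (rather than a GLM likelihood-based loss) are all assumptions.

```python
import math


def exponential_weights(losses, temperature=1.0, prior=None):
    """Weights proportional to prior[j] * exp(-losses[j] / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    m = len(losses)
    if prior is None:
        # Assumption: uniform prior over the candidate predictors.
        prior = [1.0 / m] * m
    # Subtract the smallest loss before exponentiating for numerical
    # stability; the shift cancels in the normalization.
    shift = min(losses)
    unnorm = [p * math.exp(-(l - shift) / temperature)
              for p, l in zip(prior, losses)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def aggregate(predictions, weights):
    """Convex combination of the candidates' predictions, point by point."""
    n = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions))
            for i in range(n)]


# Hypothetical losses for two candidate predictors.
weights = exponential_weights([0.0, math.log(2)], temperature=1.0)
combined = aggregate([[1.0, 2.0], [4.0, 5.0]], weights)
print(weights, combined)
```

A smaller temperature concentrates the weights on the lowest-loss candidate, while a larger one moves them toward the prior.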

aggregation, artificial intelligence, machine learning, (16 more...)

2504.10171

Country:

Europe > France (0.04)
North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

arXiv.org Artificial Intelligence, Apr-14-2025

Towards Responsible and Trustworthy Educational Data Mining: Comparing Symbolic, Sub-Symbolic, and Neural-Symbolic AI Methods

Hooshyar, Danial, Kikas, Eve, Yang, Yeongwook, Šír, Gustav, Hämäläinen, Raija, Kärkkäinen, Tommi, Azevedo, Roger

Given the demand for responsible and trustworthy AI for education, this study evaluates symbolic, sub-symbolic, and neural-symbolic AI (NSAI) in terms of generalizability and interpretability. Our extensive experiments on balanced and imbalanced self-regulated learning datasets of Estonian primary school students predicting 7th-grade mathematics national test performance showed that symbolic and sub-symbolic methods performed well on balanced data but struggled to identify low performers in imbalanced datasets. Interestingly, symbolic and sub-symbolic methods emphasized different factors in their decision-making: symbolic approaches primarily relied on cognitive and motivational factors, while sub-symbolic methods focused more on cognitive aspects, learnt knowledge, and the demographic variable of gender -- yet both largely overlooked metacognitive factors. The NSAI method, on the other hand, showed advantages by: (i) being more generalizable across both classes -- even in imbalanced datasets -- as its symbolic knowledge component compensated for the underrepresented class; and (ii) relying on a more integrated set of factors in its decision-making, including motivation, (meta)cognition, and learnt knowledge, thus offering a comprehensive and theoretically grounded interpretability framework. These contrasting findings highlight the need for a holistic comparison of AI methods before drawing conclusions based solely on predictive performance. They also underscore the potential of hybrid, human-centred NSAI methods to address the limitations of other AI families and move us closer to responsible AI for education. Specifically, by enabling stakeholders to contribute to AI design, NSAI aligns learned patterns with theoretical constructs, incorporates factors like motivation and metacognition, and strengthens the trustworthiness and responsibility of educational data mining.

data mining, knowledge management, machine learning, (22 more...)

2504.00615

Country:

Europe > Estonia (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Law (1.00)
Education > Assessment & Standards (1.00)
Education > Curriculum > Subject-Specific Education (0.68)
(2 more...)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
(7 more...)

arXiv.org Machine Learning, Apr-13-2025

Optimal sparse phase retrieval via a quasi-Bayesian approach

Mai, The Tien

This paper addresses the problem of sparse phase retrieval, a fundamental inverse problem in applied mathematics, physics, and engineering, where a signal needs to be reconstructed using only the magnitude of its transformation while phase information remains inaccessible. Leveraging the inherent sparsity of many real-world signals, we introduce a novel sparse quasi-Bayesian approach and provide the first theoretical guarantees for such an approach. Specifically, we employ a scaled Student distribution as a continuous shrinkage prior to enforce sparsity and analyze the method using the PAC-Bayesian inequality framework. Our results establish that the proposed Bayesian estimator achieves minimax-optimal convergence rates under sub-exponential noise, matching those of state-of-the-art frequentist methods. To ensure computational feasibility, we develop an efficient Langevin Monte Carlo sampling algorithm. Through numerical experiments, we demonstrate that our method performs comparably to existing frequentist techniques, highlighting its potential as a principled alternative for sparse phase retrieval in noisy settings.

bayesian inference, machine learning, optimal sparse phase retrieval, (2 more...)

2504.09509

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)