AITopics

Accurate intraday forecasts are essential for power system operations, complementing day-ahead forecasts that gradually lose relevance as new information becomes available. This paper introduces a Bayesian updating mechanism that converts fully probabilistic day-ahead forecasts into intraday forecasts without retraining or re-inference. The approach conditions the Gaussian mixture output of a conditional variational autoencoder-based forecaster on observed measurements, yielding an updated distribution for the remaining horizon that preserves its probabilistic structure. This enables consistent point, quantile, and ensemble forecasts while remaining computationally efficient and suitable for real-time applications. Experiments on household electricity consumption and photovoltaic generation datasets demonstrate that the proposed method improves forecast accuracy by up to 25% across likelihood-, sample-, quantile-, and point-based metrics. The largest gains occur in time steps with strong temporal correlation to observed data, and the use of pattern dictionary-based covariance structures further enhances performance. The results highlight a theoretically grounded framework for intraday forecasting in modern power systems.
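The conditioning step described in this abstract can be illustrated with the textbook conditional-Gaussian identity applied to each mixture component. The sketch below is not the paper's implementation: the function name `condition_gmm`, the argument layout, and the assumption that the observed measurements are the first steps of the horizon are all illustrative choices.

```python
import numpy as np

def condition_gmm(weights, means, covs, y_obs):
    """Condition a Gaussian mixture over a full horizon on its first len(y_obs) steps.

    Assumption (illustrative): observed steps come first, remaining steps after.
    Returns updated weights, means, and covariances for the remaining steps.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    m = y_obs.shape[0]

    log_w, new_means, new_covs = [], [], []
    for pi_k, mu, sigma in zip(weights, means, covs):
        mu_o, mu_r = mu[:m], mu[m:]
        s_oo, s_ro, s_rr = sigma[:m, :m], sigma[m:, :m], sigma[m:, m:]
        resid = y_obs - mu_o
        solved = np.linalg.solve(s_oo, resid)    # S_oo^{-1} (y - mu_o)
        gain = np.linalg.solve(s_oo, s_ro.T).T   # S_ro S_oo^{-1} (S_oo is symmetric)
        new_means.append(mu_r + s_ro @ solved)   # conditional mean
        new_covs.append(s_rr - gain @ s_ro.T)    # conditional covariance (Schur complement)
        # Log-likelihood of the observations under this component's marginal
        _, logdet = np.linalg.slogdet(s_oo)
        log_lik = -0.5 * (resid @ solved + logdet + m * np.log(2.0 * np.pi))
        log_w.append(np.log(pi_k) + log_lik)

    # Normalize the weights in log space for numerical stability
    log_w = np.array(log_w)
    new_w = np.exp(log_w - log_w.max())
    new_w /= new_w.sum()
    return new_w, np.array(new_means), np.array(new_covs)
```

Because the update uses only linear algebra on the existing mixture parameters, it needs no retraining or further forward passes of the forecaster, which is consistent with the abstract's description.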

artificial intelligence, machine learning, real time system

2510.12271

Country: Europe > Netherlands (0.28)

Genre: Research Report (0.64)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Solar (0.88)

Technology:

Information Technology > Architecture > Real Time Systems (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Parsaee, Ali, Jiang, Bei, Friggstad, Zachary, Greiner, Russell

Budget-constrained Active Learning to Effectively De-censor Survival Data

Standard supervised learners attempt to learn a model from a labeled dataset. Given a small set of labeled instances, and a pool of unlabeled instances, a budgeted learner can use its given budget to pay to acquire the labels of some unlabeled instances, which it can then use to produce a model. Here, we explore budgeted learning in the context of survival datasets, which include (right) censored instances, where we know only a lower bound on an instance's time-to-event. Here, that learner can pay to (partially) label a censored instance -- e.g., to acquire the actual time for an instance [perhaps go from (3 yr, censored) to (7.2 yr, uncensored)], or other variants [e.g., learn about one more year, so go from (3 yr, censored) to either (4 yr, censored) or perhaps (3.2 yr, uncensored)]. This serves as a model of real-world data collection, where follow-up with censored patients does not always lead to uncensoring, and how much information is given to the learner model during data collection is a function of the budget and the nature of the data itself. We provide both experimental and theoretical results for how to apply state-of-the-art budgeted learning algorithms to survival data and the respective limitations that exist in doing so. Our approach provides bounds and time complexity asymptotically equivalent to the standard active learning method BatchBALD. Moreover, empirical analysis on several survival tasks shows that our model performs better than other potential approaches on several benchmarks.

artificial intelligence, machine learning, probe depth

2510.12144

Country: North America > Canada > Alberta (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Chen, Yi-Chung, Inouye, David I., Gao, Jing

Your VAR Model is Secretly an Efficient and Explainable Generative Classifier

Generative classifiers, which leverage conditional generative models for classification, have recently demonstrated desirable properties such as robustness to distribution shifts. However, recent progress in this area has been largely driven by diffusion-based models, whose substantial computational cost severely limits scalability. This exclusive focus on diffusion-based methods has also constrained our understanding of generative classifiers. In this work, we propose a novel generative classifier built on recent advances in visual autoregressive (VAR) modeling, which offers a new perspective for studying generative classifiers. Moreover, we show that the VAR-based method exhibits fundamentally different properties from diffusion-based methods. In particular, due to its tractable likelihood, the VAR-based classifier enables visual explainability via token-wise mutual information and demonstrates inherent resistance to catastrophic forgetting in class-incremental learning tasks. Generative models are trained to directly capture the underlying data distribution of a given dataset, which enables a wide range of applications such as image generation (Han et al., 2025), image editing (Mu et al., 2025), and data augmentation (Trabucco et al., 2023). Given this expressive capability, a natural question arises: Can we leverage these powerful generative models for classification? This question has motivated a line of research on the "Generative Classifier."

classifier, machine learning, natural language

2510.1206

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Kang, Sungmin, Bakman, Yavuz Faruk, Yaldiz, Duygu Nur, Buyukates, Baturalp, Avestimehr, Salman

Uncertainty Quantification for Hallucination Detection in Large Language Models: Foundations, Methodology, and Future Directions

The rapid advancement of large language models (LLMs) has transformed the landscape of natural language processing, enabling breakthroughs across a wide range of areas including question answering, machine translation, and text summarization. Yet, their deployment in real-world applications has raised concerns over reliability and trustworthiness, as LLMs remain prone to hallucinations that produce plausible but factually incorrect outputs. Uncertainty quantification (UQ) has emerged as a central research direction to address this issue, offering principled measures for assessing the trustworthiness of model generations. We begin by introducing the foundations of UQ, from its formal definition to the traditional distinction between epistemic and aleatoric uncertainty, and then highlight how these concepts have been adapted to the context of LLMs. Building on this, we examine the role of UQ in hallucination detection, where quantifying uncertainty provides a mechanism for identifying unreliable generations and improving reliability. We systematically categorize a wide spectrum of existing methods along multiple dimensions and present empirical results for several representative approaches. Finally, we discuss current limitations and outline promising future research directions, providing a clearer picture of the current landscape of LLM UQ for hallucination detection.

computational linguistic, large language model, machine learning

2510.1204

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Machine Learning, Oct-15-2025

The Robustness of Differentiable Causal Discovery in Misspecified Scenarios

Yi, Huiyang, He, Yanyan, Chen, Duxin, Kang, Mingyu, Wang, He, Yu, Wenwu

Causal discovery aims to learn causal relationships between variables from targeted data, making it a fundamental task in machine learning. However, causal discovery algorithms often rely on unverifiable causal assumptions, which are usually difficult to satisfy in real-world data, thereby limiting the broad application of causal discovery in practical scenarios. Inspired by these considerations, this work extensively benchmarks the empirical performance of various mainstream causal discovery algorithms, which assume i.i.d. data, under eight model assumption violations. Our experimental results show that differentiable causal discovery methods exhibit robustness under the metrics of Structural Hamming Distance and Structural Intervention Distance of the inferred graphs in commonly used challenging scenarios, except for scale variation. We also provide theoretical explanations for the performance of differentiable causal discovery methods. Finally, our work aims to comprehensively benchmark the performance of recent differentiable causal discovery methods under model assumption violations, and to provide a standard for the reasonable evaluation of causal discovery, as well as to further promote its application in real-world scenarios.
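For readers unfamiliar with the Structural Hamming Distance named in this abstract, the following is a minimal sketch of one common variant. It assumes graphs are given as binary adjacency matrices (entry `[i, j]` set when there is an edge i -> j) and that a reversed edge counts as a single error; other conventions count a reversal as two, and the abstract does not say which the paper uses. The function name is hypothetical.

```python
import numpy as np

def structural_hamming_distance(a_true, a_est):
    """Count node pairs whose edge status differs between two directed graphs.

    Assumption (illustrative): a missing, extra, or reversed edge each costs 1.
    """
    a_true = np.asarray(a_true, dtype=bool)
    a_est = np.asarray(a_est, dtype=bool)
    diff = a_true != a_est
    # A pair {i, j} is wrong if either direction differs, so a reversal is counted once
    pair_differs = diff | diff.T
    # Each unordered pair appears twice in the symmetric mask; keep the upper triangle
    return int(np.triu(pair_differs, k=1).sum())

# True graph: 0 -> 1 -> 2. Estimate reverses 0 -> 1 and adds a spurious 0 -> 2.
true_graph = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
est_graph = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]
shd = structural_hamming_distance(true_graph, est_graph)
```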

artificial intelligence, bayesian inference, machine learning

2510.12503

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Atbir, Hind, Cherfaoui, Farah, Metzler, Guillaume, Morvant, Emilie, Viallard, Paul

PAC-Bayesian Bounds on Constrained f-Entropic Risk Measures

arXiv.org Machine Learning, Oct-14-2025

PAC generalization bounds on the risk, when expressed in terms of the expected loss, are often insufficient to capture imbalances between subgroups in the data. To overcome this limitation, we introduce a new family of risk measures, called constrained f-entropic risk measures, which enable finer control over distributional shifts and subgroup imbalances via f-divergences, and include the Conditional Value at Risk (CVaR), a well-known risk measure. We derive both classical and disintegrated PAC-Bayesian generalization bounds for this family of risks, providing the first disintegrated PAC-Bayesian guarantees beyond standard risks. Building on this theory, we design a self-bounding algorithm that minimizes our bounds directly, yielding models with guarantees at the subgroup level. Finally, we empirically demonstrate the usefulness of our approach.
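As a concrete reference point for the CVaR mentioned in this abstract, here is a minimal sketch of the standard empirical CVaR computed with the Rockafellar-Uryasev form. It illustrates only the classical risk measure, not the constrained f-entropic family the paper introduces; the function name and the convention that `alpha` is the confidence level (so the worst `1 - alpha` fraction of losses is averaged) are assumptions.

```python
import numpy as np

def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha (0 <= alpha < 1) of a sample of losses.

    Uses CVaR = VaR + E[(L - VaR)_+] / (1 - alpha), with VaR taken as the
    smallest sample value whose empirical CDF reaches alpha.
    """
    losses = np.sort(np.asarray(losses, dtype=float))
    n = losses.size
    idx = max(int(np.ceil(alpha * n)) - 1, 0)
    var = losses[idx]
    excess = np.maximum(losses - var, 0.0)
    return var + excess.mean() / (1.0 - alpha)

sample_losses = [1.0, 2.0, 3.0, 4.0]
```

With `alpha = 0` the measure reduces to the plain expected loss, and as `alpha` grows it concentrates on the worst-performing part of the sample, which is why the expected loss alone can hide subgroup imbalances.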

artificial intelligence, bayesian inference, machine learning

2510.11169

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)
Asia > Japan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Machine Learning, Oct-14-2025

Blade: A Derivative-free Bayesian Inversion Method using Diffusion Priors

Zheng, Hongkai, Wang, Austin, Wu, Zihui, Huang, Zhengyu, Baptista, Ricardo, Yue, Yisong

Derivative-free Bayesian inversion is an important task in many science and engineering applications, particularly when computing the forward model derivative is computationally and practically challenging. In this paper, we introduce Blade, which can produce accurate and well-calibrated posteriors for Bayesian inversion using an ensemble of interacting particles. Blade leverages powerful data-driven priors based on diffusion models, and can handle nonlinear forward models that permit only black-box access (i.e., derivative-free). Theoretically, we establish a non-asymptotic convergence analysis to characterize the effects of forward model and prior estimation errors. Empirically, Blade achieves superior performance compared to existing derivative-free Bayesian inversion methods on various inverse problems, including challenging highly nonlinear fluid dynamics.

artificial intelligence, inverse problem, machine learning

2510.10968

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Berkshire > Reading (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Mahmud, Abdullah Al, Chowdhury, Prangon, Uddin, Mohammed Borhan, Delowar, Khaled Eabne, Talha, Tausifur Rahman, Dewanjee, Bijoy

AI-Driven anemia diagnosis: A review of advanced models and techniques

arXiv.org Artificial Intelligence, Oct-14-2025

Anemia, a condition marked by insufficient levels of red blood cells or hemoglobin, remains a widespread health issue affecting millions of individuals globally. Accurate and timely diagnosis is essential for effective management and treatment of anemia. In recent years, there has been growing interest in the use of artificial intelligence techniques, i.e., machine learning (ML) and deep learning (DL), for the detection, classification, and diagnosis of anemia. This paper provides a systematic review of the recent advancements in this field, with a focus on various models applied to anemia detection. The review also compares these models based on several performance metrics, including accuracy, sensitivity, specificity, and precision. By analyzing these metrics, the paper evaluates the strengths and limitations of the discussed models in detecting and classifying anemia, emphasizing the importance of addressing these factors to improve diagnostic accuracy.

artificial intelligence, fuzzy logic, machine learning

2510.1138

Country:

Africa (0.93)
Asia > India (0.28)
North America > United States (0.28)
Asia > Middle East (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Macuglia, Maël, Friedrich, Paul, Ramponi, Giorgia

Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning

arXiv.org Artificial Intelligence, Oct-14-2025

Deploying reinforcement learning (RL) in robotics, industry, and health care is blocked by two obstacles: the difficulty of specifying accurate rewards and the risk of unsafe, data-hungry exploration. We address this by proposing a two-stage framework that first learns a safe initial policy from a reward-free dataset of expert demonstrations, then fine-tunes it online using preference-based human feedback. We provide the first principled analysis of this offline-to-online approach and introduce BRIDGE, a unified algorithm that integrates both signals via an uncertainty-weighted objective. We derive regret bounds that shrink with the number of offline demonstrations, explicitly connecting the quantity of offline data to online sample efficiency. We validate BRIDGE in discrete and continuous control MuJoCo environments, showing it achieves lower regret than both standalone behavioral cloning and online preference-based RL. Our work establishes a theoretical foundation for designing more sample-efficient interactive agents.

artificial intelligence, machine learning, reinforcement learning

2509.26605

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.45)

Industry:

Education (0.68)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence, Oct-14-2025

Personalized Bayesian Federated Learning with Wasserstein Barycenter Aggregation

Wei, Ting, Mei, Biao, Lyu, Junliang, Zhang, Renquan, Zhou, Feng, Sun, Yifan

Personalized Bayesian federated learning (PBFL) handles non-i.i.d. client data and quantifies uncertainty by combining personalization with Bayesian inference. However, existing PBFL methods face two limitations: restrictive parametric assumptions in client posterior inference and naive parameter averaging for server aggregation. To overcome these issues, we propose FedWBA, a novel PBFL method that enhances both local inference and global aggregation. At the client level, we use particle-based variational inference for nonparametric posterior representation. At the server level, we introduce particle-based Wasserstein barycenter aggregation, offering a more geometrically meaningful approach. Theoretically, we provide local and global convergence guarantees for FedWBA. Locally, we prove a lower bound on the per-iteration decrease in KL divergence, which establishes convergence of the variational inference. Globally, we show that the Wasserstein barycenter converges to the true parameter as the client data size increases. Empirically, experiments show that FedWBA outperforms baselines in prediction accuracy, uncertainty calibration, and convergence rate, with ablation studies confirming its robustness.
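To give a feel for what barycenter aggregation of particles means, the sketch below covers only the one-dimensional special case, where the 2-Wasserstein barycenter of equal-size, uniformly weighted particle sets is obtained by averaging the sorted particles (quantile averaging). FedWBA itself operates on high-dimensional model parameters and its algorithm is not given in this abstract, so the function name and setup here are purely illustrative.

```python
import numpy as np

def wasserstein_barycenter_1d(particle_sets, client_weights=None):
    """2-Wasserstein barycenter of 1D empirical distributions with equal particle counts.

    particle_sets: array of shape (num_clients, num_particles).
    client_weights: optional nonnegative weights summing to 1 (uniform if omitted).
    """
    # In 1D the optimal coupling matches particles by rank, so sort each client's set
    sorted_sets = np.sort(np.asarray(particle_sets, dtype=float), axis=1)
    num_clients = sorted_sets.shape[0]
    if client_weights is None:
        client_weights = np.full(num_clients, 1.0 / num_clients)
    client_weights = np.asarray(client_weights, dtype=float)
    # Weighted average of matching order statistics
    return client_weights @ sorted_sets

clients = [[0.0, 1.0, 2.0], [10.0, 12.0, 11.0]]
barycenter = wasserstein_barycenter_1d(clients)
```

Matching particles by rank before averaging keeps the shape of each client's distribution, whereas averaging particles in arbitrary order would depend on how each client happened to index them.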

artificial intelligence, machine learning, particle

2505.14161

Country: Asia > China (0.29)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)