Weighted Bayesian Conformal Prediction
Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell \& Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP requires the i.i.d. assumption, a limitation the authors themselves identify. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose \textbf{Weighted Bayesian Conformal Prediction (WBCP)}, which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\mathrm{Dir}(1,\ldots,1)$ with a weighted Dirichlet $\mathrm{Dir}(n_{\mathrm{eff}} \cdot \tilde{w}_1, \ldots, n_{\mathrm{eff}} \cdot \tilde{w}_n)$, where $n_{\mathrm{eff}}$ is Kish's effective sample size and $\tilde{w}_i$ are the normalized importance weights. We prove four theoretical results: (1)~$n_{\mathrm{eff}}$ is the unique concentration parameter matching frequentist and Bayesian variances; (2)~posterior standard deviation decays as $O(1/\sqrt{n_{\mathrm{eff}}})$; (3)~BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4)~the HPD threshold provides $O(1/\sqrt{n_{\mathrm{eff}}})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as \emph{Geographical BQ-CP}, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.
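As a rough illustration of the weighted Dirichlet described in the abstract, the sketch below computes Kish's effective sample size, $n_{\mathrm{eff}} = (\sum_i w_i)^2 / \sum_i w_i^2$, and the resulting concentration parameters $n_{\mathrm{eff}} \cdot \tilde{w}_i$. The function names are hypothetical and the code is not taken from the paper; it only shows that uniform weights recover the $\mathrm{Dir}(1,\ldots,1)$ case of BQ-CP.

```python
def kish_effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_dirichlet_concentration(weights):
    """Concentration parameters n_eff * w_tilde_i (hypothetical helper).

    w_tilde_i are the importance weights normalized to sum to one,
    so the returned parameters sum to n_eff.
    """
    total = sum(weights)
    n_eff = kish_effective_sample_size(weights)
    return [n_eff * w / total for w in weights]


# Uniform weights: n_eff equals n, and every concentration parameter is 1.
uniform_alpha = weighted_dirichlet_concentration([1.0] * 5)

# Non-uniform weights: n_eff drops below n, giving a more diffuse posterior.
skewed_n_eff = kish_effective_sample_size([1.0, 1.0, 2.0])
skewed_alpha = weighted_dirichlet_concentration([1.0, 1.0, 2.0])
```

With weights `[1, 1, 2]` the effective sample size is $16/6 \approx 2.67$ rather than 3, which reflects the information lost to unequal weighting.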
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Virginia (0.04)
- Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- North America > United States > California > Riverside County > Riverside (0.04)
- Europe > United Kingdom > England (0.04)
- Europe > Portugal > Coimbra > Coimbra (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Virginia (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.67)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.92)
- North America > United States (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
SeSE: A Structural Information-Guided Uncertainty Quantification Framework for Hallucination Detection in LLMs
Zhao, Xingtao, Peng, Hao, Su, Dingli, Zeng, Xianghua, Liu, Chunyang, Liao, Jinzhi, Yu, Philip S.
Reliable uncertainty quantification (UQ) is essential for deploying large language models (LLMs) in safety-critical scenarios, as it enables them to abstain from responding when uncertain, thereby avoiding ``hallucinating'' falsehoods. However, state-of-the-art UQ methods primarily rely on semantic probability distributions or pairwise distances, overlooking latent semantic structural information that could enable more precise uncertainty estimates. This paper presents Semantic Structural Entropy (SeSE), a principled UQ framework that quantifies the inherent semantic uncertainty of LLMs from a structural information perspective for hallucination detection. SeSE operates in a zero-resource manner and is applicable to both open- and closed-source LLMs, making it an ``off-the-shelf'' solution for new models and tasks. Specifically, to effectively model semantic spaces, we first develop an adaptively sparsified directed semantic graph construction algorithm that captures directional semantic dependencies while automatically pruning unnecessary connections that introduce negative interference. We then exploit latent semantic structural information through hierarchical abstraction: SeSE is defined as the structural entropy of the optimal semantic encoding tree, formalizing intrinsic uncertainty within semantic spaces after optimal compression. A higher SeSE value corresponds to greater uncertainty, indicating that LLMs are highly likely to generate hallucinations. In addition, to enhance fine-grained UQ in long-form generation, we extend SeSE to quantify the uncertainty of individual claims by modeling their random semantic interactions, providing theoretically explicable hallucination detection. Extensive experiments across 29 model-dataset combinations show that SeSE significantly outperforms advanced UQ baselines.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
Stragglers Can Contribute More: Uncertainty-Aware Distillation for Asynchronous Federated Learning
Wang, Yujia, Ma, Fenglong, Chen, Jinghui
Asynchronous federated learning (FL) has recently gained attention for its enhanced efficiency and scalability, enabling local clients to send model updates to the server at their own pace without waiting for slower participants. However, such a design encounters significant challenges, such as the risk of outdated updates from straggler clients degrading the overall model performance and the potential bias introduced by faster clients dominating the learning process, especially under heterogeneous data distributions. Existing methods typically address only one of these issues, creating a conflict where mitigating the impact of outdated updates can exacerbate the bias created by faster clients, and vice versa. To address these challenges, we propose FedEcho, a novel framework that incorporates uncertainty-aware distillation to improve asynchronous FL performance under large asynchronous delays and data heterogeneity. Specifically, uncertainty-aware distillation enables the server to assess the reliability of predictions made by straggler clients, dynamically adjusting the influence of these predictions based on their estimated uncertainty. By prioritizing more certain predictions while still leveraging the diverse information from all clients, FedEcho effectively mitigates the negative impacts of outdated updates and data heterogeneity. Through extensive experiments, we demonstrate that FedEcho consistently outperforms existing asynchronous federated learning baselines, achieving robust performance without requiring access to private client data.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > Greenland (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Overview (0.67)