Uncertainty
Deep Graph Neural Point Process For Learning Temporal Interactive Networks
Chen, Su, Qi, Xiaohua, Lin, Xixun, Shang, Yanmin, Xu, Xiaolin, Li, Yangxi
Learning temporal interaction networks(TIN) is previously regarded as a coarse-grained multi-sequence prediction problem, ignoring the network topology structure influence. This paper addresses this limitation and a Deep Graph Neural Point Process(DGNPP) model for TIN is proposed. DGNPP consists of two key modules: the Node Aggregation Layer and the Self Attentive Layer. The Node Aggregation Layer captures topological structures to generate static representation for users and items, while the Self Attentive Layer dynamically updates embeddings over time. By incorporating both dynamic and static embeddings into the event intensity function and optimizing the model via maximum likelihood estimation, DGNPP predicts events and occurrence time effectively. Experimental evaluations on three public datasets demonstrate that DGNPP achieves superior performance in event prediction and time prediction tasks with high efficiency, significantly outperforming baseline models and effectively mitigating the limitations of prior approaches.
Modeling GRNs with a Probabilistic Categorical Framework
Jia, Yiyang, Wei, Zheng, Yang, Zheng, Peng, Guohong
Understanding the complex and stochastic nature of Gene Regulatory Networks (GRNs) remains a central challenge in systems biology. Existing modeling paradigms often struggle to effectively capture the intricate, multi-factor regulatory logic and to rigorously manage the dual uncertainties of network structure and kinetic parameters. In response, this work introduces the Probabilistic Categorical GRN(PC-GRN) framework. It is a novel theoretical approach founded on the synergistic integration of three core methodologies. Firstly, category theory provides a formal language for the modularity and composition of regulatory pathways. Secondly, Bayesian Typed Petri Nets (BTPNs) serve as an interpretable,mechanistic substrate for modeling stochastic cellular processes, with kinetic parameters themselves represented as probability distributions. The central innovation of PC-GRN is its end-to-end generative Bayesian inference engine, which learns a full posterior distribution over BTPN models (P (G, Θ|D)) directly from data. This is achieved by the novel interplay of a GFlowNet, which learns a policy to sample network topologies, and a HyperNetwork, which performs amortized inference to predict their corresponding parameter distributions. The resulting framework provides a mathematically rigorous, biologically interpretable, and uncertainty-aware representation of GRNs, advancing predictive modeling and systems-level analysis.
Enhancing Visual Reliance in Text Generation: A Bayesian Perspective on Mitigating Hallucination in Large Vision-Language Models
Hu, Nanxing, Duan, Xiaoyue, Zhang, Jinchao, Kang, Guoliang
Large Vision-Language Models (LVLMs) usually generate texts which satisfy context coherence but don't match the visual input. Such a hallucination issue hinders LVLMs' applicability in the real world. The key to solving hallucination in LVLM is to make the text generation rely more on the visual content. Most previous works choose to enhance/adjust the features/output of a specific modality (i.e., visual or textual) to alleviate hallucinations in LVLM, which do not explicitly or systematically enhance the visual reliance. In this paper, we comprehensively investigate the factors which may degenerate the visual reliance in text generation of LVLM from a Bayesian perspective. Based on our observations, we propose to mitigate hallucination in LVLM from three aspects. Firstly, we observe that not all visual tokens are informative in generating meaningful texts. We propose to evaluate and remove redundant visual tokens to avoid their disturbance. Secondly, LVLM may encode inappropriate prior information, making it lean toward generating unexpected words. We propose a simple yet effective way to rectify the prior from a Bayesian perspective. Thirdly, we observe that starting from certain steps, the posterior of next-token prediction conditioned on visual tokens may collapse to a prior distribution which does not depend on any informative visual tokens at all. Thus, we propose to stop further text generation to avoid hallucination. Extensive experiments on three benchmarks including POPE, CHAIR, and MME demonstrate that our method can consistently mitigate the hallucination issue of LVLM and performs favorably against previous state-of-the-arts.
Epistemic Wrapping for Uncertainty Quantification
Sultana, Maryam, Yorke-Smith, Neil, Wang, Kaizheng, Manchingal, Shireen Kudukkil, Mubashar, Muhammad, Cuzzolin, Fabio
Uncertainty estimation is pivotal in machine learning, especially for classification tasks, as it improves the robustness and reliability of models. We introduce a novel `Epistemic Wrapping' methodology aimed at improving uncertainty estimation in classification. Our approach uses Bayesian Neural Networks (BNNs) as a baseline and transforms their outputs into belief function posteriors, effectively capturing epistemic uncertainty and offering an efficient and general methodology for uncertainty quantification. Comprehensive experiments employing a Bayesian Neural Network (BNN) baseline and an Interval Neural Network for inference on the MNIST, Fashion-MNIST, CIFAR-10 and CIFAR-100 datasets demonstrate that our Epistemic Wrapper significantly enhances generalisation and uncertainty quantification.
In-Context Decision Making for Optimizing Complex AutoML Pipelines
Balef, Amir Rezaei, Eggensperger, Katharina
Combined Algorithm Selection and Hyperparameter Optimization (CASH) has been fundamental to traditional AutoML systems. However, with the advancements of pre-trained models, modern ML workflows go beyond hyperparameter optimization and often require fine-tuning, ensembling, and other adaptation techniques. While the core challenge of identifying the best-performing model for a downstream task remains, the increasing heterogeneity of ML pipelines demands novel AutoML approaches. This work extends the CASH framework to select and adapt modern ML pipelines. We propose PS-PFN to efficiently explore and exploit adapting ML pipelines by extending Posterior Sampling (PS) to the max k -armed bandit problem setup. PS-PFN leverages prior-data fitted networks (PFNs) to efficiently estimate the posterior distribution of the maximal value via in-context learning. We show how to extend this method to consider varying costs of pulling arms and to use different PFNs to model reward distributions individually per arm. Experimental results on one novel and two existing standard benchmark tasks demonstrate the superior performance of PS-PFN compared to other bandit and AutoML strategies.
Bounding Causal Effects and Counterfactuals
Causal inference often hinges on strong assumptions - such as no unmeasured confounding or perfect compliance - that are rarely satisfied in practice. Partial identification offers a principled alternative: instead of relying on unverifiable assumptions to estimate causal effects precisely, it derives bounds that reflect the uncertainty inherent in the data. Despite its theoretical appeal, partial identification remains underutilized in applied work, in part due to the fragmented nature of existing methods and the lack of practical guidance. This thesis addresses these challenges by systematically comparing a diverse set of bounding algorithms across multiple causal scenarios. We implement, extend, and unify state-of-the-art methods - including symbolic, optimization-based, and information-theoretic approaches - within a common evaluation framework. In particular, we propose an extension of a recently introduced entropy-bounded method, making it applicable to counterfactual queries such as the Probability of Necessity and Sufficiency (PNS). Our empirical study spans thousands of randomized simulations involving both discrete and continuous data-generating processes. We assess each method in terms of bound tightness, computational efficiency, and robustness to assumption violations. To support practitioners, we distill our findings into a practical decision tree for algorithm selection and train a machine learning model to predict the best-performing method based on observable data characteristics. All implementations are released as part of an open-source Python package, CausalBoundingEngine, which enables users to apply and compare bounding methods through a unified interface.
Approximate Bayesian Inference via Bitstring Representations
Sladek, Aleksanteri, Trapp, Martin, Solin, Arno
The machine learning community has recently put effort into quantized or low-precision arithmetics to scale large models. This paper proposes performing probabilistic inference in the quantized, discrete parameter space created by these representations, effectively enabling us to learn a continuous distribution using discrete parameters. We consider both 2D densities and quantized neural networks, where we introduce a tractable learning approach using probabilistic circuits. This method offers a scalable solution to manage complex distributions and provides clear insights into model behavior. We validate our approach with various models, demonstrating inference efficiency without sacrificing accuracy. This work advances scalable, interpretable machine learning by utilizing discrete approximations for probabilistic computations.
Dimension lower bounds for linear approaches to function approximation
This short note presents a linear algebraic approach to proving dimension lower bounds for linear methods that solve $L^2$ function approximation problems. The basic argument has appeared in the literature before (e.g., Barron, 1993) for establishing lower bounds on Kolmogorov $n$-widths. The argument is applied to give sample size lower bounds for kernel methods.
The Course Difficulty Analysis Cookbook
Baucks, Frederik, Schmucker, Robin, Wiskott, Laurenz
Curriculum analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. An essential aspect is studying course properties, which involves assigning each course a representative difficulty value. This is critical for several aspects of CA, such as quality control (e.g., monitoring variations over time), course comparisons (e.g., articulation), and course recommendation (e.g., advising). Measuring course difficulty requires careful consideration of multiple factors: First, when difficulty measures are sensitive to the performance level of enrolled students, it can bias interpretations by overlooking student diversity. By assessing difficulty independently of enrolled students' performances, we can reduce the risk of bias and enable fair, representative assessments of difficulty. Second, from a measurement theoretic perspective, the measurement must be reliable and valid to provide a robust basis for subsequent analyses. Third, difficulty measures should account for covariates, such as the characteristics of individual students within a diverse populations (e.g., transfer status). In recent years, various notions of difficulty have been proposed. This paper provides the first comprehensive review and comparison of existing approaches for assessing course difficulty based on grade point averages and latent trait modeling. It further offers a hands-on tutorial on model selection, assumption checking, and practical CA applications. These applications include monitoring course difficulty over time and detecting courses with disparate outcomes between distinct groups of students (e.g., dropouts vs. graduates), ultimately aiming to promote high-quality, fair, and equitable learning experiences. To support further research and application, we provide an open-source software package and artificial datasets, facilitating reproducibility and adoption.