Goto

Collaborating Authors

 Bayesian Learning


Bayesian Semi-supervised Inference via a Debiased Modeling Approach

arXiv.org Machine Learning

Inference in semi-supervised (SS) settings has gained substantial attention in recent years due to increased relevance in modern big-data problems. In a typical SS setting, there is a much larger-sized unlabeled data, containing only observations of predictors, and a moderately sized labeled data containing observations for both an outcome and the set of predictors. Such data naturally arises when the outcome, unlike the predictors, is costly or difficult to obtain. One of the primary statistical objectives in SS settings is to explore whether parameter estimation can be improved by exploiting the unlabeled data. We propose a novel Bayesian method for estimating the population mean in SS settings. The approach yields estimators that are both efficient and optimal for estimation and inference. The method itself has several interesting artifacts. The central idea behind the method is to model certain summary statistics of the data in a targeted manner, rather than the entire raw data itself, along with a novel Bayesian notion of debiasing. Specifying appropriate summary statistics crucially relies on a debiased representation of the population mean that incorporates unlabeled data through a flexible nuisance function while also learning its estimation bias. Combined with careful usage of sample splitting, this debiasing approach mitigates the effect of bias due to slow rates or misspecification of the nuisance parameter from the posterior of the final parameter of interest, ensuring its robustness and efficiency. Concrete theoretical results, via Bernstein--von Mises theorems, are established, validating all claims, and are further supported through extensive numerical studies. To our knowledge, this is possibly the first work on Bayesian inference in SS settings, and its central ideas also apply more broadly to other Bayesian semi-parametric inference problems.


On Quantification of Borrowing of Information in Hierarchical Bayesian Models

arXiv.org Machine Learning

In this work, we offer a thorough analytical investigation into the role of shared hyperparameters in a hierarchical Bayesian model, examining their impact on information borrowing and posterior inference. Our approach is rooted in a non-asymptotic framework, where observations are drawn from a mixed-effects model, and a Gaussian distribution is assumed for the true effect generator. We consider a nested hierarchical prior distribution model to capture these effects and use the posterior means for Bayesian estimation. To quantify the effect of information borrowing, we propose an integrated risk measure relative to the true data-generating distribution. Our analysis reveals that the Bayes estimator for the model with a deeper hierarchy performs better, provided that the unknown random effects are correlated through a compound symmetric structure. Our work also identifies necessary and sufficient conditions for this model to outperform the one nested within it. We further obtain sufficient conditions when the correlation is perturbed. Our study suggests that the model with a deeper hierarchy tends to outperform the nested model unless the true data-generating distribution favors sufficiently independent groups. These findings have significant implications for Bayesian modeling, and we believe they will be of interest to researchers across a wide range of fields.


Medical priority fusion: achieving dual optimization of sensitivity and interpretability in nipt anomaly detection

arXiv.org Artificial Intelligence

Clinical machine learning faces a critical dilemma in high-stakes medical applications: algorithms achieving optimal diagnostic performance typically sacrifice the interpretability essential for physician decision-making, while interpretable methods compromise sensitivity in complex scenarios. This paradox becomes particularly acute in non-invasive prenatal testing (NIPT), where missed chromosomal abnormalities carry profound clinical consequences yet regulatory frameworks mandate explainable AI systems. We introduce Medical Priority Fusion (MPF), a constrained multi-objective optimization framework that resolves this fundamental trade-off by systematically integrating Naive Bayes probabilistic reasoning with Decision Tree rule-based logic through mathematically-principled weighted fusion under explicit medical constraints. Rigorous validation on 1,687 real-world NIPT samples characterized by extreme class imbalance (43.4:1 normal-to-abnormal ratio) employed stratified 5-fold cross-validation with comprehensive ablation studies and statistical hypothesis testing using McNemar's paired comparisons. MPF achieved simultaneous optimization of dual objectives: 89.3% sensitivity (95% CI: 83.9-94.7%) with 80% interpretability score, significantly outperforming individual algorithms (McNemar's test, p < 0.001). The optimal fusion configuration achieved Grade A clinical deployment criteria with large effect size (d = 1.24), establishing the first clinically-deployable solution that maintains both diagnostic accuracy and decision transparency essential for prenatal care. This work demonstrates that medical-constrained algorithm fusion can resolve the interpretability-performance trade-off, providing a mathematical framework for developing high-stakes medical decision support systems that meet both clinical efficacy and explainability requirements.


WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification

arXiv.org Artificial Intelligence

Multimodal Large Language Models (MLLMs) have shown promise in visual-textual reasoning, with Multimodal Chain-of-Thought (MCoT) prompting significantly enhancing interpretability. However, existing MCoT methods rely on rationale-rich datasets and largely focus on inter-object reasoning, overlooking the intra-object understanding crucial for image classification. To address this gap, we propose WISE, a Weak-supervision-guided Step-by-step Explanation method that augments any image classification dataset with MCoTs by reformulating the concept-based representations from Concept Bottleneck Models (CBMs) into concise, interpretable reasoning chains under weak supervision. Experiments across ten datasets show that our generated MCoTs not only improve interpretability by 37% but also lead to gains in classification accuracy when used to fine-tune MLLMs. Our work bridges concept-based interpretability and generative MCoT reasoning, providing a generalizable framework for enhancing MLLMs in fine-grained visual understanding.


Comparing Data Assimilation and Likelihood-Based Inference on Latent State Estimation in Agent-Based Models

arXiv.org Artificial Intelligence

In this paper, we present the first systematic comparison of Data Assimilation (DA) and Likelihood-Based Inference (LBI) in the context of Agent-Based Models (ABMs). These models generate observable time series driven by evolving, partially-latent microstates. Latent states need to be estimated to align simulations with real-world data -- a task traditionally addressed by DA, especially in continuous and equation-based models such as those used in weather forecasting. However, the nature of ABMs poses challenges for standard DA methods. Solving such issues requires adaptation of previous DA techniques, or ad-hoc alternatives such as LBI. DA approximates the likelihood in a model-agnostic way, making it broadly applicable but potentially less precise. In contrast, LBI provides more accurate state estimation by directly leveraging the model's likelihood, but at the cost of requiring a hand-crafted, model-specific likelihood function, which may be complex or infeasible to derive. We compare the two methods on the Bounded-Confidence Model, a well-known opinion dynamics ABM, where agents are affected only by others holding sufficiently similar opinions. We find that LBI better recovers latent agent-level opinions, even under model mis-specification, leading to improved individual-level forecasts. At the aggregate level, however, both methods perform comparably, and DA remains competitive across levels of aggregation under certain parameter settings. Our findings suggest that DA is well-suited for aggregate predictions, while LBI is preferable for agent-level inference.


Conditional Policy Generator for Dynamic Constraint Satisfaction and Optimization

arXiv.org Artificial Intelligence

Leveraging machine learning methods to solve constraint satisfaction problems has shown promising, but they are mostly limited to a static situation where the problem description is completely known and fixed from the beginning. In this work we present a new approach to constraint satisfaction and optimization in dynamically changing environments, particularly when variables in the problem are statistically independent. We frame it as a reinforcement learning problem and introduce a conditional policy generator by borrowing the idea of class conditional generative adversarial networks (GANs). Assuming that the problem includes both static and dynamic constraints, the former are used in a reward formulation to guide the policy training such that it learns to map to a probabilistic distribution of solutions satisfying static constraints from a noise prior, which is similar to a generator in GANs. On the other hand, dynamic constraints in the problem are encoded to different class labels and fed with the input noise. The policy is then simultaneously updated for maximum likelihood of correctly classifying given the dynamic conditions in a supervised manner. We empirically demonstrate a proof-of-principle experiment with a multi-modal constraint satisfaction problem and compare between unconditional and conditional cases.


Ultra-short-term solar power forecasting by deep learning and data reconstruction

arXiv.org Artificial Intelligence

The integration of solar power has been increasing as the green energy transition rolls out. The penetration of solar power challenges the grid stability and energy scheduling, due to its intermittent energy generation. Accurate and near real-time solar power prediction is of critical importance to tolerant and support the permeation of distributed and volatile solar power production in the energy system. In this paper, we propose a deep-learning based ultra-short-term solar power prediction with data reconstruction. We decompose the data for the prediction to facilitate extensive exploration of the spatial and temporal dependencies within the data. Particularly, we reconstruct the data into low- and high-frequency components, using ensemble empirical model decomposition with adaptive noise (CEEMDAN). We integrate meteorological data with those two components, and employ deep-learning models to capture long- and short-term dependencies towards the target prediction period. In this way, we excessively exploit the features in historical data in predicting a ultra-short-term solar power production. Furthermore, as ultra-short-term prediction is vulnerable to local optima, we modify the optimization in our deep-learning training by penalizing long prediction intervals. Numerical experiments with diverse settings demonstrate that, compared to baseline models, the proposed method achieves improved generalization in data reconstruction and higher prediction accuracy for ultra-short-term solar power production.


A Bayesian Dynamical System Model of Joint Action and Interpersonal Coordination

arXiv.org Artificial Intelligence

Successful teamwork depends on interpersonal dynamics, the ways in which individuals coordinate, influence, and adapt to one another over time. Existing measures of interpersonal dynamics, such as CRQA, correlation, Granger causality, and transfer entropy, typically capture only a single dimension: either the synchrony/coordination or the direction of influence between individuals. What is missing is a psychologically meaningful representation that unifies these dimensions and varies systematically with behavior. We propose the "context matrix" as one such representation. The context matrix, modeled within a linear dynamical system, has psychologically interpretable entries specifying how much each individual's current behavior is attributable to their own versus every other group member's past behaviors. Critically, these entries can be distilled into summary features that represent synchrony and directional influence. Evidence for the context matrix as psychologically meaningful is provided in two steps. First, we develop a sequential Bayesian model that infers context matrices from timeseries data and show that it accurately recovers them in noisy simulations. Second, applying the model to human eyetracking data, we demonstrate that summary features of the inferred context matrices capture expected task-based differences in interpersonal dynamics (or lack thereof), predict task accuracy in psychologically reasonable ways, and show some correspondence with existing measures (CRQA and Granger causality). We conclude by situating the context matrix within a broader agenda for modeling interpersonal dynamics in joint action.


Measuring Scalar Constructs in Social Science with LLMs

arXiv.org Artificial Intelligence

Many constructs that characterize language, like its complexity or emotionality, have a naturally continuous semantic structure; a public speech is not just "simple" or "complex," but exists on a continuum between extremes. Although large language models (LLMs) are an attractive tool for measuring scalar constructs, their idiosyncratic treatment of numerical outputs raises questions of how to best apply them. We address these questions with a comprehensive evaluation of LLM-based approaches to scalar construct measurement in social science. Using multiple datasets sourced from the political science literature, we evaluate four approaches: unweighted direct pointwise scoring, aggregation of pairwise comparisons, token-probability-weighted pointwise scoring, and finetuning. Our study finds that pairwise comparisons made by LLMs produce better measurements than simply prompting the LLM to directly output the scores, which suffers from bunching around arbitrary numbers. However, taking the weighted mean over the token probability of scores further improves the measurements over the two previous approaches. Finally, finetuning smaller models with as few as 1,000 training pairs can match or exceed the performance of prompted LLMs.


Advancing Knowledge Tracing by Exploring Follow-up Performance Trends

arXiv.org Artificial Intelligence

Intelligent Tutoring Systems (ITS), such as Massive Open Online Courses, offer new opportunities for human learning. At the core of such systems, knowledge tracing (KT) predicts students' future performance by analyzing their historical learning activities, enabling an accurate evaluation of students' knowledge states over time. We show that existing KT methods often encounter correlation conflicts when analyzing the relationships between historical learning sequences and future performance. To address such conflicts, we propose to extract so-called Follow-up Performance Trends (FPTs) from historical ITS data and to incorporate them into KT. We propose a method called Forward-Looking Knowledge Tracing (FINER) that combines historical learning sequences with FPTs to enhance student performance prediction accuracy. FINER constructs learning patterns that facilitate the retrieval of FPTs from historical ITS data in linear time; FINER includes a novel similarity-aware attention mechanism that aggregates FPTs based on both frequency and contextual similarity; and FINER offers means of combining FPTs and historical learning sequences to enable more accurate prediction of student future performance. Experiments on six real-world datasets show that FINER can outperform ten state-of-the-art KT methods, increasing accuracy by 8.74% to 84.85%.