Goto

Collaborating Authors

 Bayesian Learning


Curiosity-Driven LLM-as-a-judge for Personalized Creative Judgment

arXiv.org Artificial Intelligence

Creative Thinking(TTCW) benchmark introduced in Chakrabarty et al. (2024), Rigorous, standardized evaluation has repeatedly catalyzed progress in machine learning, from ImageNetRussakovsky et al. (2015) and GLUEWang et al. (2019), driving leaps in the fields of computer vision and Natural Language Processing, respectively. The same effect is evident in objective math reasoning, where benchmarks like GSM8KCobbe et al. (2021), together with RL-trained reasoning models such as OpenAI's o1OpenAI et al. (2024) and DeepSeek-R1DeepSeek-AI Models(LLM) as a judge prefer their own generations making them unreliable. As shown in Chakrabarty et al. (2024) and Table 12 and Table 2, even Specifically, when the model is "surprised" by an expert's explanation, it signals a mismatch between the LLM's prior belief and the expert's The intuition behind predicting the annotator is that the model can learn which annotator caused the belief shift, allowing it to calibrate the curiosity signal for each annotator individually, thereby improving personalization. In our experiments, we establish a baseline using an SFT model that predicts annotators' binary More details about the results can be found in Fig 4.Figure 1: Overview of Architecture during training for Curiosity Driven LLM-as-a-judgeFigure 2: Overview of Architecture during inference for Curiosity Driven LLM-as-a-judge 2 (a) Baseline without using explanations (b) Baseline using explanations TTCW dataset Chakrabarty et al. (2024), which is based on the Torrance Test of Creative Thinking Torrance (1966) but adapted for LLMs. All the distinct dimensions in the TTCW dataset are mentioned in Appendix A.1.


Risk Profiling and Modulation for LLMs

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly used for decision-making tasks under uncertainty; however, their risk profiles and how they are influenced by prompting and alignment methods remain underexplored. Existing studies have primarily examined personality prompting or multi-agent interactions, leaving open the question of how post-training influences the risk behavior of LLMs. In this work, we propose a new pipeline for eliciting, steering, and modulating LLMs' risk profiles, drawing on tools from behavioral economics and finance. Using utility-theoretic models, we compare pre-trained, instruction-tuned, and RLHF-aligned LLMs, and find that while instruction-tuned models exhibit behaviors consistent with some standard utility formulations, pre-trained and RLHF-aligned models deviate more from any utility models fitted. We further evaluate modulation strategies, including prompt engineering, in-context learning, and post-training, and show that post-training provides the most stable and effective modulation of risk preference. Our findings provide insights into the risk profiles of different classes and stages of LLMs and demonstrate how post-training modulates these profiles, laying the groundwork for future research on behavioral alignment and risk-aware LLM design.


Understanding Catastrophic Interference: On the Identifibility of Latent Representations

arXiv.org Artificial Intelligence

Catastrophic interference, also known as catastrophic forgetting, is a fundamental challenge in machine learning, where a trained learning model progressively loses performance on previously learned tasks when adapting to new ones. In this paper, we aim to better understand and model the catastrophic interference problem from a latent representation learning point of view, and propose a novel theoretical framework that formulates catastrophic interference as an identification problem. Our analysis demonstrates that the forgetting phenomenon can be quantified by the distance between partial-task aware (PTA) and all-task aware (ATA) setups. Building upon recent advances in identifiability theory, we prove that this distance can be minimized through identification of shared latent variables between these setups. When learning, we propose our method \ourmeos with two-stage training strategy: First, we employ maximum likelihood estimation to learn the latent representations from both PTA and ATA configurations. Subsequently, we optimize the KL divergence to identify and learn the shared latent variables. Through theoretical guarantee and empirical validations, we establish that identifying and learning these shared representations can effectively mitigate catastrophic interference in machine learning systems. Our approach provides both theoretical guarantees and practical performance improvements across both synthetic and benchmark datasets.


Sparse Representations Improve Adversarial Robustness of Neural Network Classifiers

arXiv.org Artificial Intelligence

Deep neural networks perform remarkably well on image classification tasks but remain vulnerable to carefully crafted adversarial perturbations. This work revisits linear dimensionality reduction as a simple, data-adapted defense. We empirically compare standard Principal Component Analysis (PCA) with its sparse variant (SPCA) as front-end feature extractors for downstream classifiers, and we complement these experiments with a theoretical analysis. On the theory side, we derive exact robustness certificates for linear heads applied to SPCA features: for both $\ell_\infty$ and $\ell_2$ threat models (binary and multiclass), the certified radius grows as the dual norms of $W^\top u$ shrink, where $W$ is the projection and $u$ the head weights. We further show that for general (non-linear) heads, sparsity reduces operator-norm bounds through a Lipschitz composition argument, predicting lower input sensitivity. Empirically, with a small non-linear network after the projection, SPCA consistently degrades more gracefully than PCA under strong white-box and black-box attacks while maintaining competitive clean accuracy. Taken together, the theory identifies the mechanism (sparser projections reduce adversarial leverage) and the experiments verify that this benefit persists beyond the linear setting. Our code is available at https://github.com/killian31/SPCARobustness.


Attribute Fusion-based Classifier on Framework of Belief Structure

arXiv.org Artificial Intelligence

Abstract--Dempster-Shafer Theory (DST) provides a powerful framework for modeling uncertainty and has been widely applied to multi-attribute classification tasks. However, traditional DST - based attribute fusion-based classifiers suffer from oversimplified membership function modeling and limited exploitation of the belief structure brought by basic probability assignment (BPA), reducing their effectiveness in complex real-world scenarios. This paper presents an enhanced attribute fusion-based classifier that addresses these limitations through two key innovations. First, we adopt a selective modeling strategy that utilizes both single Gaussian and Gaussian Mixture Models (GMMs) for membership function construction, with model selection guided by cross-validation and a tailored evaluation metric. Second, we introduce a novel method to transform the possibility distribution into a BPA by combining simple BPAs derived from normalized possibility distributions, enabling a much richer and more flexible representation of uncertain information. Furthermore, we apply the belief structure-based BPA generation method to the evidential K-Nearest Neighbors (EKNN) classifier, enhancing its ability to incorporate uncertainty information into decision-making. Comprehensive experiments on benchmark datasets are conducted to evaluate the performance of the proposed attribute fusion-based classifier and the enhanced evidential K-Nearest Neighbors classifier in comparison with both evidential classifiers and conventional machine learning classifiers. The results demonstrate that the proposed classifier outperforms the best existing evidential classifier, achieving an average accuracy improvement of 4.86%, while maintaining low variance, thus confirming its superior effectiveness and robustness.


Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning

arXiv.org Machine Learning

Learning about the directed acyclic graph (DAG) underlying a system's data-generating process from observational data under causal sufficiency is a fundamental causal discovery task (Pearl, 2009). Score-based algorithms address this task by assigning penalized likelihood scores to each DAG and seeking graphs whose scores are optimal. Identifiability theory asks when such score-optimal graphs identify the target graph (or its equivalence class) in the infinite-sample limit, with various results under different assumptions and scores (Chickering, 2002; Nandy et al., 2018). Exact algorithms, that are guaranteed to find a score-optimal graph, have exponential run-time and are feasible up to roughly 30 variables (Koivisto & Sood, 2004; Silander & Myllym aki, 2006). For larger graphs, local search must be employed, which evaluates neighbouring graphs to find graphs with better scores; canonical moves for this hill climbing are single edge insertions, deletions, or reversals (Heckerman et al., 1995). In the sample limit, greedy discrete search with a neighbourhood notion that respects score equivalence provably finds a graph with optimal score (Chickering, 2002). In finite samples, scores are inexact and local search may get stuck in local optima or, as we demonstrate, even find graphs with better scores than the true graph. Finite-sample performance is a practical challenge, despite the mature identifiability theory and asymptotic guarantees. Continuous optimization methods have emerged as a popular alternative.


SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator

arXiv.org Machine Learning

Deep generative models have made significant advances in generating complex content, yet conditional generation remains a fundamental challenge. Existing conditional generative adversarial networks often struggle to balance the dual objectives of assessing authenticity and conditional alignment of input samples within their conditional discriminators. To address this, we propose a novel discriminator design that integrates three key capabilities: unconditional discrimination, matching-aware supervision to enhance alignment sensitivity, and adaptive weighting to dynamically balance all objectives. Specifically, we introduce Sum of Naturalness and Alignment (SONA), which employs separate projections for naturalness (authenticity) and alignment in the final layer with an inductive bias, supported by dedicated objective functions and an adaptive weighting mechanism. Extensive experiments on class-conditional generation tasks show that \ours achieves superior sample quality and conditional alignment compared to state-of-the-art methods. Furthermore, we demonstrate its effectiveness in text-to-image generation, confirming the versatility and robustness of our approach.


Don't Pass$\mathtt{@}k$: A Bayesian Framework for Large Language Model Evaluation

arXiv.org Machine Learning

Pass$@k$ is widely used to report performance for LLM reasoning, but it often yields unstable, misleading rankings, especially when the number of trials (samples) is limited and compute is constrained. We present a principled Bayesian evaluation framework that replaces Pass$@k$ and average accuracy over $N$ trials (avg$@N$) with posterior estimates of a model's underlying success probability and credible intervals, yielding stable rankings and a transparent decision rule for differences. Evaluation outcomes are modeled as categorical (not just 0/1) with a Dirichlet prior, giving closed-form expressions for the posterior mean and uncertainty of any weighted rubric and enabling the use of prior evidence when appropriate. Theoretically, under a uniform prior, the Bayesian posterior mean is order-equivalent to average accuracy (Pass$@1$), explaining its empirical robustness while adding principled uncertainty. Empirically, in simulations with known ground-truth success rates and on AIME'24/'25, HMMT'25, and BrUMO'25, the Bayesian/avg procedure achieves faster convergence and greater rank stability than Pass$@k$ and recent variants, enabling reliable comparisons at far smaller sample counts. The framework clarifies when observed gaps are statistically meaningful (non-overlapping credible intervals) versus noise, and it naturally extends to graded, rubric-based evaluations. Together, these results recommend replacing Pass$@k$ for LLM evaluation and ranking with a posterior-based, compute-efficient protocol that unifies binary and non-binary evaluation while making uncertainty explicit. Code is available at https://mohsenhariri.github.io/bayes-kit


Simulation-based inference via telescoping ratio estimation for trawl processes

arXiv.org Machine Learning

The growing availability of large and complex datasets has increased interest in temporal stochastic processes that can capture stylized facts such as marginal skewness, non-Gaussian tails, long memory, and even non-Markovian dynamics. While such models are often easy to simulate from, parameter estimation remains challenging. Simulation-based inference (SBI) offers a promising way forward, but existing methods typically require large training datasets or complex architectures and frequently yield confidence (credible) regions that fail to attain their nominal values, raising doubts on the reliability of estimates for the very features that motivate the use of these models. To address these challenges, we propose a fast and accurate, sample-efficient SBI framework for amortized posterior inference applicable to intractable stochastic processes. The proposed approach relies on two main steps: first, we learn the posterior density by decomposing it sequentially across parameter dimensions. Then, we use Chebyshev polynomial approximations to efficiently generate independent posterior samples, enabling accurate inference even when Markov chain Monte Carlo methods mix poorly. We further develop novel diagnostic tools for SBI in this context, as well as post-hoc calibration techniques; the latter not only lead to performance improvements of the learned inferential tool, but also to the ability to reuse it directly with new time series of varying lengths, thus amortizing the training cost. We demonstrate the method's effectiveness on trawl processes, a class of flexible infinitely divisible models that generalize univariate Gaussian processes, applied to energy demand data.


Neural Bayesian Filtering

arXiv.org Machine Learning

As an example, consider the problem of tracking an autonomous robot with an unknown starting position in a d d grid (Figure 1). Suppose the agent's policy is known, and an observer sees that the agent moved a step without colliding into a wall. This information indicates how the observer should update their beliefs about the agent's position. Tracking these belief states can be challenging when they are either continuous or too large to enumerate (Solinas et al., 2023)--even when the agent's policy and the environment dynamics are known. A common approach frames belief state modeling as a Bayesian filtering problem in which a posterior is maintained and updated with each new observation. Classical Bayesian filters, such as the Kalman Filter (Kalman, 1960) and its nonlinear variants (e.g., Extended and Unscented Kalman Filters (Sorenson, 1985; Julier & Uhlmann, 2004)), assume that the underlying distributions are unimodal and approximately Gaussian. While computationally efficient, this limits their applicability in settings that do not satisfy these assumptions.