Bayesian Inference
Split Gibbs Discrete Diffusion Posterior Sampling
Chu, Wenda, Song, Yang, Yue, Yisong
We study the problem of posterior sampling in discrete-state spaces using discrete diffusion models. While posterior sampling methods for continuous diffusion models have achieved remarkable progress, analogous methods for discrete diffusion models remain challenging. In this work, we introduce a principled plug-and-play discrete diffusion posterior sampling algorithm based on split Gibbs sampling, which we call SG-DPS. Our algorithm enables reward-guided generation and solving inverse problems in discrete-state spaces. We demonstrate that SG-DPS converges to the true posterior distribution on synthetic benchmarks, and enjoys state-of-the-art posterior sampling performance on a range of benchmarks for discrete data, achieving up to 2x improved performance compared to existing baselines.
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
Zheng, Kaiwen, Chen, Yongxin, Chen, Huayu, He, Guande, Liu, Ming-Yu, Zhu, Jun, Zhang, Qinsheng
While likelihood-based generative models, particularly diffusion and autoregressive models, have achieved remarkable fidelity in visual generation, the maximum likelihood estimation (MLE) objective inherently suffers from a mode-covering tendency that limits the generation quality under limited model capacity. In this work, we propose Direct Discriminative Optimization (DDO) as a unified framework that bridges likelihood-based generative training and the GAN objective to bypass this fundamental constraint. Our key insight is to parameterize a discriminator implicitly using the likelihood ratio between a learnable target model and a fixed reference model, drawing parallels with the philosophy of Direct Preference Optimization (DPO). Unlike GANs, this parameterization eliminates the need for joint training of generator and discriminator networks, allowing for direct, efficient, and effective finetuning of a well-trained model to its full potential beyond the limits of MLE. DDO can be performed iteratively in a self-play manner for progressive model refinement, with each round requiring less than 1% of pretraining epochs. Our experiments demonstrate the effectiveness of DDO by significantly advancing the previous SOTA diffusion model EDM, reducing FID scores from 1.79/1.58 to new records of 1.30/0.97 on CIFAR-10/ImageNet-64 datasets, and by consistently improving both guidance-free and CFG-enhanced FIDs of visual autoregressive models on ImageNet 256$\times$256.
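The implicit-discriminator idea in the abstract above can be illustrated with a small sketch. It assumes that per-sample log-likelihoods under the learnable target model and the fixed reference model are available as plain numbers, and it uses the standard GAN discriminator loss with the log-likelihood ratio as the logit. The function names and the exact form of the loss are illustrative assumptions, not the authors' implementation.

```python
import math

def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def implicit_discriminator_loss(logp_target_real, logp_ref_real,
                                logp_target_fake, logp_ref_fake):
    """GAN-style loss whose discriminator logit is a log-likelihood ratio.

    "real" entries are log-likelihoods of data samples; "fake" entries are
    log-likelihoods of samples drawn from the reference model (assumption).
    """
    # Logit d(x) = log p_target(x) - log p_ref(x); no separate network is needed.
    real_logits = [t - r for t, r in zip(logp_target_real, logp_ref_real)]
    fake_logits = [t - r for t, r in zip(logp_target_fake, logp_ref_fake)]
    # Data samples should be classified as real: maximize log sigmoid(d).
    real_term = sum(log_sigmoid(z) for z in real_logits) / len(real_logits)
    # Reference samples should be classified as fake: log(1 - sigmoid(d)) = log sigmoid(-d).
    fake_term = sum(log_sigmoid(-z) for z in fake_logits) / len(fake_logits)
    return -(real_term + fake_term)
```

When the target model equals the reference model, every logit is zero and the loss equals 2 log 2; raising the target likelihood on data and lowering it on reference samples reduces the loss.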
Data-Efficient Kernel Methods for Learning Differential Equations and Their Solution Operators: Algorithms and Error Analysis
Jalalian, Yasamin, Ramirez, Juan Felipe Osorio, Hsu, Alexander, Hosseini, Bamdad, Owhadi, Houman
We introduce a novel kernel-based framework for learning differential equations and their solution maps that is efficient in data requirements, in terms of solution examples and amount of measurements from each example, and computational cost, in terms of training procedures. Our approach is mathematically interpretable and backed by rigorous theoretical guarantees in the form of quantitative worst-case error bounds for the learned equation. Numerical benchmarks demonstrate significant improvements in computational complexity and robustness while achieving one to two orders of magnitude improvements in terms of accuracy compared to state-of-the-art algorithms. Significance statement: We present a novel algorithm inspired by kernel methods and Gaussian processes for learning differential equations and their solution operators in scarce data regimes. Our approach: (a) is significantly more efficient than state-of-the-art methods, including neural networks, in terms of required data and computational time. In fact, we obtain one to two orders of magnitude improvement in accuracy on a number of benchmarks; (b) is supported by rigorous theory featuring the first quantitative worst-case error bounds for equation learning; and (c) can solve previously intractable scientific computing problems such as one-shot operator learning and learning of variable-coefficient PDEs in extremely scarce data regimes.
Bayesian Active Learning for Multi-Criteria Comparative Judgement in Educational Assessment
Gray, Andy, Rahat, Alma, Crick, Tom, Lindsay, Stephen
Comparative Judgement (CJ) provides an alternative assessment approach by evaluating work holistically rather than breaking it into discrete criteria. This method leverages human ability to make nuanced comparisons, yielding more reliable and valid assessments. CJ aligns with real-world evaluations, where overall quality emerges from the interplay of various elements. However, rubrics remain widely used in education, offering structured criteria for grading and detailed feedback. This creates a gap between CJ's holistic ranking and the need for criterion-based performance breakdowns. This paper addresses this gap using a Bayesian approach. We build on Bayesian CJ (BCJ) by Gray et al., which directly models preferences instead of using likelihoods over total scores, allowing for expected ranks with uncertainty estimation. Their entropy-based active learning method selects the most informative pairwise comparisons for assessors. We extend BCJ to handle multiple independent learning outcome (LO) components, defined by a rubric, enabling both holistic and component-wise predictive rankings with uncertainty estimates. Additionally, we propose a method to aggregate entropies and identify the most informative comparison for assessors. Experiments on synthetic and real data demonstrate our method's effectiveness. Finally, we address a key limitation of BCJ, which is the inability to quantify assessor agreement. We show how to derive agreement levels, enhancing transparency in assessment.
A Guide to Failure in Machine Learning: Reliability and Robustness from Foundations to Practice
Heim, Eric, Wright, Oren, Shriver, David
One of the main barriers to adoption of Machine Learning (ML) is that ML models can fail unexpectedly. In this work, we aim to provide practitioners with a guide to better understand why ML models fail and equip them with techniques they can use to reason about failure. Specifically, we discuss failure as being caused by either a lack of reliability or a lack of robustness. Differentiating the causes of failure in this way allows us to formally define why models fail from first principles and tie these definitions to engineering concepts and real-world deployment settings. Throughout the document we provide 1) a summary of important theoretical concepts in reliability and robustness, 2) a sampling of current techniques that practitioners can utilize to reason about ML model reliability and robustness, and 3) examples that show how these concepts and techniques can apply to real-world settings.
Towards Hierarchical Rectified Flow
Zhang, Yichi, Yan, Yici, Schwing, Alex, Zhao, Zhizhen
Published as a conference paper at ICLR 2025. Abstract: We formulate a hierarchical rectified flow to model data distributions. It hierarchically couples multiple ordinary differential equations (ODEs) and defines a time-differentiable stochastic process that generates a data distribution from a known source distribution. Each ODE resembles the ODE that is solved in a classic rectified flow, but differs in its domain, i.e., location, velocity, acceleration, etc. Unlike the classic rectified flow formulation, which formulates a single ODE in the location domain and only captures the expected velocity field (sufficient to capture a multi-modal data distribution), the hierarchical rectified flow formulation models the multi-modal random velocity field, acceleration field, etc., in their entirety. This more faithful modeling of the random velocity field enables integration paths to intersect when the underlying ODE is solved during data generation. Intersecting paths in turn lead to integration trajectories that are straighter than those obtained in the classic rectified flow formulation, where integration paths cannot intersect. This leads to modeling of data distributions with fewer neural function evaluations. We empirically verify this on synthetic 1D and 2D data as well as MNIST, CIFAR-10, and ImageNet-32 data. Our code is available at: https://riccizz.github.io/HRF/.
1 Introduction: Diffusion models (Ho et al., 2020; Song et al., 2021a;b) and, in particular, flow matching (Liu et al., 2023; Lipman et al., 2023; Albergo & Vanden-Eijnden, 2023; Albergo et al., 2023) have gained significant attention recently. This is partly due to impressive results that have been reported across domains from computer vision (Ho et al., 2020) and medical imaging (Song et al., 2022) to robotics (Kapelyukh et al., 2023) and computational biology (Guo et al., 2024). Beyond impressive results, flow matching was also reported to faithfully model multimodal data distributions. In addition, sampling is reasonably straightforward: it requires solving an ordinary differential equation (ODE) via forward integration of a set of source distribution points along an estimated velocity field from time zero to time one. The source distribution points are sampled from a simple and known source distribution, e.g., a standard Gaussian. The velocity field is obtained by matching velocities from a constructed "ground-truth" integration path with a parametric deep net using a mean squared error (MSE) objective. See Figure 1(a) for the "ground-truth" integration paths of classic rectified flow. Studying the "ground-truth" velocity distribution at a distinct location and time for rectified flow reveals a multimodal distribution.
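The sampling and training recipe described in this introduction can be sketched in a few lines. The sketch assumes scalar data, the straight-line path commonly used in classic rectified flow, and a forward Euler integrator. The names `interpolate`, `mse`, and `euler_sample` are hypothetical, and the velocity function passed to `euler_sample` stands in for the trained deep net.

```python
def interpolate(x0, x1, t):
    """Point on the straight 'ground-truth' path from x0 to x1 and its velocity.

    Assumes the linear path x_t = (1 - t) * x0 + t * x1, whose velocity is x1 - x0.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

def mse(predicted, target):
    # Mean squared error between predicted and "ground-truth" velocities.
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(predicted)

def euler_sample(velocity, x0, num_steps=100):
    """Integrate dx/dt = velocity(x, t) from time zero to time one with forward Euler."""
    x = x0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity(x, t)
    return x
```

In use, `x0` would be drawn from the source distribution (e.g., a standard Gaussian) and `velocity` would be the learned field. The number of Euler steps corresponds to the number of neural function evaluations.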
The Uncertainty of Machine Learning Predictions in Asset Pricing
Liao, Yuan, Ma, Xinjie, Neuhierl, Andreas, Schilling, Linda
Recently, machine learning (ML) models have gained prominence in predicting asset returns, selecting portfolios, and estimating stochastic discount factors, with significant success in these areas. ML techniques, by capturing complex and nonlinear relationships in financial data, are particularly well-suited for enhancing portfolio management decisions. For example, within the mean-variance portfolio framework, ML methods are increasingly used to estimate expected returns and (co)variances, often leading to more effective portfolio allocations. The literature consistently demonstrates the effectiveness of machine learning in these and other applications (e.g., Gu, Kelly, and Xiu (2020); Bianchi, Büchner, and Tamoni (2021); Cong, Tang, Wang, and Zhang (2021); Kelly, Malamud, and Zhou (2021); Patton and Weller (2022); Didisheim, Ke, Kelly, and Malamud (2023); Filipovic and Schneider (2024)). Despite the success of machine learning in asset pricing, existing literature typically treats ML predictions as point estimates and conducts asset pricing analyses as if they were true values, overlooking the associated uncertainty. This is surprising, given that uncertainty about input parameters is widely acknowledged as critical in portfolio selection (e.g., DeMiguel, Garlappi, and Uppal (2009)), and Garlappi, Uppal, and Wang (2007) show that incorporating forecast uncertainty in mean-variance portfolio allocation leads to distinct economic insights. However, quantifying prediction uncertainty in ML forecasts, particularly with neural networks, remains a complex challenge, limiting their broader application in asset pricing.
Evidence of Replica Symmetry Breaking under the Nishimori conditions in epidemic inference on graphs
Braunstein, Alfredo, Budzynski, Louise, Mariani, Matteo, Ricci-Tersenghi, Federico
In Bayesian inference, computing the posterior distribution from the data is typically a non-trivial problem, which usually requires approximations such as mean-field approaches or numerical methods, like Markov chain Monte Carlo. Being a high-dimensional distribution over a set of correlated variables, the posterior distribution can undergo the notorious replica symmetry breaking transition. When this happens, several mean-field methods and virtually every Monte Carlo scheme cannot provide a reasonable approximation to the posterior and its marginals. Replica symmetry is believed to be guaranteed whenever the data is generated with known prior and likelihood distributions, namely under the so-called Nishimori conditions. In this paper, we challenge this belief by providing a counter-example showing that, under the Nishimori conditions, replica symmetry breaking arises. Introducing a simple, geometrical model that can be thought of as a patient zero retrieval problem in a highly infectious regime of the epidemic Susceptible-Infectious model, we show that under the Nishimori conditions, there is evidence of replica symmetry breaking. We achieve this result by computing the instability of the replica symmetric cavity method toward the one step replica symmetry broken phase. The origin of this phenomenon -- replica symmetry breaking under the Nishimori conditions -- is likely due to the correlated disorder appearing in the epidemic models.
Human-AI Collaboration: Trade-offs Between Performance and Preferences
Mayer, Lukas William, Karny, Sheer, Ayoub, Jackie, Song, Miao, Tian, Danyang, Moradi-Pari, Ehsan, Steyvers, Mark
Despite the growing interest in collaborative AI, designing systems that seamlessly integrate human input remains a major challenge. In this study, we developed a task to systematically examine human preferences for collaborative agents. We created and evaluated five collaborative AI agents with strategies that differ in the manner and degree to which they adapt to human actions. Participants interacted with a subset of these agents, evaluated their perceived traits, and selected their preferred agent. We used a Bayesian model to understand how agents' strategies influence the Human-AI team performance, the AI's perceived traits, and the factors shaping human preferences in pairwise agent comparisons. Our results show that agents that are more considerate of human actions are preferred over purely performance-maximizing agents. Moreover, we show that such human-centric design can improve the likability of AI collaborators without reducing performance. We find evidence for inequality-aversion effects being a driver of human choices, suggesting that people prefer collaborative agents that allow them to meaningfully contribute to the team. Taken together, these findings demonstrate how collaboration with AI can benefit from development efforts that include both subjective and objective metrics.
Generative Uncertainty in Diffusion Models
Jazbec, Metod, Wong-Toi, Eliot, Xia, Guoxuan, Zhang, Dan, Nalisnick, Eric, Mandt, Stephan
Diffusion models have recently driven significant breakthroughs in generative modeling. While state-of-the-art models produce high-quality samples on average, individual samples can still be low quality. Detecting such samples without human inspection remains a challenging task. To address this, we propose a Bayesian framework for estimating generative uncertainty of synthetic samples. We outline how to make Bayesian inference practical for large, modern generative models and introduce a new semantic likelihood (evaluated in the latent space of a feature extractor) to address the challenges posed by high-dimensional sample spaces. Through our experiments, we demonstrate that the proposed generative uncertainty effectively identifies poor-quality samples and significantly outperforms existing uncertainty-based methods. Notably, our Bayesian framework can be applied post-hoc to any pretrained diffusion or flow matching model (via the Laplace approximation), and we propose simple yet effective techniques to minimize its computational overhead during sampling.