ppd
Transformers Can Learn Posterior Predictive Distributions In-Context
Kang, Gyeonghun, Lee, Changwoo J., Cheng, Xiang
Prior-data fitted networks (PFNs) have recently emerged as a powerful approach for Bayesian prediction tasks, approximating the posterior predictive distribution (PPD) through in-context learning. Despite their strong empirical performance and ability to go beyond point predictions, theoretical understandings of the algorithmic capability of transformers to learn distributions in context are still lacking. Focusing on Gaussian process regression problems, we show by construction that transformers can implement a gradient descent algorithm targeting the posterior predictive mean and variance, followed by nonlinear mappings that yield binned probabilities of PPD. We study the error bounds of the approximated PPD in terms of attention depth and bin resolution. Based on these results, we further demonstrate the key role of normalization and the choice of attention depth in enabling the extrapolation abilities of transformers beyond the pretraining sample size range. We conduct simulations that corroborate our findings, providing insight into the expressivity of PFNs targeting PPDs and how architectural choices may influence generalization capabilities.
Tuning Derivatives for Causal Fairness in Machine Learning
Edstrรถm, Filip, Barros, Guilherme W. F., Gorbach, Tetiana, de Luna, Xavier
Artificial-intelligence systems are becoming ubiquitous in society, yet their predictions typically inherit biases with respect to protected attributes such as race, gender, or age. Classical fairness notions, most notably Statistical Parity (SP), demand that predictions be independent of the protected attributes, but are overly restrictive when these attributes influence mediating variables that are considered business necessities. Recent causal formulations relax SP by distinguishing allowed from not-allowed causal paths and by complementing SP with Predictive Parity (PP), requiring the predictor to replicate the legitimate influence of business-necessities. Existing path-based definitions are mainly practical when applied to categorical attributes. This paper introduces a new framework for fairness in structural causal models that is tailored to continuous protected attributes. We formalize SP and PP through path-specific partial derivatives, establish conditions under which these criteria coincide with prior causal definitions, and characterize when a fair predictor, one that satisfies SP along not-allowed paths while achieving PP along allowed paths, exists. Building on this theory, we propose a fair tuning algorithm that either constructs such a predictor or, when not possible, allows for a trade-off between SP and PP. We present experiments on simulated and real data to evaluate our proposal, compare it with previously proposed methods, and show that it performs better when PP is considered.
Eliciting Categorical Data for Optimal Aggregation
Chien-Ju Ho, Rafael Frongillo, Yiling Chen
Models for collecting and aggregating categorical data on crowdsourcing platforms typically fall into two broad categories: those assuming agents honest and consistent but with heterogeneous error rates, and those assuming agents strategic and seek to maximize their expected reward. The former often leads to tractable aggregation of elicited data, while the latter usually focuses on optimal elicitation and does not consider aggregation. In this paper, we develop a Bayesian model, wherein agents have differing quality of information, but also respond to incentives. Our model generalizes both categories and enables the joint exploration of optimal elicitation and aggregation. This model enables our exploration, both analytically and experimentally, of optimal aggregation of categorical data and optimal multiple-choice interface design.
Your eyes can only handle so much HDTV
More pixels doesn't always mean a better screen. Breakthroughs, discoveries, and DIY tips sent every weekday. Every year, tech and television companies boast their products' latest and greatest, highest-resolution displays. The 4K display--a screen with a horizontal display of approximately 4,000 pixels-- first became widely available around 2014. Barely a decade later, you can purchase a TV with double the resolution .
Detecting and explaining postpartum depression in real-time with generative artificial intelligence
Garcรญa-Mรฉndez, Silvia, de Arriba-Pรฉrez, Francisco
Among the many challenges mothers undergo after childbirth, postpartum depression ( ppd) is a severe condition that significantly impacts their mental and physical well-being. Consequently, the rapid detection of ppd and their associated risk factors is critical for in-time assessment and intervention through specialized prevention procedures. Accordingly, this work addresses the need to help practitioners make decisions with the latest technological advancements to enable real-time screening and treatment recommendations. Mainly, our work contributes to an intelligent ppd screening system that combines Natural Language Processing, Machine Learning ( ml), and Large Language Models ( llm s) towards an affordable, real-time, and non-invasive free speech analysis. Moreover, it addresses the black box problem since the predictions are described to the end users thanks to the combination of llm s with interpretable ml models ( i.e., tree-based algorithms) using feature importance and natural language. The results obtained are 90 % on ppd detection for all evaluation metrics, outperforming the competing solutions in the literature. Ultimately, our solution contributes to the rapid detection of ppd and their associated risk factors, critical for in-time and proper assessment and intervention. Introduction Depression is a global public health concern that affects more than 150 million people, being more prevalent in women (Labaka et al., 2018; Moreira et al., 2019). Among the many challenges mothers undergo after childbirth, postpartum depression ( ppd) is a severe condition that usually requires medical intervention (Falana & Carrington, 2019). Mainly, ppd is a common non-psychotic mental disorder during the first year after childbirth that can lead to severe complications in the women's health (Abadiga, 2019). Current data indicates that between 10 % to 15 % of mothers worldwide are affected with ppd yearly (Fatima et al., 2019; Liu et al., 2023). Moreover, only 20% of the target population is diagnosed or even treated promptly (Mazumder & Baruah, 2021).
Topological Social Choice: Designing a Noise-Robust Polar Distance for Persistence Diagrams
Andrikopoulos, Athanasios, Sampanis, Nikolaos
Topological Data Analysis (TDA) has emerged as a powerful framework for extracting robust and interpretable features from noisy high-dimensional data. In the context of Social Choice Theory, where preference profiles and collective decisions are geometrically rich yet sensitive to perturbations, TDA remains largely unexplored. This work introduces a novel conceptual bridge between these domains by proposing a new metric framework for persistence diagrams tailored to noisy preference data.We define a polar coordinate-based distance that captures both the magnitude and orientation of topological features in a smooth and differentiable manner. Our metric addresses key limitations of classical distances, such as bottleneck and Wasserstein, including instability under perturbation, lack of continuity, and incompatibility with gradient-based learning. The resulting formulation offers improved behavior in both theoretical and applied settings.To the best of our knowledge, this is the first study to systematically apply persistent homology to social choice systems, providing a mathematically grounded method for comparing topological summaries of voting structures and preference dynamics. We demonstrate the superiority of our approach through extensive experiments, including robustness tests and supervised learning tasks, and we propose a modular pipeline for building predictive models from online preference data. This work contributes a conceptually novel and computationally effective tool to the emerging interface of topology and decision theory, opening new directions in interpretable machine learning for political and economic systems.
Evasion Attacks Against Bayesian Predictive Models
Arce, Pablo G., Naveiro, Roi, Insua, David Rรญos
There is an increasing interest in analyzing the behavior of machine learning systems against adversarial attacks. However, most of the research in adversarial machine learning has focused on studying weaknesses against evasion or poisoning attacks to predictive models in classical setups, with the susceptibility of Bayesian predictive models to attacks remaining underexplored. This paper introduces a general methodology for designing optimal evasion attacks against such models. We investigate two adversarial objectives: perturbing specific point predictions and altering the entire posterior predictive distribution. For both scenarios, we propose novel gradient-based attacks and study their implementation and properties in various computational setups.
Uncertainty Quantification for Prior-Data Fitted Networks using Martingale Posteriors
Nagler, Thomas, Rรผgamer, David
Prior-data fitted networks (PFNs) have emerged as promising foundation models for prediction from tabular data sets, achieving state-of-the-art performance on small to moderate data sizes without tuning. While PFNs are motivated by Bayesian ideas, they do not provide any uncertainty quantification for predictive means, quantiles, or similar quantities. We propose a principled and efficient sampling procedure to construct Bayesian posteriors for such estimates based on Martingale posteriors, and prove its convergence. Several simulated and real-world data examples showcase the uncertainty quantification of our method in inference applications.
Temperature Optimization for Bayesian Deep Learning
Ng, Kenyon, van der Heide, Chris, Hodgkinson, Liam, Wei, Susan
The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where tempering the posterior to a cold temperature often improves the predictive performance of the posterior predictive distribution (PPD). Although the term `CPE' suggests colder temperatures are inherently better, the BDL community increasingly recognizes that this is not always the case. Despite this, there remains no systematic method for finding the optimal temperature beyond grid search. In this work, we propose a data-driven approach to select the temperature that maximizes test log-predictive density, treating the temperature as a model parameter and estimating it directly from the data. We empirically demonstrate that our method performs comparably to grid search, at a fraction of the cost, across both regression and classification tasks. Finally, we highlight the differing perspectives on CPE between the BDL and Generalized Bayes communities: while the former primarily focuses on predictive performance of the PPD, the latter emphasizes calibrated uncertainty and robustness to model misspecification; these distinct objectives lead to different temperature preferences.