Overfitting and Generalizing with (PAC) Bayesian Prediction in Noisy Binary Classification
Zhu, Xiaohan, Ohannessian, Mesrob I., Srebro, Nathan
We consider a PAC-Bayes type learning rule for binary classification, balancing the training error of a randomized "posterior" predictor with its KL divergence to a pre-specified "prior". This can be seen as an extension of a modified two-part-code Minimum Description Length (MDL) learning rule to continuous priors and randomized predictions. With a balancing parameter of $λ=1$, this learning rule recovers an (empirical) Bayes posterior, and a modified variant recovers the profile posterior, linking with standard Bayesian prediction (up to the treatment of the single-parameter noise level). However, from a risk-minimization prediction perspective, this Bayesian predictor overfits and can lead to non-vanishing excess loss in the agnostic case. Instead, a choice of $λ\gg 1$, which can be seen as using a sample-size-dependent prior, ensures uniformly vanishing excess loss even in the agnostic case. We precisely characterize the effect of under-regularizing (and over-regularizing) as a function of the balance parameter $λ$, understanding the regimes in which this under-regularization is tempered or catastrophic. This work extends previous work by Zhu and Srebro [2025], which considered only discrete priors, to PAC-Bayes type learning rules and, through their rigorous Bayesian interpretation, to Bayesian prediction more generally.
Multi-Domain Empirical Bayes for Linearly-Mixed Causal Representations
Wu, Bohan, von Kügelgen, Julius, Blei, David M.
Causal representation learning (CRL) aims to learn low-dimensional causal latent variables from high-dimensional observations. While identifiability has been extensively studied for CRL, estimation has been less explored. In this paper, we explore the use of empirical Bayes (EB) to estimate causal representations. In particular, we consider the problem of learning from data from multiple domains, where differences between domains are modeled by interventions in a shared underlying causal model. Multi-domain CRL naturally poses a simultaneous inference problem that EB is designed to tackle. Here, we propose an EB $f$-modeling algorithm that improves the quality of learned causal variables by exploiting invariant structure within and across domains. Specifically, we consider a linear measurement model and interventional priors arising from a shared acyclic SCM. When the graph and intervention targets are known, we develop an EM-style algorithm based on causally structured score matching. We further discuss EB $g$-modeling in the context of existing CRL approaches. In experiments on synthetic data, our proposed method achieves more accurate estimation than other methods for CRL.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Switzerland (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Asia > Singapore (0.04)
Adversarial Style Mining for One-Shot Unsupervised Domain Adaptation
The introduction of Domain Adaptation (DA) techniques aims to mitigate such performance drop when a trained agent encounters a different environment. By bridging the distribution gap between source and target domains, DA methods have shown their effect in many cross-domain tasks such as classification [27, 18], segmentation [19, 22, 23] and detection [3].
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Singapore (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- Europe > United Kingdom (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
A Hybrid Autoencoder-Transformer Model for Robust Day-Ahead Electricity Price Forecasting under Extreme Conditions
Tang, Boyan, Ren, Xuanhao, Xiao, Peng, Lei, Shunbo, Sun, Xiaorong, Wu, Jianghua
Abstract: Accurate day-ahead electricity price forecasting (DAEPF) is critical for the efficient operation of power systems, but extreme conditions and market anomalies pose significant challenges to existing forecasting methods. To overcome these challenges, this paper proposes a novel hybrid deep learning framework that integrates a Distilled Attention Transformer (DAT) model and an Autoencoder Self-regression Model (ASM). The DAT leverages a self-attention mechanism to dynamically assign higher weights to critical segments of historical data, effectively capturing both long-term trends and short-term fluctuations. Concurrently, the ASM employs unsupervised learning to detect and isolate anomalous patterns induced by extreme conditions, such as heavy rain, heat waves, or human festivals. Experiments on datasets sampled from California and Shandong Province demonstrate that our framework significantly outperforms state-of-the-art methods in prediction accuracy, robustness, and computational efficiency. Our framework thus holds promise for enhancing grid resilience and optimizing market operations in future power systems. Day-ahead electricity price forecasting (DAEPF) is vital to modern power system operations, providing important information for generators, market operators, and consumers.
- Asia > China > Shandong Province (0.34)
- North America > United States > California (0.25)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)