FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allocation
Wang, Tiantian, Xiang, Xiang, Du, Simon S.
In federated healthcare systems, Federated Class-Incremental Learning (FCIL) has emerged as a key paradigm, enabling continuous adaptive model learning among distributed clients while safeguarding data privacy. However, in practical applications, data across agent nodes within the distributed framework often exhibits non-independent and identically distributed (non-IID) characteristics, rendering traditional continual learning methods inapplicable. To address these challenges, this paper covers more comprehensive incremental task scenarios and proposes a dynamic memory allocation strategy for exemplar storage based on the data replay mechanism. This strategy fully taps into the inherent potential of data heterogeneity, while taking into account the performance fairness of all participating clients, thereby establishing a balanced and adaptive solution to mitigate catastrophic forgetting. Unlike the fixed allocation of client exemplar memory, the proposed scheme emphasizes the rational allocation of limited storage resources among clients to improve model performance. Furthermore, extensive experiments are conducted on three medical image datasets, and the results demonstrate significant performance improvements compared to existing baseline models.
A Perturbation Approach to Unconstrained Linear Bandits
Jacobsen, Andrew, Baudry, Dorian, Ito, Shinji, Cesa-Bianchi, Nicolò
We revisit the standard perturbation-based approach of Abernethy et al. (2008) in the context of unconstrained Bandit Linear Optimization (uBLO). We show the surprising result that in the unconstrained setting, this approach effectively reduces Bandit Linear Optimization (BLO) to a standard Online Linear Optimization (OLO) problem. Our framework improves on prior work in several ways. First, we derive expected-regret guarantees when our perturbation scheme is combined with comparator-adaptive OLO algorithms, leading to new insights about the impact of different adversarial models on the resulting comparator-adaptive rates. We also extend our analysis to dynamic regret, obtaining the optimal $\sqrt{P_T}$ path-length dependencies without prior knowledge of $P_T$. We then develop the first high-probability guarantees for both static and dynamic regret in uBLO. Finally, we discuss lower bounds on the static regret, and prove the folklore $\Omega(\sqrt{dT})$ rate for adversarial linear bandits on the unit Euclidean ball, which is of independent interest.
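To illustrate why a perturbation can turn bandit feedback into a usable gradient for an OLO algorithm, here is a minimal sketch of a one-point estimator. It assumes the perturbation is drawn uniformly from the signed coordinate directions $\pm e_i$, which is a simplification chosen for this example and not necessarily the scheme analyzed in the paper; the function names are hypothetical. Only the scalar loss at the perturbed point is observed, yet the estimate averages exactly to the hidden loss vector $g$.

```python
def one_point_estimate(g, w, i, sign):
    """Gradient estimate from a single bandit observation.

    g    : hidden loss vector (used only to simulate the scalar feedback)
    w    : point proposed by the OLO algorithm
    i    : index of the sampled coordinate direction (assumed scheme)
    sign : +1.0 or -1.0
    """
    d = len(g)
    u = [0.0] * d
    u[i] = sign
    played = [wj + uj for wj, uj in zip(w, u)]
    # The learner only sees this scalar, never g itself.
    loss = sum(gj * pj for gj, pj in zip(g, played))
    return [d * loss * uj for uj in u]


def expected_estimate(g, w):
    """Exact expectation over the 2d equally likely perturbations."""
    d = len(g)
    total = [0.0] * d
    for i in range(d):
        for sign in (1.0, -1.0):
            est = one_point_estimate(g, w, i, sign)
            total = [t + e for t, e in zip(total, est)]
    return [t / (2 * d) for t in total]
```

The term involving $\langle g, w \rangle$ cancels between the two signs, so the estimate is unbiased regardless of where $w$ sits; its magnitude, however, grows with $|\langle g, w \rangle|$, which is one reason the unconstrained setting needs care.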
Spectral-Aware Text-to-Time Series Generation with Billion-Scale Multimodal Meteorological Data
Text-to-time-series generation is particularly important in meteorology, where natural language offers intuitive control over complex, multi-scale atmospheric dynamics. Existing approaches are constrained by the lack of large-scale, physically grounded multimodal datasets and by architectures that overlook the spectral-temporal structure of weather signals. We address these challenges with a unified framework for text-guided meteorological time-series generation. First, we introduce MeteoCap-3B, a billion-scale weather dataset paired with expert-level captions constructed via a Multi-agent Collaborative Captioning (MACC) pipeline, yielding information-dense and physically consistent annotations. Building on this dataset, we propose MTransformer, a diffusion-based model that enables precise semantic control by mapping textual descriptions into multi-band spectral priors through a Spectral Prompt Generator, which guides generation via frequency-aware attention. Extensive experiments on real-world benchmarks demonstrate state-of-the-art generation quality, accurate cross-modal alignment, strong semantic controllability, and substantial gains in downstream forecasting under data-sparse and zero-shot settings. Additional results on general time-series benchmarks indicate that the proposed framework generalizes beyond meteorology.
Quantification of Credal Uncertainty: A Distance-Based Approach
Gonzalez-Garcia, Xabier, Chau, Siu Lun, Rodemann, Julian, Caprio, Michele, Muandet, Krikamol, Bustince, Humberto, Destercke, Sébastien, Hüllermeier, Eyke, Sale, Yusuf
Credal sets, i.e., closed convex sets of probability measures, provide a natural framework to represent aleatoric and epistemic uncertainty in machine learning. Yet how to quantify these two types of uncertainty for a given credal set, particularly in multiclass classification, remains underexplored. In this paper, we propose a distance-based approach to quantify total, aleatoric, and epistemic uncertainty for credal sets. Concretely, we introduce a family of such measures within the framework of Integral Probability Metrics (IPMs). The resulting quantities admit clear semantic interpretations, satisfy natural theoretical desiderata, and remain computationally tractable for common choices of IPMs. We instantiate the framework with the total variation distance and obtain simple, efficient uncertainty measures for multiclass classification. In the binary case, this choice recovers established uncertainty measures, for which a principled multiclass generalization has so far been missing. Empirical results confirm practical usefulness, with favorable performance at low computational cost.
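As a rough illustration of a distance-based quantity on a credal set, the sketch below computes the total variation distance between categorical distributions and the largest such distance within a credal set. It assumes the credal set is the convex hull of finitely many vertices, and it uses the TV diameter as an illustrative stand-in; the abstract does not spell out the paper's measures, so this should not be read as their definition. The function names are hypothetical.

```python
from itertools import combinations


def tv_distance(p, q):
    """Total variation distance between two categorical distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def credal_diameter(vertices):
    """Largest TV distance between members of a finitely generated credal set.

    TV is induced by a norm, so it is convex in each argument and its
    maximum over the convex hull is attained at a pair of vertices.
    """
    return max(
        (tv_distance(p, q) for p, q in combinations(vertices, 2)),
        default=0.0,
    )
```

In the binary case the diameter reduces to the width of the probability interval for the positive class (upper minus lower probability), which matches the kind of interval-width measure commonly used for binary credal predictions.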
Vertical Consensus Inference for High-Dimensional Random Partition
Nguyen, Khai, Ni, Yang, Mueller, Peter
We review recently proposed Bayesian approaches for clustering high-dimensional data. After identifying the main limitations of available approaches, we introduce an alternative framework based on vertical consensus inference (VCI) to mitigate the curse of dimensionality in high-dimensional Bayesian clustering. VCI builds on the idea of consensus Monte Carlo by dividing the data into multiple shards (smaller subsets of variables), performing posterior inference on each shard, and then combining the shard-level posteriors to obtain a consensus posterior. The key distinction is that VCI splits the data vertically, producing vertical shards that retain the same number of observations but have lower dimensionality. We use an entropic regularized Wasserstein barycenter to define a consensus posterior. The shard-specific barycenter weights are constructed to favor shards that provide meaningful partitions, distinct from a trivial single cluster or all singleton clusters, favoring balanced cluster sizes and precise shard-specific posterior random partitions. We show that VCI can be interpreted as a variational approximation to the posterior under a hierarchical model with a generalized Bayes prior. For relatively low-dimensional problems, experiments suggest that VCI closely approximates inference based on clustering the entire multivariate data. For high-dimensional data and in the presence of many noninformative dimensions, VCI introduces a new framework for model-based and principled inference on random partitions. Although our focus here is on random partitions, VCI can be applied to any dimension-independent parameters and serves as a bridge to emerging areas in statistics such as consensus Monte Carlo, optimal transport, variational inference, and generalized Bayes.
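The vertical split described above can be sketched as follows. This is a minimal illustration that assigns columns to shards round-robin; the assignment rule and the function name are assumptions made for the example, and the shard-level inference and barycenter steps are omitted.

```python
def vertical_shards(data, num_shards):
    """Split a row-major data matrix into column-wise (vertical) shards.

    Every shard keeps all observations (rows) but only a subset of the
    variables (columns). Columns are assigned round-robin here, which is
    an assumption for illustration only.
    """
    n_cols = len(data[0])
    shards = []
    for s in range(num_shards):
        cols = range(s, n_cols, num_shards)
        shards.append([[row[j] for j in cols] for row in data])
    return shards
```

Each shard can then be clustered independently, since all shards index the same observations and therefore describe partitions of the same set.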
Mixture-Model Preference Learning for Many-Objective Bayesian Optimization
Dubey, Manisha, De Peuter, Sebastiaan, Wang, Wanrong, Kaski, Samuel
Preference-based many-objective optimization faces two obstacles: an expanding space of trade-offs and heterogeneous, context-dependent human value structures. To address them, we propose a Bayesian framework that learns a small set of latent preference archetypes rather than assuming a single fixed utility function, modelling them as components of a Dirichlet-process mixture with uncertainty over both archetypes and their weights. To query efficiently, we design hybrid queries that target information about (i) mode identity and (ii) within-mode trade-offs. Under mild assumptions, we provide a simple regret guarantee for the resulting mixture-aware Bayesian optimization procedure. Empirically, our method outperforms standard baselines on synthetic and real-world many-objective benchmarks, and mixture-aware diagnostics reveal structure that regret alone fails to capture.
Diagnosing Non-Markovian Observations in Reinforcement Learning via Prediction-Based Violation Scoring
Reinforcement learning algorithms assume that observations satisfy the Markov property, yet real-world sensors frequently violate this assumption through correlated noise, latency, or partial observability. Standard performance metrics conflate Markov breakdowns with other sources of suboptimality, leaving practitioners without diagnostic tools for such violations. This paper introduces a prediction-based scoring method that quantifies non-Markovian structure in observation trajectories. A random forest first removes nonlinear Markov-compliant dynamics; ridge regression then tests whether historical observations reduce prediction error on the residuals beyond what the current observation provides. The resulting score is bounded in [0, 1] and requires no causal graph construction. Evaluation spans six environments (CartPole, Pendulum, Acrobot, HalfCheetah, Hopper, Walker2d), three algorithms (PPO, A2C, SAC), controlled AR(1) noise at six intensity levels, and 10 seeds per condition. In post-hoc detection, 7 of 16 environment-algorithm pairs, primarily high-dimensional locomotion tasks, show significant positive monotonicity between noise intensity and the violation score (Spearman rho up to 0.78, confirmed under repeated-measures analysis); under training-time noise, 13 of 16 pairs exhibit statistically significant reward degradation. An inversion phenomenon is documented in low-dimensional environments where the random forest absorbs the noise signal, causing the score to decrease as true violations grow, a failure mode analyzed in detail. A practical utility experiment demonstrates that the proposed score correctly identifies partial observability and guides architecture selection, fully recovering performance lost to non-Markovian observations. Source code to reproduce all results is provided at https://github.com/NAVEENMN/Markovianes.
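To make the controlled AR(1) noise concrete, the sketch below generates such noise and measures its lag-1 autocorrelation. Adding temporally correlated noise to an observation means the previous observation carries information about the current noise term, which is what breaks the Markov property. The parameter names and the zero initial state are assumptions for this example; the scoring pipeline itself (random forest plus ridge regression) is not reproduced here.

```python
import random


def ar1_noise(n, phi, sigma, seed=0):
    """Generate n steps of AR(1) noise: e_t = phi * e_{t-1} + N(0, sigma^2)."""
    rng = random.Random(seed)
    noise, prev = [], 0.0  # starting from 0.0 is an assumption of this sketch
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma)
        noise.append(prev)
    return noise


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den
```

With `phi = 0` the noise is white and history adds nothing; as `phi` approaches 1 the lag-1 autocorrelation approaches `phi`, so past observations become increasingly predictive of the current one.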
Machine Learning-Assisted High-Dimensional Matrix Estimation
Tian, Wan, Yang, Hui, Lian, Zhouhui, Zhang, Lingyue, Peng, Yijie
Efficient estimation of high-dimensional matrices--including covariance and precision matrices--is a cornerstone of modern multivariate statistics. Most existing studies have focused primarily on the theoretical properties of the estimators (e.g., consistency and sparsity), while largely overlooking the computational challenges inherent in high-dimensional settings. Theoretically, we first prove the convergence of LADMM, and then establish the convergence, convergence rate, and monotonicity of its reparameterized counterpart; importantly, we show that the reparameterized LADMM enjoys a faster convergence rate. Notably, the proposed reparameterization theory and methodology are applicable to the estimation of both high-dimensional covariance and precision matrices.
Keywords: ADMM; High-dimensional; Learning-based optimization; Matrix estimation.
1. Introduction
High-dimensional matrix estimation--covering both covariance and precision matrix estimation--constitutes a cornerstone of modern statistics and data science [1, 2, 3]. Accurate covariance estimation enables the characterization of dependence structures among a large number of variables [4, 5, 6], which is indispensable in diverse domains such as genomics [7, 8], neuroscience [9], finance [10, 11, 12], and climate science [13, 14]. Over the past two decades, substantial progress has been made in the statistical theory of high-dimensional matrix estimation, particularly with respect to the accuracy of estimators, including properties such as sparsistency and consistency [5, 15, 16]. However, in empirical studies, the dimensionality is often only on the order of tens to hundreds, and in many cases is comparable to the sample size [21, 22, 23, 24]. This observation highlights a notable gap between the statistical theory of estimators and the practical challenges of their computational implementation.
A Comparative Investigation of Thermodynamic Structure-Informed Neural Networks
Physics-informed neural networks (PINNs) offer a unified framework for solving both forward and inverse problems of differential equations, yet their performance and physical consistency strongly depend on how governing laws are incorporated. In this work, we present a systematic comparison of different thermodynamic structure-informed neural networks by incorporating various thermodynamic formulations, including Newtonian, Lagrangian, and Hamiltonian mechanics for conservative systems, as well as the Onsager variational principle and extended irreversible thermodynamics for dissipative systems. Through comprehensive numerical experiments on representative ordinary and partial differential equations, we quantitatively evaluate the impact of these formulations on accuracy, physical consistency, noise robustness, and interpretability. The results show that Newtonian-residual-based PINNs can reconstruct system states but fail to reliably recover key physical and thermodynamic quantities, whereas structure-preserving formulations significantly enhance parameter identification, thermodynamic consistency, and robustness. These findings provide practical guidance for the principled design of thermodynamically consistent models, and lay the groundwork for integrating more general nonequilibrium thermodynamic structures into physics-informed machine learning.
Energy Score-Guided Neural Gaussian Mixture Model for Predictive Uncertainty Quantification
Yang, Yang, Ji, Chunlin, Li, Haoyang, Deng, Ke
Quantifying predictive uncertainty is essential for real-world machine learning applications, especially in scenarios requiring reliable and interpretable predictions. Many common parametric approaches rely on neural networks to estimate distribution parameters by optimizing the negative log-likelihood. However, these methods often encounter challenges like training instability and mode collapse, leading to poor estimates of the mean and variance of the target output distribution. In this work, we propose the Neural Energy Gaussian Mixture Model (NE-GMM), a novel framework that integrates the Gaussian Mixture Model (GMM) with the Energy Score (ES) to enhance predictive uncertainty quantification. NE-GMM leverages the flexibility of the GMM to capture complex multimodal distributions and the robustness of the ES to ensure well-calibrated predictions in diverse scenarios. We theoretically prove that the hybrid loss function satisfies the properties of a strictly proper scoring rule, ensuring alignment with the true data distribution, and establish generalization error bounds, demonstrating that the model's empirical performance closely aligns with its expected performance on unseen data. Extensive experiments on both synthetic and real-world datasets demonstrate the superiority of NE-GMM in terms of both predictive accuracy and uncertainty quantification.
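For readers unfamiliar with the Energy Score, the sketch below estimates it from samples of a predictive distribution using the standard form $\mathrm{ES}(P, y) = \mathbb{E}\|X - y\| - \tfrac{1}{2}\mathbb{E}\|X - X'\|$, where lower is better. The one-dimensional GMM sampler and all function names are assumptions made for this example; how NE-GMM combines the score with the likelihood in its hybrid loss is not specified in the abstract and is not shown here.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def energy_score(samples, y):
    """Sample-based Energy Score estimate (lower is better).

    samples : list of vectors drawn from the predictive distribution
    y       : observed outcome vector
    """
    m = len(samples)
    term_obs = sum(euclid(x, y) for x in samples) / m
    # Average over all ordered pairs, including identical ones.
    term_spread = sum(euclid(a, b) for a in samples for b in samples) / (m * m)
    return term_obs - 0.5 * term_spread


def sample_gmm_1d(weights, means, stds, m, seed=0):
    """Draw m one-dimensional samples from a Gaussian mixture (illustrative)."""
    rng = random.Random(seed)
    components = rng.choices(range(len(weights)), weights=weights, k=m)
    return [[rng.gauss(means[k], stds[k])] for k in components]
```

A typical use would be `energy_score(sample_gmm_1d([0.5, 0.5], [-1.0, 1.0], [0.3, 0.3], 200), [0.9])`, which scores a bimodal predictive distribution against a single observation.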