 Learning Graphical Models


Learning Energy-Based Prior Model with Diffusion-Amortized MCMC Peiyu Yu

Neural Information Processing Systems

Latent space Energy-Based Models (EBMs), also known as energy-based priors, have drawn growing interest in generative modeling due to their flexible formulation and the strong modeling power of the latent space. However, the common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders further progress; the degenerate MCMC sampling quality in practice often leads to degraded generation quality and unstable training, especially with highly multi-modal and/or high-dimensional target distributions. To remedy this sampling issue, in this paper we introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it. We provide theoretical evidence that the learned amortization of MCMC is a valid long-run MCMC sampler.
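
The contrast between short-run and long-run MCMC drawn in this abstract can be illustrated with a minimal sketch. It assumes a one-dimensional toy target (a standard normal, energy U(z) = z^2 / 2) and plain unadjusted Langevin dynamics; the function names, step size, and chain lengths are illustrative choices, and the paper's diffusion-based amortization is not shown here.

```python
import random

def langevin_chain(grad_energy, z0, n_steps, step_size, rng):
    """Unadjusted Langevin dynamics targeting p(z) proportional to exp(-U(z))."""
    z = z0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Gradient step on the energy plus injected Gaussian noise.
        z = z - 0.5 * step_size ** 2 * grad_energy(z) + step_size * noise
    return z

def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def compare_run_lengths(n_chains=2000, step_size=0.3, seed=0):
    """Return (variance of 5-step chains, variance of 200-step chains)."""
    rng = random.Random(seed)
    grad = lambda z: z  # toy assumption: U(z) = z^2 / 2, so grad U(z) = z
    short = [langevin_chain(grad, 0.0, 5, step_size, rng) for _ in range(n_chains)]
    long_ = [langevin_chain(grad, 0.0, 200, step_size, rng) for _ in range(n_chains)]
    return sample_variance(short), sample_variance(long_)
```

All chains start at z = 0, so the 5-step samples stay visibly under-dispersed relative to the unit-variance target, while the 200-step samples have a variance close to 1. This is the kind of non-convergence that short-run sampling introduces even on an easy target.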






DPMM-CFL: Clustered Federated Learning via Dirichlet Process Mixture Model Nonparametric Clustering

arXiv.org Machine Learning

Clustered Federated Learning (CFL) improves performance under non-IID client heterogeneity by clustering clients and training one model per cluster, thereby balancing between a global model and fully personalized models. However, most CFL methods require the number of clusters K to be fixed a priori, which is impractical when the latent structure is unknown. We propose DPMM-CFL, a CFL algorithm that places a Dirichlet Process (DP) prior over the distribution of cluster parameters. This enables nonparametric Bayesian inference to jointly infer both the number of clusters and client assignments, while optimizing per-cluster federated objectives. The resulting method couples federated updates and cluster inference at each round. The algorithm is validated on benchmark datasets under Dirichlet and class-split non-IID partitions.
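
To see how a DP prior leaves the number of clusters open, the following minimal sketch draws client-to-cluster assignments from the Chinese restaurant process, the partition distribution induced by a Dirichlet process. It covers only the prior: the abstract does not describe the inference procedure used in DPMM-CFL, and the function name and the concentration parameter `alpha` are illustrative.

```python
import random

def crp_assignments(n_clients, alpha, rng):
    """Sample cluster labels for n_clients from a Chinese restaurant process.

    Client i (0-indexed) joins existing cluster k with probability
    counts[k] / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha).
    """
    labels = []
    counts = []  # counts[k] = number of clients currently in cluster k
    for i in range(n_clients):
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        labels.append(chosen)
    return labels, counts
```

For example, `crp_assignments(50, 1.0, random.Random(0))` returns one label per client and the cluster sizes. Larger values of `alpha` favor more clusters, and the number of clusters is never specified in advance.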


jmstate, a Flexible Python Package for Multi-State Joint Modeling

arXiv.org Machine Learning

Classical joint modeling approaches often rely on competing risks or recurrent event formulations to account for complex real-world processes involving evolving longitudinal markers and discrete event occurrences. However, these frameworks typically capture only limited aspects of the underlying event dynamics. Multi-state joint models offer a more flexible alternative by representing full event histories through a network of possible transitions, including recurrent cycles and terminal absorptions, all potentially influenced by longitudinal covariates. In this paper, we propose a general framework that unifies longitudinal biomarker modeling with multi-state event processes defined on arbitrary directed graphs. Our approach accommodates both Markovian and semi-Markovian transition structures, and extends classical joint models by coupling nonlinear mixed-effects longitudinal submodels with multi-state survival processes via shared latent structures. We derive the full likelihood and develop scalable inference procedures based on stochastic gradient descent. Furthermore, we introduce a dynamic prediction framework, enabling individualized risk assessments along complex state-transition trajectories. To facilitate reproducibility and dissemination, we provide an open-source Python library, jmstate, implementing the proposed methodology, available on PyPI (https://pypi.org/project/jmstate/). Simulation experiments and a biomedical case study demonstrate the flexibility and performance of the framework in representing complex longitudinal and multi-state event dynamics. The full Python notebooks used to reproduce the experiments as well as the source code of this paper are available on GitLab (https://gitlab.com/felixlaplante0/jmstate-paper/).


Bayesian Nonparametric Dynamical Clustering of Time Series

arXiv.org Machine Learning

Abstract: We present a method that models the evolution of an unbounded number of time series clusters by switching among an unknown number of regimes with linear dynamics. We develop a Bayesian non-parametric approach using a hierarchical Dirichlet process as a prior on the parameters of a Switching Linear Dynamical System and a Gaussian process prior to model the statistical variations in amplitude and temporal alignment within each cluster. By modeling the evolution of time series patterns, the method avoids unnecessary proliferation of clusters in a principled manner. We perform inference by formulating a variational lower bound for off-line and on-line scenarios, enabling efficient learning through optimization. We illustrate the versatility and effectiveness of the approach through several case studies of electrocardiogram analysis using publicly available databases. Index Terms: Time series analysis, Bayesian methods, Gaussian processes, linear dynamical systems, Dirichlet processes, unsupervised learning, electrocardiogram, arrhythmia detection. Time series data analysis has come to pervade all scientific and technological domains, driven by the need to understand change over time. With the growing availability of such data, machine learning has assumed an increasingly central role in a wide variety of tasks which fall under the category of pattern recognition. Particularly, there is growing interest in identifying similar behaviors in time series data as a preliminary step towards generating insights into the dynamics of the underlying processes. Some recent methodologies can be found for characterizing sea wave conditions [1], transcriptome-wide gene expression profiling [2], selecting stocks with different share price performance [3], and discovering human motion primitives [4].


Q-Learning with Fine-Grained Gap-Dependent Regret

arXiv.org Machine Learning

We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing fine-grained gap-dependent regret bounds for both UCB-based and non-UCB-based algorithms. In the UCB-based setting, we develop a novel analytical framework that explicitly separates the analysis of optimal and suboptimal state-action pairs, yielding the first fine-grained regret upper bound for UCB-Hoeffding (Jin et al., 2018). To highlight the generality of this framework, we introduce ULCB-Hoeffding, a new UCB-based algorithm inspired by AMB (Xu et al., 2021) but with a simplified structure, which enjoys fine-grained regret guarantees and empirically outperforms AMB. In the non-UCB-based setting, we revisit AMB, the only known algorithm, and identify two key issues in its design and analysis: improper truncation in the $Q$-updates and violation of the martingale difference condition in its concentration argument. We propose a refined version of AMB that addresses these issues, establishing the first rigorous fine-grained gap-dependent regret bound for a non-UCB-based method, with experiments demonstrating improved performance over AMB.
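
For readers unfamiliar with the UCB-Hoeffding baseline named above, here is a minimal sketch of tabular Q-learning with a Hoeffding-style exploration bonus, following the commonly stated form of the update in Jin et al. (2018): learning rate (H + 1) / (H + t) and bonus c * sqrt(H^3 * iota / t). The constant `c`, the confidence level `delta`, the environment interface `env_step`, and the toy environment are assumptions made for illustration. The ULCB-Hoeffding and refined AMB algorithms proposed in the paper are not reproduced.

```python
import math
import random

def ucb_hoeffding(env_step, n_states, n_actions, H, K, c=1.0, delta=0.1, seed=0):
    """Optimistic Q-learning for an episodic tabular MDP with rewards in [0, 1].

    env_step(h, s, a, rng) -> (reward, next_state) is a hypothetical interface.
    Every episode starts in state 0.
    """
    rng = random.Random(seed)
    iota = math.log(n_states * n_actions * H * K / delta)
    Q = [[[float(H)] * n_actions for _ in range(n_states)] for _ in range(H)]
    # V has one extra layer of zeros for the step after the horizon.
    V = [[float(H)] * n_states for _ in range(H)] + [[0.0] * n_states]
    N = [[[0] * n_actions for _ in range(n_states)] for _ in range(H)]
    for _ in range(K):
        s = 0
        for h in range(H):
            a = max(range(n_actions), key=lambda b: Q[h][s][b])  # greedy in Q
            r, s_next = env_step(h, s, a, rng)
            N[h][s][a] += 1
            t = N[h][s][a]
            alpha = (H + 1) / (H + t)               # step-dependent learning rate
            bonus = c * math.sqrt(H ** 3 * iota / t)  # Hoeffding-style bonus
            Q[h][s][a] = (1 - alpha) * Q[h][s][a] + alpha * (r + V[h + 1][s_next] + bonus)
            V[h][s] = min(float(H), max(Q[h][s]))   # truncate the value at H
            s = s_next
    return Q, V, N

def toy_step(h, s, a, rng):
    """Hypothetical single-state environment: action 1 pays 1, action 0 pays 0."""
    return (1.0 if a == 1 else 0.0), 0
```

Calling `ucb_hoeffding(toy_step, 1, 2, H=1, K=2000)` returns the Q-table, the truncated values, and the visit counts. The counts show how the bonus drives early exploration of the zero-reward action before play concentrates on the rewarding one.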


Scalable Policy-Based RL Algorithms for POMDPs

arXiv.org Machine Learning

The continuous nature of belief states in POMDPs presents significant computational challenges in learning the optimal policy. In this paper, we consider an approach that solves a Partially Observable Reinforcement Learning (PORL) problem by approximating the corresponding POMDP with a finite-state Markov Decision Process (MDP), called the Superstate MDP. We first derive theoretical guarantees, improving upon prior work, that relate the optimal value function of the transformed Superstate MDP to the optimal value function of the original POMDP. Next, we propose a policy-based learning approach with linear function approximation to learn the optimal policy for the Superstate MDP. Consequently, our approach shows that a POMDP can be approximately solved using TD-learning followed by Policy Optimization by treating it as an MDP, where the MDP state corresponds to a finite history. We show that the approximation error decreases exponentially with the length of this history. To the best of our knowledge, our finite-time bounds are the first to explicitly quantify the error introduced when applying standard TD learning to a setting where the true dynamics are not Markovian.
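
The idea of treating a finite history as the MDP state can be sketched with tabular TD(0), which is linear function approximation with one-hot features, run on windows of the last `l` observations. The toy process below (a deterministic four-state hidden cycle with aliased observations), the omission of actions from the window, and all names are assumptions made for illustration. This is a policy-evaluation toy, not the paper's algorithm.

```python
from collections import defaultdict, deque

def toy_trajectory(n_steps):
    """Hidden state cycles 0 -> 1 -> 2 -> 3 -> 0; observations alias pairs of states."""
    obs_of = [0, 0, 1, 1]  # states 0,1 emit 0; states 2,3 emit 1
    obs, rewards = [], []
    for t in range(n_steps):
        hidden = t % 4
        obs.append(obs_of[hidden])
        rewards.append(1.0 if hidden == 3 else 0.0)  # reward only in state 3
    return obs, rewards

def td0_on_windows(obs, rewards, l, gamma=0.9, alpha=0.5):
    """TD(0) where the 'state' is the tuple of the last l observations."""
    V = defaultdict(float)
    window = deque(maxlen=l)
    window.append(obs[0])
    for t in range(len(obs) - 1):
        s = tuple(window)
        window.append(obs[t + 1])  # the oldest observation drops out once full
        s_next = tuple(window)
        V[s] += alpha * (rewards[t] + gamma * V[s_next] - V[s])
    return dict(V)
```

With `l = 1` the two observations cannot distinguish the four hidden states, so TD learns only two blended values. With `l = 2` each full window identifies the hidden state in this toy, and the value of window `(1, 1)` converges to the true discounted value 1 / (1 - gamma^4) of hidden state 3.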