Bayesian Inference
Smooth Flow Matching
Functional data, i.e., smooth random functions observed over a continuous domain, are increasingly available in areas such as biomedical research, health informatics, and epidemiology. However, effective statistical analysis for functional data is often hindered by challenges such as privacy constraints, sparse and irregular sampling, infinite dimensionality, and non-Gaussian structures. To address these challenges, we introduce a novel framework named Smooth Flow Matching (SFM), tailored for generative modeling of functional data to enable statistical analysis without exposing sensitive real data. Built upon flow-matching ideas, SFM constructs a semiparametric copula flow to generate infinite-dimensional functional data, free from Gaussianity or low-rank assumptions. It is computationally efficient, handles irregular observations, and guarantees the smoothness of the generated functions, offering a practical and flexible solution in scenarios where existing deep generative methods are not applicable. Through extensive simulation studies, we demonstrate the advantages of SFM in terms of both synthetic data quality and computational efficiency. We then apply SFM to generate clinical trajectory data from the MIMIC-IV patient electronic health records (EHR) longitudinal database. Our analysis showcases the ability of SFM to produce high-quality surrogate data for downstream statistical tasks, highlighting its potential to boost the utility of EHR data for clinical applications.
A PC Algorithm for Max-Linear Bayesian Networks
Améndola, Carlos, Hollering, Benjamin, Nowell, Francesco
Max-linear Bayesian networks (MLBNs) are a relatively recent class of structural equation models which arise when the random variables involved have heavy-tailed distributions. Unlike most directed graphical models, MLBNs are typically not faithful to d-separation, and thus classical causal discovery algorithms such as the PC algorithm or greedy equivalence search cannot be used to accurately recover the true graph structure. In this paper, we begin the study of constraint-based discovery algorithms for MLBNs given an oracle for testing conditional independence (CI) in the true, unknown graph. We show that if the oracle is given by the $\ast$-separation criterion in the true graph, then the PC algorithm remains consistent despite the presence of additional CI statements implied by $\ast$-separation. We also introduce a new causal discovery algorithm named "PCstar" which assumes faithfulness to $C^\ast$-separation and is able to orient additional edges which cannot be oriented with only d- or $\ast$-separation.
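To make the oracle-based setting concrete, here is a minimal sketch of the skeleton phase of the classical PC algorithm, driven by a conditional-independence oracle. It is not the PCstar algorithm and does not implement $\ast$- or $C^\ast$-separation; the function name `pc_skeleton`, the oracle signature `ci_oracle(x, y, S)`, and the hard-coded toy oracle for the chain A -> B -> C are assumptions made for illustration only.

```python
from itertools import combinations


def pc_skeleton(nodes, ci_oracle):
    """Skeleton phase of the PC algorithm with a CI oracle (illustrative sketch).

    ci_oracle(x, y, S) is assumed to return True when x and y are
    conditionally independent given the set S in the true graph.
    """
    # Start from the complete undirected graph.
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    size = 0
    # Grow the conditioning-set size until no node has enough neighbours left.
    while any(len(adj[x]) - 1 >= size for x in nodes):
        for x in nodes:
            for y in sorted(adj[x]):  # sorted() copies, so removal below is safe
                candidates = sorted(adj[x] - {y})
                for S in combinations(candidates, size):
                    if ci_oracle(x, y, frozenset(S)):
                        # Remove the edge and remember the separating set.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = frozenset(S)
                        break
        size += 1
    return adj, sepset


def chain_oracle(x, y, S):
    """Toy oracle (hypothetical): the only CI statement is A independent of C given B."""
    return {x, y} == {"A", "C"} and "B" in S


adj, sepset = pc_skeleton(["A", "B", "C"], chain_oracle)
```

In the setting of the abstract, the toy oracle would be replaced by one that answers queries according to the separation criterion under study; the edge-removal logic itself would stay the same.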
Order Optimal Regret Bounds for Sharpe Ratio Optimization in the Bandit Setting
Shah, Mohammad Taha, Khurshid, Sabrina, Ghatak, Gourab
In this paper, we investigate the problem of sequential decision-making for Sharpe ratio (SR) maximization in a stochastic bandit setting. We focus on the Thompson Sampling (TS) algorithm, a Bayesian approach celebrated for its empirical performance and exploration efficiency, under the assumption of Gaussian rewards with unknown parameters. Unlike conventional bandit objectives that maximize cumulative reward, Sharpe ratio optimization introduces an inherent tradeoff between achieving high returns and controlling risk, demanding careful exploration of both mean and variance. Our theoretical contributions include a novel regret decomposition specifically designed for the Sharpe ratio, highlighting the role of information acquisition about the reward distribution in driving learning efficiency. We then establish an upper bound on the regret of the proposed algorithm \texttt{SRTS}, derive a matching lower bound, and thereby show order-optimality. Our results show that Thompson Sampling achieves logarithmic regret over time, with distribution-dependent factors capturing the difficulty of distinguishing arms based on risk-adjusted performance. Empirical simulations show that our algorithm significantly outperforms existing algorithms.
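The idea of Thompson Sampling over both mean and variance can be sketched as follows. This is not the authors' \texttt{SRTS} algorithm: it is a generic illustration that assumes a conjugate Normal-Gamma prior per arm and defines the Sharpe ratio as mean divided by standard deviation (risk-free rate of zero). The class `GaussianArmPosterior`, the function `run_sharpe_ts`, and all prior hyperparameters are hypothetical choices.

```python
import numpy as np


class GaussianArmPosterior:
    """Normal-Gamma posterior over (mean, precision) of one Gaussian arm."""

    def __init__(self, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
        # Prior hyperparameters (assumed values, not taken from the paper).
        self.mu0, self.kappa0, self.alpha0, self.beta0 = mu0, kappa0, alpha0, beta0
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        # Welford update of the running mean and sum of squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def sample(self, rng):
        # Standard conjugate Normal-Gamma posterior parameters.
        kappa_n = self.kappa0 + self.n
        mu_n = (self.kappa0 * self.mu0 + self.n * self.mean) / kappa_n
        alpha_n = self.alpha0 + self.n / 2.0
        beta_n = (self.beta0 + 0.5 * self.m2
                  + self.kappa0 * self.n * (self.mean - self.mu0) ** 2 / (2.0 * kappa_n))
        tau = rng.gamma(shape=alpha_n, scale=1.0 / beta_n)  # sampled precision
        mu = rng.normal(mu_n, 1.0 / np.sqrt(kappa_n * tau))  # sampled mean
        return mu, 1.0 / tau  # (mean, variance)


def run_sharpe_ts(true_means, true_stds, horizon, seed=0):
    """Pull the arm whose sampled Sharpe ratio (mean / std) is largest."""
    rng = np.random.default_rng(seed)
    arms = [GaussianArmPosterior() for _ in true_means]
    pulls = np.zeros(len(true_means), dtype=int)
    for _ in range(horizon):
        sampled_sr = []
        for arm in arms:
            mu, var = arm.sample(rng)
            sampled_sr.append(mu / np.sqrt(var))
        k = int(np.argmax(sampled_sr))
        reward = rng.normal(true_means[k], true_stds[k])
        arms[k].update(reward)
        pulls[k] += 1
    return pulls


# Arm 1 has the highest true Sharpe ratio (1.0 / 0.5 = 2.0).
pulls = run_sharpe_ts([1.0, 1.0, 0.2], [2.0, 0.5, 0.5], horizon=2000)
```

Sampling the variance alongside the mean is what lets the exploration account for risk: an arm with a high mean but a poorly estimated variance can still produce a wide range of sampled Sharpe ratios.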
Uncertainty Tube Visualization of Particle Trajectories
Li, Jixian, Ouermi, Timbwaoga Aime Judicael, Han, Mengjiao, Johnson, Chris R.
This figure compares (a) a spaghetti plot of ensemble members, (b) a circular tube, and (c) our uncertainty tube for visualizing model uncertainty. Previous methods face challenges such as visual clutter (a) or the assumption of symmetric uncertainty (a, b), but our uncertainty tube (c), constructed using superellipses, provides a more accurate visualization of asymmetric uncertainty. Its superelliptical shape distinctly improves the visualization of the uncertainty orientation and its evolution along trajectories, as highlighted in the boxes. The visualization is further enhanced with a color palette that uses gray for low uncertainty, blue for large asymmetric uncertainty, and yellow for large symmetric uncertainty.
Predicting particle trajectories with neural networks (NNs) has substantially enhanced many scientific and engineering domains. However, effectively quantifying and visualizing the inherent uncertainty in predictions remains challenging. Without an understanding of the uncertainty, the reliability of NN models in applications where trustworthiness is paramount is significantly compromised. This paper introduces the uncertainty tube, a novel, computationally efficient visualization method designed to represent this uncertainty in NN-derived particle paths. By integrating well-established uncertainty quantification techniques, such as Deep Ensembles, Monte Carlo Dropout (MC Dropout), and Stochastic Weight Averaging-Gaussian (SWAG), we demonstrate the practical utility of the uncertainty tube, showcasing its application on both synthetic and simulation datasets.
Understanding and analyzing flow field data is fundamental for numerous scientific and engineering disciplines, including fluid dynamics, atmospheric science, and material processing. Traditional computational fluid dynamics (CFD) simulations are often computationally intensive, a limitation that has led researchers to explore more efficient paradigms.
This exploration has given rise to neural networks (NNs) as a transformative tool in this domain, driven by their capacity to overcome these computational bottlenecks. Notably, recent work, such as Han et al. [26, 27], leverages NNs to learn Lagrangian-based flow maps, enabling efficient and robust particle tracing in time-varying fields. These data-driven models demonstrate remarkable accuracy and speed, making them increasingly indispensable for accelerating discovery and design cycles in fluid dynamics. Despite these advancements, a significant challenge remains in providing a comprehensive understanding of the confidence associated with NN predictions in flow fields.
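The superelliptical cross-section mentioned in the figure description can be illustrated with the standard superellipse, $|x/a|^n + |y/b|^n = 1$. The sketch below only samples such a curve; how the uncertainty tube derives the semi-axes, exponent, and orientation from ensemble predictions is not stated here, so the function `superellipse` and its parameters `a`, `b`, `n`, and `angle` are hypothetical.

```python
import numpy as np


def superellipse(a, b, n, num_points=64, angle=0.0):
    """Sample points on the superellipse |x/a|^n + |y/b|^n = 1, rotated by `angle`.

    a, b  : semi-axes (assumed here to encode uncertainty magnitude per direction)
    n     : exponent; n = 2 gives an ordinary ellipse, larger n a boxier shape
    angle : rotation in radians (assumed here to encode uncertainty orientation)
    """
    t = np.linspace(0.0, 2.0 * np.pi, num_points, endpoint=False)
    c, s = np.cos(t), np.sin(t)
    # Standard parametric form of the superellipse.
    x = a * np.sign(c) * np.abs(c) ** (2.0 / n)
    y = b * np.sign(s) * np.abs(s) ** (2.0 / n)
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle), np.cos(angle)]])
    return np.stack([x, y], axis=1) @ rot.T


# Unrotated cross-section with unequal semi-axes (asymmetric spread).
points = superellipse(a=2.0, b=0.5, n=4.0)
```

Sweeping such cross-sections along a predicted trajectory, with parameters varying from step to step, would give a tube-like surface; a circular tube corresponds to the special case `a == b` and `n == 2`.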
Deep Graph Neural Point Process For Learning Temporal Interactive Networks
Chen, Su, Qi, Xiaohua, Lin, Xixun, Shang, Yanmin, Xu, Xiaolin, Li, Yangxi
Learning temporal interaction networks (TINs) has previously been regarded as a coarse-grained multi-sequence prediction problem that ignores the influence of network topology. This paper addresses this limitation by proposing a Deep Graph Neural Point Process (DGNPP) model for TINs. DGNPP consists of two key modules: the Node Aggregation Layer and the Self Attentive Layer. The Node Aggregation Layer captures topological structures to generate static representations for users and items, while the Self Attentive Layer dynamically updates embeddings over time. By incorporating both dynamic and static embeddings into the event intensity function and optimizing the model via maximum likelihood estimation, DGNPP predicts events and their occurrence times effectively. Experimental evaluations on three public datasets demonstrate that DGNPP achieves superior performance in event prediction and time prediction tasks with high efficiency, significantly outperforming baseline models and effectively mitigating the limitations of prior approaches.
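For reference, maximum likelihood estimation for a temporal point process typically maximizes the standard log-likelihood below. This is the generic form, not necessarily the exact objective used by DGNPP; the notation (events $(u_i, v_i, t_i)$ for user $u_i$, item $v_i$, and time $t_i$ on an observation window $[0, T]$, and a pairwise intensity $\lambda_{u,v}$) is assumed for illustration.

```latex
\log L(\theta)
  = \sum_{i=1}^{N} \log \lambda_{u_i, v_i}(t_i; \theta)
  \;-\; \sum_{(u, v)} \int_{0}^{T} \lambda_{u, v}(t; \theta)\, \mathrm{d}t
```

The first term rewards high intensity at the observed event times, and the second penalizes intensity mass placed where no event occurred.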