PEMS: Pre-trained Epidemic Time-series Models
Kamarthi, Harshavardhan, Prakash, B. Aditya
Providing accurate and reliable predictions about the future of an epidemic is an important problem for enabling informed public health decisions. Recent works have shown that data-driven solutions, which use advances in deep learning to learn from past data of an epidemic, often outperform traditional mechanistic models. However, in many cases the past data is sparse and may not sufficiently capture the underlying dynamics. While a large amount of data from past epidemics exists, leveraging prior knowledge from time-series data of other diseases is a non-trivial challenge. Motivated by the success of pre-trained models in language and vision tasks, we tackle the problem of pre-training epidemic time-series models to learn from multiple datasets covering different diseases and epidemics. We address important challenges specific to pre-training on epidemic time series, such as dealing with heterogeneous dynamics and efficiently capturing useful patterns from multiple epidemic datasets, by carefully designing self-supervised learning (SSL) tasks that learn important priors about epidemic dynamics, which can then be leveraged for fine-tuning on multiple downstream tasks. The resulting PEMS outperforms previous state-of-the-art methods on various downstream time-series tasks across datasets with varying seasonal patterns, geographies, and mechanisms of contagion, including the novel COVID-19 pandemic unseen in the pre-training data, while being more efficient and using a smaller fraction of the datasets.

Predicting the trends of an ongoing epidemic is an important public health problem that influences real-time decision-making affecting millions of people. Forecasting time series of important epidemic indicators is a well-studied, challenging problem (Rodríguez et al., 2022b; Chakraborty et al., 2018). The availability of traditional as well as novel datasets, such as testing records and social media, that capture multiple facets of an epidemic, together with advances in machine learning and deep learning in particular, has enabled models that learn from these datasets and show promising results, often outperforming traditional mechanistic methods (Cramer et al., 2021; Reich et al., 2019). Many public health and research initiatives collect data from various diseases over many decades at various spatial granularities in different geographies.
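As a rough illustration of what self-supervised pre-training over heterogeneous epidemic series can look like, the sketch below builds masked-reconstruction examples from several incidence series. It is not the paper's code; the window size, masking ratio, and synthetic datasets are assumptions made purely for illustration.

```python
# A minimal, illustrative sketch (not the paper's code) of building masked
# self-supervised pre-training examples from heterogeneous epidemic time series.
# Window size, masking ratio, and the synthetic datasets below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def make_ssl_examples(series, window=32, mask_ratio=0.25):
    """Slice a 1-D incidence series into windows and mask random timesteps.

    Returns (inputs, targets, mask): a model would be trained to reconstruct
    the masked values, encouraging it to learn generic epidemic dynamics.
    """
    # Per-dataset normalization so diseases with different scales are comparable.
    series = (series - series.mean()) / (series.std() + 1e-8)
    windows = np.lib.stride_tricks.sliding_window_view(series, window)
    mask = rng.random(windows.shape) < mask_ratio
    inputs = np.where(mask, 0.0, windows)   # masked positions zeroed out
    return inputs, windows, mask

# Pre-training pools examples from several diseases/epidemics (synthetic here).
flu = rng.poisson(50, 300).astype(float)
dengue = rng.poisson(20, 300).astype(float)
pool = [make_ssl_examples(s) for s in (flu, dengue)]
```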
Large Pre-trained time series models for cross-domain Time series analysis tasks
Kamarthi, Harshavardhan, Prakash, B. Aditya
Large pre-trained models have driven significant advances in domains like language and vision, making model training for individual downstream tasks more efficient while providing superior performance. However, tackling time-series analysis tasks usually involves designing and training a separate model from scratch, leveraging training data and domain expertise specific to the task. We tackle a significant challenge for pre-training a general time-series model from multiple heterogeneous time-series datasets: providing semantically useful inputs to models for modeling time series with different dynamics from different domains. We observe that partitioning time series into segments as inputs to sequential models produces semantically better inputs, and propose a novel model, LPTM, that automatically identifies an optimal dataset-specific segmentation strategy using a self-supervised learning loss during pre-training. LPTM performs similarly to or better than domain-specific state-of-the-art models and is significantly more data- and compute-efficient, requiring up to 40% less data and 50% less training time to achieve state-of-the-art performance on a wide range of time-series analysis tasks from multiple disparate domains.

Time-series analysis tasks involve important, well-studied problems over time-series datasets, such as forecasting (Hyndman & Athanasopoulos, 2018) and classification (Chowdhury et al., 2022), with applications in wide-ranging domains such as retail, meteorology, economics, and health. Recent works (Chen et al., 2021; Wang et al., 2022; Zeng et al., 2023) have shown the efficacy of purely data-driven deep learning models in learning complex domain-specific properties of the time series over traditional statistical and mechanistic models across many domains. However, designing a model for a specific application or time-series analysis task is usually non-trivial.
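To make the segmentation idea concrete, here is a minimal, hypothetical sketch of choosing a dataset-specific segment length by the self-supervised loss it induces. LPTM learns its segmentation strategy end-to-end during pre-training; this stand-in merely scores a few fixed candidate lengths with a toy reconstruction proxy.

```python
# Illustrative sketch only: pick a dataset-specific segment length by the
# self-supervised loss it induces, loosely mirroring LPTM's idea of learning a
# segmentation strategy during pre-training. The scoring model is a deliberately
# simple stand-in (predict each segment's mean from the previous segment).
import numpy as np

def ssl_score(series, seg_len):
    """Lower is better: error of predicting each segment's mean from the last."""
    n = len(series) // seg_len
    segs = series[: n * seg_len].reshape(n, seg_len)
    means = segs.mean(axis=1)
    return float(np.mean((means[1:] - means[:-1]) ** 2))

def pick_segmentation(series, candidates=(4, 8, 16, 32)):
    return min(candidates, key=lambda L: ssl_score(series, L))

series = np.sin(np.linspace(0, 20, 512)) + 0.1 * np.random.default_rng(1).normal(size=512)
print(pick_segmentation(series))   # candidate segment length with the lowest proxy loss
```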
When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting
Kamarthi, Harshavardhan, Kong, Lingkai, Rodríguez, Alexander, Zhang, Chao, Prakash, B. Aditya
Probabilistic hierarchical time-series forecasting is an important variant of time-series forecasting, where the goal is to model and forecast multivariate time series that have underlying hierarchical relations. Most methods focus on point predictions and do not provide well-calibrated probabilistic forecast distributions. Recent state-of-the-art probabilistic forecasting methods also impose hierarchical relations on point predictions and samples of the distribution, which does not account for the coherency of forecast distributions. Previous works also silently assume that datasets are always consistent with the given hierarchical relations and do not adapt to real-world datasets that deviate from this assumption. We close both these gaps and propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy. PROFHiT uses a flexible probabilistic Bayesian approach and introduces a novel distributional coherency regularization that learns from hierarchical relations over the entire forecast distribution, enabling robust and calibrated forecasts that adapt to datasets of varying hierarchical consistency. Evaluating PROFHiT over a wide range of datasets, we observed 41-88% better accuracy and significantly better calibration. Because it models coherency over the full distribution, PROFHiT can robustly provide reliable forecasts even when up to 10% of the input time-series data is missing, whereas other methods' performance severely degrades by over 70%.
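A hedged sketch of what a distributional coherency penalty can look like for Gaussian forecasts is shown below: the parent's predicted distribution is compared, via KL divergence, with the distribution implied by summing its children. The exact regularizer in PROFHiT may differ; the numbers are illustrative.

```python
# Hedged sketch of a distributional coherency penalty in the spirit of PROFHiT:
# the parent's forecast distribution should match the distribution implied by
# summing its (assumed independent) children. The paper's exact form may differ.
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ) for scalar Gaussians."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def coherency_penalty(parent_mu, parent_var, child_mus, child_vars):
    # Sum of independent Gaussians: means and variances add.
    agg_mu, agg_var = np.sum(child_mus), np.sum(child_vars)
    return gaussian_kl(parent_mu, parent_var, agg_mu, agg_var)

# Example: parent forecast vs. two child forecasts (illustrative numbers).
print(coherency_penalty(10.0, 2.0, np.array([4.0, 5.5]), np.array([1.0, 0.8])))
```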
PINNsFormer: A Transformer-Based Framework For Physics-Informed Neural Networks
Zhao, Zhiyuan, Ding, Xueying, Prakash, B. Aditya
Physics-Informed Neural Networks (PINNs) have emerged as a promising deep learning framework for approximating numerical solutions to partial differential equations (PDEs). However, conventional PINNs, relying on multilayer perceptrons (MLPs), neglect the crucial temporal dependencies inherent in practical physics systems and thus fail to propagate the initial condition constraints globally and accurately capture the true solutions under various scenarios. In this paper, we introduce a novel Transformer-based framework, termed PINNsFormer, designed to address this limitation. PINNsFormer can accurately approximate PDE solutions by utilizing multi-head attention mechanisms to capture temporal dependencies. PINNsFormer transforms point-wise inputs into pseudo sequences and replaces the point-wise PINNs loss with a sequential loss. Additionally, it incorporates a novel activation function, Wavelet, which anticipates the Fourier decomposition through deep neural networks. Empirical results demonstrate that PINNsFormer achieves superior generalization ability and accuracy across various scenarios, including PINNs failure modes and high-dimensional PDEs. Moreover, PINNsFormer offers flexibility in integrating existing learning schemes for PINNs, further enhancing its performance.
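The sketch below illustrates, under stated assumptions, two of the ingredients described above: a Wavelet-style activation built from a learnable mixture of sine and cosine, and the expansion of a point-wise input into a short pseudo sequence. The initialization, sequence length k, and time step dt are assumptions, not values taken from the paper.

```python
# Minimal sketch (assumptions noted in comments) of a Wavelet-style activation
# and of turning a point-wise input into a pseudo sequence for attention.
import torch
import torch.nn as nn

class Wavelet(nn.Module):
    """Learnable mixture of sine and cosine; initialization is an assumption."""
    def __init__(self):
        super().__init__()
        self.w1 = nn.Parameter(torch.ones(1))
        self.w2 = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.w1 * torch.sin(x) + self.w2 * torch.cos(x)

def pseudo_sequence(x, t, k=5, dt=1e-2):
    """Expand a point (x, t) into k time-shifted copies (k and dt are hypothetical).

    x: (batch, d) spatial inputs; t: (batch, 1) times.
    Returns a (batch, k, d + 1) pseudo sequence for a Transformer encoder.
    """
    ts = t.unsqueeze(1) + dt * torch.arange(k, dtype=t.dtype).view(1, k, 1)  # (batch, k, 1)
    xs = x.unsqueeze(1).expand(-1, k, -1)                                    # (batch, k, d)
    return torch.cat([xs, ts], dim=-1)
```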
DF2: Distribution-Free Decision-Focused Learning
Kong, Lingkai, Mu, Wenhao, Cui, Jiaming, Zhuang, Yuchen, Prakash, B. Aditya, Dai, Bo, Zhang, Chao
Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems by customizing a predictive model to a downstream optimization task. However, existing end-to-end DFL methods are hindered by three significant bottlenecks: model mismatch error, sample average approximation error, and gradient approximation error. Model mismatch error stems from the misalignment between the model's parameterized predictive distribution and the true probability distribution. Sample average approximation error arises when using finite samples to approximate the expected optimization objective. Gradient approximation error occurs as DFL relies on the KKT condition for exact gradient computation, while most methods approximate the gradient for backpropagation in non-convex objectives. In this paper, we present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks. Rather than depending on a task-specific forecaster that requires precise model assumptions, our method directly learns the expected optimization function during training. To efficiently learn the function in a data-driven manner, we devise an attention-based model architecture inspired by the distribution-based parameterization of the expected objective. Our method is, to the best of our knowledge, the first to address all three bottlenecks within a single model. We evaluate DF2 on a synthetic problem, a wind power bidding problem, and a non-convex vaccine distribution problem, demonstrating the effectiveness of DF2.
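As a toy illustration of the distribution-free idea, the sketch below learns a surrogate for the expected task objective directly from features and decisions, and then optimizes the decision through the learned surrogate by gradient descent. The simple MLP stands in for DF2's attention-based architecture; all names and hyperparameters are hypothetical.

```python
# Hedged sketch of the distribution-free idea: instead of fitting a forecaster,
# learn a surrogate g(x, a) for the expected task cost directly from data, then
# optimize the decision a through the learned surrogate. The MLP is a simplified
# stand-in for the paper's attention-based architecture.
import torch
import torch.nn as nn

class Surrogate(nn.Module):
    def __init__(self, x_dim, a_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + a_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x, a):
        # Predicted expected cost of taking decision a under features x.
        return self.net(torch.cat([x, a], dim=-1)).squeeze(-1)

def decide(model, x, a_dim, steps=100, lr=0.1):
    """Gradient-based decision: minimize the learned expected cost over a."""
    a = torch.zeros(x.shape[0], a_dim, requires_grad=True)
    opt = torch.optim.Adam([a], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        model(x, a).sum().backward()
        opt.step()
    return a.detach()
```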
Autoregressive Diffusion Model for Graph Generation
Kong, Lingkai, Cui, Jiaming, Sun, Haotian, Zhuang, Yuchen, Prakash, B. Aditya, Zhang, Chao
Diffusion-based graph generative models have recently obtained promising results for graph generation. However, existing diffusion-based graph generative models are mostly one-shot generative models that apply Gaussian diffusion in the dequantized adjacency matrix space. Such a strategy can suffer from difficult model training, slow sampling speed, and the inability to incorporate constraints. We propose an autoregressive diffusion model for graph generation. Unlike existing methods, we define a node-absorbing diffusion process that operates directly in the discrete graph space. For forward diffusion, we design a diffusion ordering network, which learns a data-dependent node-absorbing ordering from the graph topology. For reverse generation, we design a denoising network that uses the reverse node ordering to efficiently reconstruct the graph, predicting one node at a time the type of the new node and its edges to previously denoised nodes. Based on the permutation invariance of graphs, we show that the two networks can be jointly trained by optimizing a simple lower bound of the data likelihood. Our experiments on six diverse generic graph datasets and two molecule datasets show that our model achieves better or comparable generation performance compared with previous state-of-the-art methods, while enjoying fast generation speed.
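A minimal sketch of the node-absorbing forward process on a toy graph is given below: nodes are absorbed one at a time, losing their incident edges, so that the reverse process can regenerate the graph node by node. The random ordering here is only a stand-in for the learned diffusion ordering network.

```python
# Illustrative sketch of a node-absorbing forward process on a small graph.
# In the paper the ordering comes from a learned diffusion ordering network;
# here a random permutation stands in for it.
import numpy as np

rng = np.random.default_rng(0)

def absorbing_trajectory(adj, order):
    """Yield the adjacency matrix after absorbing each successive node in `order`."""
    states = []
    current = adj.copy()
    for node in order:
        current = current.copy()
        current[node, :] = 0      # absorbed node loses all incident edges
        current[:, node] = 0
        states.append(current)
    return states

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 1],
                [1, 1, 0, 0],
                [0, 1, 0, 0]])
order = rng.permutation(adj.shape[0])   # stand-in for the learned ordering
trajectory = absorbing_trajectory(adj, order)
```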
Differentiable Agent-based Epidemiology
Chopra, Ayush, Rodríguez, Alexander, Subramanian, Jayakumar, Quera-Bofarull, Arnau, Krishnamurthy, Balaji, Prakash, B. Aditya, Raskar, Ramesh
Mechanistic simulators are an indispensable tool for epidemiology to explore the behavior of complex, dynamic infections under varying conditions and navigate uncertain environments. Agent-based models (ABMs) are an increasingly popular simulation paradigm that can represent the heterogeneity of contact interactions with granular detail and the agency of individual behavior. However, conventional ABM frameworks are not differentiable and present challenges in scalability, which makes it non-trivial to connect them to auxiliary data sources. In this paper, we introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation. GradABM can simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources. This provides an array of practical benefits for calibration, forecasting, and evaluating policy interventions. We demonstrate the efficacy of GradABM via extensive experiments with real COVID-19 and influenza datasets.
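The following highly simplified sketch shows why differentiability matters: an infection step written as a differentiable expected-value update, rather than a hard Bernoulli draw, lets epidemic parameters be calibrated by gradient descent. GradABM itself is far richer (contact networks, agent heterogeneity, neural calibration); the names and numbers here are illustrative only.

```python
# Hedged, highly simplified sketch of a differentiable ABM-style infection step.
# Infection is kept as a differentiable expected-value update instead of a hard
# Bernoulli draw, so the transmission rate beta can receive gradients.
import torch

def infection_step(susceptible, infectious, contact, beta):
    """susceptible/infectious: (n,) soft indicators; contact: (n, n); beta: tensor."""
    exposure = contact @ infectious                   # expected infectious contacts
    p_inf = 1.0 - torch.exp(-beta * exposure)         # per-agent infection probability
    new_infections = susceptible * p_inf              # stays differentiable
    return susceptible - new_infections, infectious + new_infections

n = 1000
beta = torch.tensor(0.05, requires_grad=True)         # learnable transmission rate
contact = (torch.rand(n, n) < 0.01).float()           # illustrative random contacts
S = torch.ones(n); I = torch.zeros(n); I[:5] = 1.0; S[:5] = 0.0
S, I = infection_step(S, I, contact, beta)
I.sum().backward()                                    # gradient w.r.t. beta exists
print(beta.grad)
```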
EINNs: Epidemiologically-informed Neural Networks
Rodríguez, Alexander, Cui, Jiaming, Ramakrishnan, Naren, Adhikari, Bijaya, Prakash, B. Aditya
We introduce EINNs, a framework crafted for epidemic forecasting that builds upon the theoretical grounds provided by mechanistic models as well as the data-driven expressibility afforded by AI models and their capability to ingest heterogeneous information. Although neural forecasting models have been successful on multiple tasks, producing predictions that are well-correlated with epidemic trends and accurate long-term predictions remain open challenges. Epidemiological ODE models contain mechanisms that can guide us in these two tasks; however, they have limited capability to ingest data sources and model composite signals. Thus, we propose to leverage work in physics-informed neural networks to learn latent epidemic dynamics and transfer relevant knowledge to another neural network that ingests multiple data sources and has a more appropriate inductive bias. In contrast with previous work, we do not assume observability of the complete dynamics and do not need to numerically solve the ODE equations during training. Our thorough experiments on all US states and HHS regions for COVID-19 and influenza forecasting showcase the clear benefits of our approach in both short-term and long-term forecasting as well as in learning the mechanistic dynamics over other non-trivial alternatives.
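To ground the physics-informed ingredient, here is a minimal sketch of an SIR residual loss in the PINN style, where a network maps time to compartment values and the ODE residuals are penalized via automatic differentiation. EINNs' actual formulation (latent dynamics, knowledge transfer to a second data-driven network, no numerical ODE solving) is more involved; the beta and gamma values below are hypothetical.

```python
# Minimal sketch of a physics-informed SIR residual loss, in the spirit of using
# mechanistic ODEs to regularize a neural model. beta and gamma are hypothetical.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 3), nn.Softplus())

def sir_residual(t, beta=0.3, gamma=0.1):
    """Penalize deviations of the network's (S, I, R)(t) from SIR dynamics."""
    t = t.requires_grad_(True)
    S, I, R = net(t).unbind(dim=-1)
    dS, = torch.autograd.grad(S.sum(), t, create_graph=True)
    dI, = torch.autograd.grad(I.sum(), t, create_graph=True)
    dR, = torch.autograd.grad(R.sum(), t, create_graph=True)
    rS = dS.squeeze(-1) + beta * S * I            # dS/dt = -beta * S * I
    rI = dI.squeeze(-1) - beta * S * I + gamma * I  # dI/dt =  beta * S * I - gamma * I
    rR = dR.squeeze(-1) - gamma * I               # dR/dt =  gamma * I
    return (rS ** 2 + rI ** 2 + rR ** 2).mean()

loss = sir_residual(torch.linspace(0, 1, 64).unsqueeze(-1))
loss.backward()
```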
End-to-End Stochastic Optimization with Energy-Based Model
Kong, Lingkai, Cui, Jiaming, Zhuang, Yuchen, Feng, Rui, Prakash, B. Aditya, Zhang, Chao
Decision-focused learning (DFL) was recently proposed for stochastic optimization problems that involve unknown parameters. By integrating predictive modeling with an implicitly differentiable optimization layer, DFL has shown superior performance to the standard two-stage predict-then-optimize pipeline. However, most existing DFL methods are only applicable to convex problems or a subset of nonconvex problems that can be easily relaxed to convex ones. Further, they can be inefficient in training due to the requirement of solving and differentiating through the optimization problem in every training iteration. We propose SO-EBM, a general and efficient DFL method for stochastic optimization using energy-based models. Instead of relying on KKT conditions to induce an implicit optimization layer, SO-EBM explicitly parameterizes the original optimization problem using a differentiable optimization layer based on energy functions. To better approximate the optimization landscape, we propose a coupled training objective that uses a maximum likelihood loss to capture the optimum location and a distribution-based regularizer to capture the overall energy landscape. Finally, we propose an efficient training procedure for SO-EBM with a self-normalized importance sampler based on a Gaussian mixture proposal. We evaluate SO-EBM in three applications: power scheduling, COVID-19 resource allocation, and a non-convex adversarial security game, demonstrating the effectiveness and efficiency of SO-EBM.
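A hedged sketch of the sampling step is shown below: decisions drawn from a Gaussian-mixture proposal are reweighted by exp(-energy) in a self-normalized fashion to estimate expectations under the energy-based decision distribution. The toy quadratic energy and the proposal hyperparameters are illustrative, not SO-EBM's exact choices, and the proposal density term is omitted for brevity.

```python
# Hedged sketch of self-normalized importance sampling with a Gaussian-mixture
# proposal, used to estimate expectations under p(a | x) ∝ exp(-E(a; x)).
# The quadratic energy and proposal settings are toy stand-ins.
import torch

def energy(a, demand):
    """Toy task cost: quadratic mismatch between decision a and realized demand."""
    return ((a - demand) ** 2).sum(-1)

def snis_expectation(demand, center, n_samples=256, scales=(0.1, 0.5, 1.0)):
    # Gaussian-mixture proposal: equal-weight components with different scales.
    comp = torch.randint(len(scales), (n_samples,))
    std = torch.tensor(scales)[comp].unsqueeze(-1)
    samples = center + std * torch.randn(n_samples, center.shape[-1])
    log_w = -energy(samples, demand)       # unnormalized log-weights (proposal term
                                           # omitted for brevity in this toy sketch)
    w = torch.softmax(log_w, dim=0)        # self-normalized weights
    return (w.unsqueeze(-1) * samples).sum(0)   # estimate of E[a | x]

print(snis_expectation(torch.tensor([1.0, 2.0]), torch.zeros(2)))
```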
CAMul: Calibrated and Accurate Multi-view Time-Series Forecasting
Kamarthi, Harshavardhan, Kong, Lingkai, Rodríguez, Alexander, Zhang, Chao, Prakash, B. Aditya
Probabilistic time-series forecasting enables reliable decision making across many domains. Most forecasting problems have diverse sources of data containing multiple modalities and structures. Leveraging information as well as uncertainty from these data sources for well-calibrated and accurate forecasts is an important and challenging problem. Most previous work on multi-modal learning and forecasting simply aggregates intermediate representations from each data view by summation or concatenation and does not explicitly model uncertainty for each data view. We propose CAMul, a general probabilistic multi-view forecasting framework that can learn representations and uncertainty from diverse data sources. It integrates the knowledge and uncertainty from each data view in a dynamic, context-specific manner, assigning more importance to useful views to model a well-calibrated forecast distribution. We apply CAMul to multiple domains with varied sources and modalities and show that CAMul outperforms other state-of-the-art probabilistic forecasting models by over 25% in accuracy and calibration.
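The sketch below conveys the core fusion idea under simplifying assumptions: each data view contributes a distributional encoding (a diagonal Gaussian here), and context-dependent weights decide how much each view influences the fused forecast distribution. The scoring networks and mixture-style fusion rule are stand-ins, not CAMul's exact design.

```python
# Hedged sketch of context-dependent fusion of per-view distributional encodings.
# The per-view linear scorers and the mixture-style fusion rule are simplified
# stand-ins for CAMul's actual architecture.
import torch
import torch.nn.functional as F

def fuse_views(view_mus, view_vars, context, score_nets):
    """view_mus/view_vars: lists of (d,) tensors; context: (c,) tensor."""
    scores = torch.stack([net(context) for net in score_nets]).squeeze(-1)  # (V,)
    w = F.softmax(scores, dim=0)                                            # view weights
    mu = sum(wi * m for wi, m in zip(w, view_mus))
    # Mixture-style variance: weighted second moments minus squared mean.
    second = sum(wi * (v + m ** 2) for wi, m, v in zip(w, view_mus, view_vars))
    return mu, second - mu ** 2

d, c, V = 8, 4, 3
score_nets = [torch.nn.Linear(c, 1) for _ in range(V)]   # one scorer per data view
mus = [torch.randn(d) for _ in range(V)]
vars_ = [torch.rand(d) + 0.1 for _ in range(V)]
print(fuse_views(mus, vars_, torch.randn(c), score_nets))
```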