trajgpt
Building a Foundation Model for Trajectory from Scratch
Merten, Gaspard, Sakr, Mahmoud, Dejaegere, Gilles
Foundation models are transformative in artificial intelligence, but building them from scratch, especially for mobility trajectories, is not yet clear or documented. This tutorial bridges this gap by demonstrating the steps and code of a minimal implementation of a trajectory-focused foundation model starting from GPT-2. Through a concise, step-by-step, code-driven process, we demonstrate adapting GPT-2 for spatiotemporal data. We then review and compare representative trajectory foundation models, such as TrajFM and TrajGPT, highlighting their architectural innovations and differences. Additionally, we introduce complementary techniques from related domains, like TimesFM's patching approach. Targeted at researchers and practitioners, this tutorial aims to explain the concepts and terminology of foundation models, at the implementation level. We find it timely and indispensable to create this educational material in order to support the SIGSPATIAL community in building and evaluating mobility foundation models, enhancing both research clarity and peer-review effectiveness in mobility AI.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.16)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
- North America > United States > New York > New York County > New York City (0.05)
TrajGPT: Controlled Synthetic Trajectory Generation Using a Multitask Transformer-Based Spatiotemporal Model
Hsu, Shang-Ling, Tung, Emmanuel, Krumm, John, Shahabi, Cyrus, Shafique, Khurram
Human mobility modeling from GPS-trajectories and synthetic trajectory generation are crucial for various applications, such as urban planning, disaster management and epidemiology. Both of these tasks often require filling gaps in a partially specified sequence of visits - a new problem that we call "controlled" synthetic trajectory generation. Existing methods for next-location prediction or synthetic trajectory generation cannot solve this problem as they lack the mechanisms needed to constrain the generated sequences of visits. Moreover, existing approaches (1) frequently treat space and time as independent factors, an assumption that fails to hold true in real-world scenarios, and (2) suffer from challenges in accuracy of temporal prediction as they fail to deal with mixed distributions and the inter-relationships of different modes with latent variables (e.g., day-of-the-week). These limitations become even more pronounced when the task involves filling gaps within sequences instead of solely predicting the next visit. We introduce TrajGPT, a transformer-based, multi-task, joint spatiotemporal generative model to address these issues. Taking inspiration from large language models, TrajGPT poses the problem of controlled trajectory generation as that of text infilling in natural language. TrajGPT integrates the spatial and temporal models in a transformer architecture through a Bayesian probability model that ensures that the gaps in a visit sequence are filled in a spatiotemporally consistent manner. Our experiments on public and private datasets demonstrate that TrajGPT not only excels in controlled synthetic visit generation but also outperforms competing models in next-location prediction tasks - Relatively, TrajGPT achieves a 26-fold improvement in temporal accuracy while retaining more than 98% of spatial accuracy on average.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Georgia > Fulton County > Atlanta (0.05)
- North America > United States > Virginia > Loudoun County > Ashburn (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
TrajGPT: Irregular Time-Series Representation Learning for Health Trajectory Analysis
Song, Ziyang, Lu, Qingcheng, Zhu, He, Buckeridge, David, Li, Yue
In many domains, such as healthcare, time-series data is often irregularly sampled with varying intervals between observations. This poses challenges for classical time-series models that require equally spaced data. To address this, we propose a novel time-series Transformer called Trajectory Generative Pre-trained Transformer (TrajGPT). TrajGPT employs a novel Selective Recurrent Attention (SRA) mechanism, which utilizes a data-dependent decay to adaptively filter out irrelevant past information based on contexts. By interpreting TrajGPT as discretized ordinary differential equations (ODEs), it effectively captures the underlying continuous dynamics and enables time-specific inference for forecasting arbitrary target timesteps. Experimental results demonstrate that TrajGPT excels in trajectory forecasting, drug usage prediction, and phenotype classification without requiring task-specific fine-tuning. By evolving the learned continuous dynamics, TrajGPT can interpolate and extrapolate disease risk trajectories from partially-observed time series. The visualization of predicted health trajectories shows that TrajGPT forecasts unseen diseases based on the history of clinically relevant phenotypes (i.e., contexts). Time-series representation learning plays a crucial role in various domains, as it facilitates the extraction of generalizable temporal patterns from large-scale, unlabeled data, which can then be adapted for diverse tasks (Ma et al., 2023).
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.48)
- Health & Medicine > Health Care Technology > Medical Record (0.47)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)