tsfm
Less is More: Unlocking Specialization of Time Series Foundation Models via Structured Pruning
Scaling laws motivate the development of Time Series Foundation Models (TSFMs) that pre-train vast parameters and achieve remarkable zero-shot forecasting performance. Surprisingly, even after fine-tuning, TSFMs cannot consistently outperform smaller, specialized models trained on full-shot downstream data. A key question is how to realize effective adaptation of TSFMs for a target forecasting task. Through empirical studies on various TSFMs, we find that the pre-trained models often exhibit inherent sparsity and redundancy in computation, suggesting that TSFMs have learned to activate task-relevant network substructures to accommodate diverse forecasting tasks. To preserve this valuable prior knowledge, we propose a structured pruning method to regularize the subsequent fine-tuning process by focusing it on a more relevant and compact parameter space. Extensive experiments on seven TSFMs and six benchmarks demonstrate that fine-tuning a smaller, pruned TSFM significantly improves forecasting performance compared to fine-tuning original models. This "prune-then-finetune" paradigm often enables TSFMs to achieve state-of-the-art performance and surpass strong specialized baselines.
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. However, an important yet underexplored challenge is how to effectively finetune TSFMs on specific downstream tasks. While naive finetuning can yield performance gains, we argue that it falls short of fully leveraging TSFMs' capabilities, often resulting in overfitting and suboptimal performance. Given the diverse temporal patterns across sampling scales and the inherent multi-scale forecasting capabilities of TSFMs, we adopt a causal perspective to analyze the finetuning process, through which we highlight the critical importance of explicitly modeling multiple scales and reveal the shortcomings of naive approaches. Focusing on encoder-based TSFMs, we propose Multiscale finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the finetuning process. Experimental results on three different backbones (Moirai, Moment and Units) demonstrate that TSFMs finetuned with MSFT not only outperform naive and typical parameter-efficient finetuning methods but also surpass state-of-the-art deep learning methods. Code is available at https://github.com/zqiao11/MSFT.
Introducing ARFBench: A time series question-answering benchmark based on real incidents
More than a trillion dollars are lost every year due to system failures. To resolve them, engineers must troubleshoot outages quickly. An important task in incident response involves analyzing observability metrics, or time series data that snapshot the health of software systems. For example, an engineer for a service may use Datadog to answer questions like "When did latency start increasing?" and "What metrics outside of latency are also behaving abnormally?" to localize the root cause of the anomalous behavior. These time series question-answering (TSQA) tasks are essential for engineers, and present challenging and necessary tasks for SRE models and agents to perform.
Universal Redundancies in Time Series Foundation Models
Bao, Anthony, Vattikuti, Venkata Hasith, Lai, Jeffrey, Gilpin, William
Time Series Foundation Models (TSFMs) leverage extensive pretraining to accurately predict unseen time series during inference, without the need for task-specific fine-tuning. Through large-scale evaluations on standard benchmarks, we find that leading transformer-based TSFMs exhibit redundant components in their intermediate layers. We introduce a set of tools for mechanistic interpretability of TSFMs, including ablations of specific components and direct logit attribution on the residual stream. Our findings are consistent across several leading TSFMs with diverse architectures, and across a diverse set of real-world and synthetic time-series datasets. We discover that all models in our study are robust to ablations of entire layers. Furthermore, we develop a theoretical framework framing transformers as kernel regressors, motivating a purely intrinsic strategy for ablating heads based on the stable rank of the per-head projection matrices. Using this approach, we uncover the specific heads responsible for degenerate phenomena widely observed in TSFMs, such as parroting of motifs from the context and seasonality bias. Our study sheds light on the universal properties of this emerging class of architectures for continuous-time sequence modeling.
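As an illustration of the quantity named in the abstract above, the stable rank of a matrix W is the ratio of its squared Frobenius norm to its squared spectral norm. The following is a minimal sketch in plain Python, not the authors' implementation: the function names, the list-of-rows matrix format, and the use of power iteration are assumptions made for this example (in practice an SVD routine would normally supply the spectral norm). The abstract does not state whether low or high stable rank marks a head for ablation, so the sketch only computes scores and sorts them.

```python
import math
import random


def stable_rank(matrix, iterations=500, seed=0):
    """Approximate ||W||_F^2 / ||W||_2^2 for a matrix given as a list of rows."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    frobenius_sq = sum(x * x for row in matrix for x in row)
    if frobenius_sq == 0.0:
        return 0.0  # convention chosen here for the all-zero matrix

    # Power iteration on W^T W; a random start avoids a fixed blind spot.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    spectral_sq = 0.0
    for _ in range(iterations):
        # u = W v
        u = [sum(w * x for w, x in zip(row, v)) for row in matrix]
        # t = W^T u
        t = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in t))
        if norm == 0.0:
            break
        spectral_sq = norm  # for unit v, ||W^T W v|| approaches the top eigenvalue
        v = [x / norm for x in t]

    if spectral_sq == 0.0:
        return float("inf")  # estimate failed; treated as "not low rank" here
    return frobenius_sq / spectral_sq


def rank_heads_by_stable_rank(head_matrices):
    """Return head indices sorted from lowest to highest stable rank."""
    scores = [stable_rank(m) for m in head_matrices]
    return sorted(range(len(scores)), key=scores.__getitem__)
```

For a 3x3 identity matrix the stable rank is 3, and for any rank-one matrix it is 1, so a rank-one head sorts ahead of an identity-like head in this ordering.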
Time Series Foundation Models for Process Model Forecasting
Yu, Yongbo, Peeperkorn, Jari, De Smedt, Johannes, De Weerdt, Jochen
Process Model Forecasting (PMF) aims to predict how the control-flow structure of a process evolves over time by modeling the temporal dynamics of directly-follows (DF) relations, complementing predictive process monitoring that focuses on single-case prefixes. Prior benchmarks show that machine learning and deep learning models provide only modest gains over statistical baselines, mainly due to the sparsity and heterogeneity of the DF time series. We investigate Time Series Foundation Models (TSFMs), large pre-trained models for generic time series, as an alternative for PMF. Using DF time series derived from real-life event logs, we compare zero-shot use of TSFMs, without additional training, with fine-tuned variants adapted on PMF-specific data. TSFMs generally achieve lower forecasting errors (MAE and RMSE) than traditional and specialized models trained from scratch on the same logs, indicating effective transfer of temporal structure from non-process domains. While fine-tuning can further improve accuracy, the gains are often small and may disappear on smaller or more complex datasets, so zero-shot use remains a strong default. Our study highlights the generalization capability and data efficiency of TSFMs for process-related time series and, to the best of our knowledge, provides the first systematic evaluation of temporal foundation models for PMF.
Are Time-Series Foundation Models Deployment-Ready? A Systematic Study of Adversarial Robustness Across Domains
Zhang, Jiawen, Zhang, Zhenwei, Zheng, Shun, Wen, Xumeng, Li, Jia, Bian, Jiang
Time-Series Foundation Models (TSFMs) are rapidly transitioning from research prototypes to core components of critical decision-making systems, driven by their impressive zero-shot forecasting capabilities. However, as their deployment surges, a critical blind spot remains: their fragility under adversarial attacks. This lack of scrutiny poses severe risks, particularly as TSFMs enter high-stakes environments vulnerable to manipulation. We present a systematic, diagnostic study arguing that for TSFMs, robustness is not merely a secondary metric but a prerequisite for trustworthy deployment comparable to accuracy. Our evaluation framework, explicitly tailored to the unique constraints of time series, incorporates normalized, sparsity-aware perturbation budgets and unified scale-invariant metrics across white-box and black-box settings. Across six representative TSFMs, we demonstrate that current architectures are alarmingly brittle: even small perturbations can reliably steer forecasts toward specific failure modes, such as trend flips and malicious drifts. We uncover TSFM-specific vulnerability patterns, including horizon-proximal brittleness, increased susceptibility with longer context windows, and weak cross-model transfer that points to model-specific failure modes rather than generic distortions. Finally, we show that simple adversarial fine-tuning offers a cost-effective path to substantial robustness gains, even with out-of-domain data. This work bridges the gap between TSFM capabilities and safety constraints, offering essential guidance for hardening the next generation of forecasting systems.
TS-HINT: Enhancing Semiconductor Time Series Regression Using Attention Hints From Large Language Model Reasoning
Rico, Jonathan Adam, Raghavan, Nagarajan, Jayavelu, Senthilnath
Existing data-driven methods rely on the extraction of static features from time series to approximate the material removal rate (MRR) of semiconductor manufacturing processes such as chemical mechanical polishing (CMP). However, this leads to a loss of temporal dynamics. Moreover, these methods require a large amount of data for effective training. In this paper, we propose TS-Hint, a Time Series Foundation Model (TSFM) framework integrated with chain-of-thought reasoning, which provides attention hints during training based on attention mechanism data and saliency data. Experimental results demonstrate that our model is effective in limited-data settings via few-shot learning and can learn directly from multivariate time series features.
Evaluating Spatio-Temporal Forecasting Trade-offs Between Graph Neural Networks and Foundation Models
Gupta, Ragini, Raina, Naman, Chen, Bo, Chen, Li, Danilov, Claudiu, Eckhardt, Josh, Bernard, Keyshla, Nahrstedt, Klara
Modern IoT deployments for environmental sensing produce high-volume spatiotemporal data to support downstream tasks such as forecasting, typically powered by machine learning models. While existing filtering and strategic deployment techniques optimize collected data volume at the edge, they overlook how variations in sampling frequencies and spatial coverage affect downstream model performance. In many forecasting models, incorporating data from additional sensors denoises predictions by providing broader spatial contexts. This interplay between sampling frequency, spatial coverage and different forecasting model architectures remains underexplored. This work presents a systematic study of forecasting models, namely classical models (VAR), neural networks (GRU, Transformer), spatio-temporal graph neural networks (STGNNs), and time series foundation models (TSFMs: Chronos, Moirai, TimesFM), under varying spatial sensor node density and sampling intervals using real-world temperature data in a wireless sensor network. Our results show that STGNNs are effective when sensor deployments are sparse and the sampling rate is moderate, leveraging spatial correlations via encoded graph structure to compensate for limited coverage. In contrast, TSFMs perform competitively at high frequencies but degrade when spatial coverage from neighboring sensors is reduced. Crucially, the multivariate TSFM Moirai outperforms all models by natively learning cross-sensor dependencies. These findings offer actionable insights for building efficient forecasting pipelines in spatio-temporal systems. All code for model configurations, training, dataset, and logs is open-sourced for reproducibility: https://github.com/UIUC-MONET-Projects/Benchmarking-Spatiotemporal-Forecast-Models
Uncovering Zero-Shot Generalization Gaps in Time-Series Foundation Models Using Real-World Videos
Li, Lujun, Sleem, Lama, Wang, Yiqun, Xu, Yangjie, Gentile, Niccolò, State, Radu
Recent research on time-series foundation models (TSFMs) has underscored the scarcity of real-world data, which existing datasets often supplement with synthetic sources whose generalizability, however, remains debated. In this work, we propose a novel benchmarking approach: we aim to build a curated dataset reflecting real-world physical temporal dynamics by extracting temporal signals from real-world videos using optical flow. To this end, we introduce REAL-V-TSFM, a novel dataset designed to capture rich and diverse time series derived from real-world videos. Experimental results on state-of-the-art TSFMs under zero-shot forecasting show that, despite strong performance on conventional benchmarks, these models exhibit performance degradation on the proposed dataset, suggesting limited generalizability to novel datasets. These findings underscore the need for novel approaches to acquiring time series data and highlight the lack of universality in recent TSFMs, while further validating the effectiveness of our video-based time series data extraction pipeline.
Cisco Time Series Model Technical Report
Gou, Liang, Khare, Archit, Pabolu, Praneet, Patel, Prachi, Ross, Joseph, Shen, Hercy, Song, Yuhan, Sun, Jingze, Curtis, Kristal, Dharnidharka, Vedant, Mathur, Abhinav, Yang, Hao
Modern LLMs are capable of learning complex statistical properties of language from a vast corpus of text. Rather than being trained to emulate a particular style or perform a particular task, they learn structure across diverse examples of token sequences, and the learned representations can be transferred to many downstream tasks and applications. The main idea of a time series foundation model (TSFM) is to apply the same playbook - including the transformer architecture that has revolutionized natural language processing - to sequences of numerical data, i.e., time series. Our present focus is to train a univariate TSFM capable of high-quality zero-shot forecasting, with emphasis on time series arising in certain business domains (initially, observability). Thus, having been exposed to patterns across many time series during training, given a segment of a new (unseen) time series, the TSFM is expected to predict its subsequent segment without any auxiliary parameter adjustment or fitting. Architectural differences among TSFMs can be found in their approaches to tokenization, transformer configuration, and prediction heads. PatchTST [Nie+23] introduces the idea of a time series patch as the analogue of a token, uses a linear transformation of a patch as a replacement for the token embedding, and finally applies a standard transformer encoder architecture. TimesFM [Das+24] uses a residual block to embed time series patches, enabling learning of more complex representations, and applies a decoder-only architecture. Chronos [Ans+24] tokenizes individual data points via scaling and then applies the (encoder-decoder) T5 architecture [Raf+20], notably formulating forecasting as a classification problem; subsequent versions (Chronos-Bolt, Chronos-2 [Ans+25]) utilize patching and "meta features" before applying transformer layers, and Chronos-2 uses a T5 encoder.
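The patch-as-token idea described above for PatchTST can be sketched in a few lines: a series is cut into fixed-length windows, and each window is mapped to an embedding by a linear transformation. This is a minimal illustration in plain Python rather than code from any of the cited models; the function names, the patch length, the stride, and the toy weights are all assumptions chosen for the example.

```python
def make_patches(series, patch_len, stride):
    """Split a 1-D series into (possibly overlapping) patches of fixed length."""
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def embed_patch(patch, weights, bias):
    """Linear map of one patch to a d_model-sized vector.

    weights has shape (d_model, patch_len); bias has length d_model.
    """
    return [
        sum(w * x for w, x in zip(row, patch)) + b
        for row, b in zip(weights, bias)
    ]


# Hypothetical toy setup: 8 observations, patches of 4 with stride 2.
series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
patches = make_patches(series, patch_len=4, stride=2)

# Hypothetical 2-dimensional embedding: first value, and patch mean plus one.
weights = [[1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]
bias = [0.0, 1.0]
tokens = [embed_patch(p, weights, bias) for p in patches]
```

Here the series yields three patches, and each becomes one two-dimensional token that a transformer encoder could consume. In a trained model the weights would be learned, and TimesFM, as noted above, replaces the single linear map with a residual block.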