variate
- North America > United States (0.28)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Health & Medicine (1.00)
- Government (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- Information Technology > Data Science > Data Mining (0.70)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Singapore (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
- Information Technology > Data Science > Data Mining > Big Data (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Europe > France (0.04)
- Europe > Belgium (0.04)
- North America > United States > California > Orange County > Irvine (0.04)
193002e668758ea9762904da1a22337c-Supplemental.pdf
Thefirsttwocolumns showresults for two different step-sizes, and the third one using the best step-size chosen retrospectively. The plots show the final ELBO achieved after trainingfor40000stepsvs. stepsizeused. Figure 11: VI using a diagonal Gaussian, with the best step-size chosen retrospectively. Bayesian logistic regression: We use a subset of700 rows of thea1a dataset. In this case the posterior p(z|x) has dimensionality d = 120.
Revitalizing Canonical Pre-Alignment for Irregular Multivariate Time Series Forecasting
Zhou, Ziyu, Huang, Yiming, Wang, Yanyun, Wu, Yuankai, Kwok, James, Liang, Yuxuan
Irregular multivariate time series (IMTS), characterized by uneven sampling and inter-variate asynchrony, fuel many forecasting applications yet remain challenging to model efficiently. Canonical Pre-Alignment (CPA) has been widely adopted in IMTS modeling by padding zeros at every global timestamp, thereby alleviating inter-variate asynchrony and unifying the series length, but its dense zero-padding inflates the pre-aligned series length, especially when numerous variates are present, causing prohibitive compute overhead. Recent graph-based models with patching strategies sidestep CPA, but their local message passing struggles to capture global inter-variate correlations. Therefore, we posit that CPA should be retained, with the pre-aligned series properly handled by the model, enabling it to outperform state-of-the-art graph-based baselines that sidestep CPA. Technically, we propose KAFNet, a compact architecture grounded in CPA for IMTS forecasting that couples (1) Pre-Convolution module for sequence smoothing and sparsity mitigation, (2) Temporal Kernel Aggregation module for learnable compression and modeling of intra-series irregularity, and (3) Frequency Linear Attention blocks for the low-cost inter-series correlations modeling in the frequency domain. Experiments on multiple IMTS datasets show that KAFNet achieves state-of-the-art forecasting performance, with a 7.2$\times$ parameter reduction and a 8.4$\times$ training-inference acceleration.
- North America > United States (0.14)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
This Time is Different: An Observability Perspective on Time Series Foundation Models
Cohen, Ben, Khwaja, Emaad, Doubli, Youssef, Lemaachi, Salahidine, Lettieri, Chris, Masson, Charles, Miccinilli, Hugo, Ramé, Elise, Ren, Qiqi, Rostamizadeh, Afshin, Terrail, Jean Ogier du, Toon, Anna-Monica, Wang, Kan, Xie, Stephan, Xu, Zongzhe, Zhukova, Viktoriya, Asker, David, Talwalkar, Ameet, Abou-Amal, Othmane
We introduce Toto, a time series forecasting foundation model with 151 million parameters. Toto uses a modern decoder-only architecture coupled with architectural innovations designed to account for specific challenges found in multivariate observability time series data. Toto's pre-training corpus is a mixture of observability data, open datasets, and synthetic data, and is 4-10$\times$ larger than those of leading time series foundation models. Additionally, we introduce BOOM, a large-scale benchmark consisting of 350 million observations across 2,807 real-world time series. For both Toto and BOOM, we source observability data exclusively from Datadog's own telemetry and internal observability metrics. Extensive evaluations demonstrate that Toto achieves state-of-the-art performance on both BOOM and on established general purpose time series forecasting benchmarks. Toto's model weights, inference code, and evaluation scripts, as well as BOOM's data and evaluation code, are all available as open source under the Apache 2.0 License available at https://huggingface.co/Datadog/Toto-Open-Base-1.0 and https://github.com/DataDog/toto.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (3 more...)
Hydra: Dual Exponentiated Memory for Multivariate Time Series Analysis
Meskin, Asal, Mirrokni, Alireza, Najar, Ali, Behrouz, Ali
In recent years, effectively modeling multivariate time series has gained significant popularity, mainly due to its wide range of applications, ranging from healthcare to financial markets and energy management. Transformers, MLPs, and linear models as the de facto backbones of modern time series models have shown promising results in single-variant and/or short-term forecasting. These models, however: (1) are permutation equivariant and so lack temporal inductive bias, being less expressive to capture the temporal dynamics; (2) are naturally designed for univariate setup, missing the inter-dependencies of temporal and variate dimensions; and/or (3) are inefficient for Long-term time series modeling. To overcome training and inference efficiency as well as the lack of temporal inductive bias, recently, linear Recurrent Neural Networks (RNNs) have gained attention as an alternative to Transformer-based models. These models, however, are inherently limited to a single sequence, missing inter-variate dependencies, and can propagate errors due to their additive nature. In this paper, we present Hydra, a by-design two-headed meta in-context memory module that learns how to memorize patterns at test time by prioritizing time series patterns that are more informative about the data. Hydra uses a 2-dimensional recurrence across both time and variate at each step, which is more powerful than mixing methods. Although the 2-dimensional nature of the model makes its training recurrent and non-parallelizable, we present a new 2D-chunk-wise training algorithm that approximates the actual recurrence with $\times 10$ efficiency improvement, while maintaining the effectiveness. Our experimental results on a diverse set of tasks and datasets, including time series forecasting, classification, and anomaly detection show the superior performance of Hydra compared to state-of-the-art baselines.
- North America > United States (0.14)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Banking & Finance (0.54)
- Health & Medicine (0.48)
- North America > United States (0.28)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Health & Medicine (1.00)
- Government (0.67)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.93)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- North America > Canada (0.04)
TelecomTS: A Multi-Modal Observability Dataset for Time Series and Language Analysis
Feng, Austin, Varvarigos, Andreas, Panitsas, Ioannis, Fernandez, Daniela, Wei, Jinbiao, Guo, Yuwei, Chen, Jialin, Maatouk, Ali, Tassiulas, Leandros, Ying, Rex
Modern enterprises generate vast streams of time series metrics when monitoring complex systems, known as observability data. Unlike conventional time series from domains such as weather, observability data are zero-inflated, highly stochastic, and exhibit minimal temporal structure. Despite their importance, observability datasets are underrepresented in public benchmarks due to proprietary restrictions. Existing datasets are often anonymized and normalized, removing scale information and limiting their use for tasks beyond forecasting, such as anomaly detection, root-cause analysis, and multi-modal reasoning. To address this gap, we introduce TelecomTS, a large-scale observability dataset derived from a 5G telecommunications network. TelecomTS features heterogeneous, de-anonymized covariates with explicit scale information and supports a suite of downstream tasks, including anomaly detection, root-cause analysis, and a question-answering benchmark requiring multi-modal reasoning. Benchmarking state-of-the-art time series, language, and reasoning models reveals that existing approaches struggle with the abrupt, noisy, and high-variance dynamics of observability data. Our experiments also underscore the importance of preserving covariates' absolute scale, emphasizing the need for foundation time series models that natively leverage scale information for practical observability applications.
- North America > United States (0.14)
- Asia > Singapore (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Telecommunications > Networks (1.00)
- Information Technology > Networks (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.67)