tsmixer
TTF: A Trapezoidal Temporal Fusion Framework for LTV Forecasting in Douyin
Wan, Yibing, Guan, Zhengxiong, Zhang, Chaoli, Li, Xiaoyang, Xu, Lai, Jia, Beibei, Zheng, Zhenzhe, Wu, Fan
In the user growth scenario, Internet companies invest heavily in paid acquisition channels to acquire new users. But sustainable growth depends on acquired users' generating lifetime value (LTV) exceeding customer acquisition cost (CAC). In order to maximize LTV/CAC ratio, it is crucial to predict channel-level LTV in an early stage for further optimization of budget allocation. The LTV forecasting problem is significantly different from traditional time series forecasting problems, and there are three main challenges. Firstly, it is an unaligned multi-time series forecasting problem that each channel has a number of LTV series of different activation dates. Secondly, to predict in the early stage, it faces the imbalanced short-input long-output (SILO) challenge. Moreover, compared with the commonly used time series datasets, the real LTV series are volatile and non-stationary, with more frequent fluctuations and higher variance. In this work, we propose a novel framework called Trapezoidal Temporal Fusion (TTF) to address the above challenges. We introduce a trapezoidal multi-time series module to deal with data unalignment and SILO challenges, and output accurate predictions with a multi-tower structure called MT-FusionNet. The framework has been deployed to the online system for Douyin. Compared to the previously deployed online model, MAPEp decreased by 4.3%, and MAPEa decreased by 3.2%, where MAPEp denotes the point-wise MAPE of the LTV curve and MAPEa denotes the MAPE of the aggregated LTV.
Accelerating Long-Term Molecular Dynamics with Physics-Informed Time-Series Forecasting
Le, Hung, Abbas, Sherif, Nguyen, Minh Hoang, Do, Van Dai, Nguyen, Huu Hiep, Nguyen, Dung
Efficient molecular dynamics (MD) simulation is vital for understanding atomic-scale processes in materials science and biophysics. Traditional density functional theory (DFT) methods are computationally expensive, which limits the feasibility of long-term simulations. We propose a novel approach that formulates MD simulation as a time-series forecasting problem, enabling advanced forecasting models to predict atomic trajectories via displacements rather than absolute positions. We incorporate a physics-informed loss and inference mechanism based on DFT-parametrised pair-wise Morse potential functions that penalize unphysical atomic proximity to enforce physical plausibility. Our method consistently surpasses standard baselines in simulation accuracy across diverse materials. The results highlight the importance of incorporating physics knowledge to enhance the reliability and precision of atomic trajectory forecasting. Remarkably, it enables stable modeling of thousands of MD steps in minutes, offering a scalable alternative to costly DFT simulations.
TSKANMixer: Kolmogorov-Arnold Networks with MLP-Mixer Model for Time Series Forecasting
Hong, Young-Chae, Xiao, Bei, Chen, Yangho
Time series forecasting has long been a focus of research across diverse fields, including economics, energy, healthcare, and traffic management. Recent works have introduced innovative architectures for time series models, such as the Time-Series Mixer (TSMixer), which leverages multi-layer perceptrons (MLPs) to enhance prediction accuracy by effectively capturing both spatial and temporal dependencies within the data. In this paper, we investigate the capabilities of the Kolmogorov-Arnold Networks (KANs) for time-series forecasting by modifying TSMixer with a KAN layer (TSKANMixer). Experimental results demonstrate that TSKANMixer tends to improve prediction accuracy over the original TSMixer across multiple datasets, ranking among the top-performing models compared to other time series approaches. Our results show that the KANs are promising alternatives to improve the performance of time series forecasting by replacing or extending traditional MLPs.
- Asia (0.46)
- Oceania (0.14)
- North America > United States (0.14)
- Europe > United Kingdom (0.14)
- Health & Medicine (1.00)
- Banking & Finance (0.93)
- Energy > Renewable (0.46)
- Energy > Power Industry (0.46)
FSMLP: Modelling Channel Dependencies With Simplex Theory Based Multi-Layer Perceptions In Frequency Domain
Li, Zhengnan, Li, Haoxuan, Wang, Hao, Fang, Jun, Qin, Duoyin Li Yunxiao
Time series forecasting (TSF) plays a crucial role in various domains, including web data analysis, energy consumption prediction, and weather forecasting. While Multi-Layer Perceptrons (MLPs) are lightweight and effective for capturing temporal dependencies, they are prone to overfitting when used to model inter-channel dependencies. In this paper, we investigate the overfitting problem in channel-wise MLPs using Rademacher complexity theory, revealing that extreme values in time series data exacerbate this issue. To mitigate this issue, we introduce a novel Simplex-MLP layer, where the weights are constrained within a standard simplex. This strategy encourages the model to learn simpler patterns and thereby reducing overfitting to extreme values. Based on the Simplex-MLP layer, we propose a novel \textbf{F}requency \textbf{S}implex \textbf{MLP} (FSMLP) framework for time series forecasting, comprising of two kinds of modules: \textbf{S}implex \textbf{C}hannel-\textbf{W}ise MLP (SCWM) and \textbf{F}requency \textbf{T}emporal \textbf{M}LP (FTM). The SCWM effectively leverages the Simplex-MLP to capture inter-channel dependencies, while the FTM is a simple yet efficient temporal MLP designed to extract temporal information from the data. Our theoretical analysis shows that the upper bound of the Rademacher Complexity for Simplex-MLP is lower than that for standard MLPs. Moreover, we validate our proposed method on seven benchmark datasets, demonstrating significant improvements in forecasting accuracy and efficiency, while also showcasing superior scalability. Additionally, we demonstrate that Simplex-MLP can improve other methods that use channel-wise MLP to achieve less overfitting and improved performance. Code are available \href{https://github.com/FMLYD/FSMLP}{\textcolor{red}{here}}.
Enhancing Exchange Rate Forecasting with Explainable Deep Learning Models
Meng, Shuchen, Chen, Andi, Wang, Chihang, Zheng, Mengyao, Wu, Fangyu, Chen, Xupeng, Ni, Haowei, Li, Panfeng
Accurate exchange rate prediction is fundamental to financial stability and international trade, positioning it as a critical focus in economic and financial research. Traditional forecasting models often falter when addressing the inherent complexities and non-linearities of exchange rate data. This study explores the application of advanced deep learning models, including LSTM, CNN, and transformer-based architectures, to enhance the predictive accuracy of the RMB/USD exchange rate. Utilizing 40 features across 6 categories, the analysis identifies TSMixer as the most effective model for this task. A rigorous feature selection process emphasizes the inclusion of key economic indicators, such as China-U.S. trade volumes and exchange rates of other major currencies like the euro-RMB and yen-dollar pairs. The integration of grad-CAM visualization techniques further enhances model interpretability, allowing for clearer identification of the most influential features and bolstering the credibility of the predictions. These findings underscore the pivotal role of fundamental economic data in exchange rate forecasting and highlight the substantial potential of machine learning models to deliver more accurate and reliable predictions, thereby serving as a valuable tool for financial analysis and decision-making.
- North America > United States > New York (0.05)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Government (1.00)
- Banking & Finance > Trading (1.00)
- Banking & Finance > Economy (1.00)
Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Ilbert, Romain, Odonnat, Ambroise, Feofanov, Vasilii, Virmaux, Aladin, Paolo, Giuseppe, Palpanas, Themis, Redko, Ievgen
Transformer-based architectures achieved breakthrough performance in natural language processing and computer vision, yet they remain inferior to simpler linear baselines in multivariate long-term forecasting. To better understand this phenomenon, we start by studying a toy linear forecasting problem for which we show that transformers are incapable of converging to their true solution despite their high expressive power. We further identify the attention of transformers as being responsible for this low generalization capacity. Building upon this insight, we propose a shallow lightweight transformer model that successfully escapes bad local minima when optimized with sharpness-aware optimization. We empirically demonstrate that this result extends to all commonly used real-world multivariate time series datasets. In particular, SAMformer surpasses the current state-of-the-art model TSMixer by 14.33% on average, while having ~4 times fewer parameters. The code is available at https://github.com/romilbert/samformer.
- North America > United States > California (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting
Ekambaram, Vijay, Jati, Arindam, Nguyen, Nam, Sinthong, Phanwadee, Kalagnanam, Jayant
Transformers have gained popularity in time series forecasting for their ability to capture long-sequence interactions. However, their high memory and computing requirements pose a critical bottleneck for long-term forecasting. To address this, we propose TSMixer, a lightweight neural architecture exclusively composed of multi-layer perceptron (MLP) modules for multivariate forecasting and representation learning on patched time series. Inspired by MLP-Mixer's success in computer vision, we adapt it for time series, addressing challenges and introducing validated components for enhanced accuracy. This includes a novel design paradigm of attaching online reconciliation heads to the MLP-Mixer backbone, for explicitly modeling the time-series properties such as hierarchy and channel-correlations. We also propose a novel Hybrid channel modeling and infusion of a simple gating approach to effectively handle noisy channel interactions and generalization across diverse datasets. By incorporating these lightweight components, we significantly enhance the learning capability of simple MLP structures, outperforming complex Transformer models with minimal computing usage. Moreover, TSMixer's modular design enables compatibility with both supervised and masked self-supervised learning methods, making it a promising building block for time-series Foundation Models. TSMixer outperforms state-of-the-art MLP and Transformer models in forecasting by a considerable margin of 8-60%. It also outperforms the latest strong benchmarks of Patch-Transformer models (by 1-2%) with a significant reduction in memory and runtime (2-3X). The source code of our model is officially released as PatchTSMixer in the HuggingFace. Model: https://huggingface.co/docs/transformers/main/en/model_doc/patchtsmixer Examples: https://github.com/ibm/tsfm/#notebooks-links
- North America > United States > California > Los Angeles County > Long Beach (0.05)
- Asia > India > Karnataka > Bengaluru (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (4 more...)
AutoMixer for Improved Multivariate Time-Series Forecasting on Business and IT Observability Data
Palaskar, Santosh, Ekambaram, Vijay, Jati, Arindam, Gantayat, Neelamadhav, Saha, Avirup, Nagar, Seema, Nguyen, Nam H., Dayama, Pankaj, Sindhgatta, Renuka, Mohapatra, Prateeti, Kumar, Harshit, Kalagnanam, Jayant, Hemachandra, Nandyala, Rangaraj, Narayan
The efficiency of business processes relies on business key performance indicators (Biz-KPIs), that can be negatively impacted by IT failures. Business and IT Observability (BizITObs) data fuses both Biz-KPIs and IT event channels together as multivariate time series data. Forecasting Biz-KPIs in advance can enhance efficiency and revenue through proactive corrective measures. However, BizITObs data generally exhibit both useful and noisy inter-channel interactions between Biz-KPIs and IT events that need to be effectively decoupled. This leads to suboptimal forecasting performance when existing multivariate forecasting models are employed. To address this, we introduce AutoMixer, a time-series Foundation Model (FM) approach, grounded on the novel technique of channel-compressed pretrain and finetune workflows. AutoMixer leverages an AutoEncoder for channel-compressed pretraining and integrates it with the advanced TSMixer model for multivariate time series forecasting. This fusion greatly enhances the potency of TSMixer for accurate forecasts and also generalizes well across several downstream tasks. Through detailed experiments and dashboard analytics, we show AutoMixer's capability to consistently improve the Biz-KPI's forecasting accuracy (by 11-15\%) which directly translates to actionable business insights.
TSMixer: An All-MLP Architecture for Time Series Forecasting
Chen, Si-An, Li, Chun-Liang, Yoder, Nate, Arik, Sercan O., Pfister, Tomas
Real-world time-series datasets are often multivariate with complex dynamics. To capture this complexity, high capacity architectures like recurrent- or attention-based sequential deep learning models have become popular. However, recent work demonstrates that simple univariate linear models can outperform such deep learning models on several commonly used academic benchmarks. Extending them, in this paper, we investigate the capabilities of linear models for time-series forecasting and present Time-Series Mixer (TSMixer), a novel architecture designed by stacking multi-layer perceptrons (MLPs). TSMixer is based on mixing operations along both the time and feature dimensions to extract information efficiently. On popular academic benchmarks, the simple-to-implement TSMixer is comparable to specialized state-of-the-art models that leverage the inductive biases of specific benchmarks. On the challenging and large scale M5 benchmark, a real-world retail dataset, TSMixer demonstrates superior performance compared to the state-of-the-art alternatives. Our results underline the importance of efficiently utilizing cross-variate and auxiliary information for improving the performance of time series forecasting. We present various analyses to shed light into the capabilities of TSMixer. The design paradigms utilized in TSMixer are expected to open new horizons for deep learning-based time series forecasting. The implementation is available at https://github.com/google-research/google-research/tree/master/tsmixer
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (5 more...)