Fourier-Mixed Window Attention: Accelerating Informer for Long Sequence Time-Series Forecasting
arXiv.org Artificial Intelligence
Recent progress in long sequence time-series forecasting (LSTF) has been led either by transformers with sparse attention ([16] and references therein) or by attention combined with signal preprocessing, such as seasonal-trend decomposition [17] or auto-correlation to account for periodicity in the data [13]. On the other hand, the Fourier transform has been proposed as an alternative mixing tool in lieu of standard attention [12] to speed up prediction in natural language processing (NLP) tasks (FNet, [2]). Although the Fourier transform is meant to mimic the mixing functions of a multilayer perceptron (MLP, [11]), it is not well understood why it works and when assistance from attention layers remains necessary to maintain performance. In computer vision (CV), the Fourier transform is also used as a filtering step in the early stages of a transformer (GFNet, [8]) to enhance a fully attention-based architecture. A recent advance in CV is to adopt window attention to reduce the quadratic complexity of full attention [12].
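To make the two building blocks named above concrete, here is a minimal NumPy sketch, not the paper's method. It assumes FNet-style mixing (the real part of a 2D discrete Fourier transform over the sequence and hidden axes) and a simplified window attention with non-overlapping windows and no learned projections (queries, keys, and values are all the input itself). The function names `fourier_mix` and `window_attention` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fourier_mix(x: np.ndarray) -> np.ndarray:
    """Parameter-free token mixing in the style of FNet.

    x has shape (seq_len, d_model). The 2D DFT mixes information across
    both the sequence and hidden axes; only the real part is kept.
    """
    return np.fft.fft2(x).real


def window_attention(x: np.ndarray, window: int) -> np.ndarray:
    """Softmax self-attention restricted to non-overlapping windows.

    Simplifying assumption: Q = K = V = x (no learned projections).
    Each window costs O(window^2), so the total cost is
    O(seq_len * window) instead of O(seq_len^2) for full attention.
    """
    seq_len, d_model = x.shape
    if seq_len % window != 0:
        raise ValueError("seq_len must be divisible by window")

    # Split the sequence into (num_windows, window, d_model) blocks.
    blocks = x.reshape(seq_len // window, window, d_model)

    # Scaled dot-product scores inside each window only.
    scores = blocks @ blocks.transpose(0, 2, 1) / np.sqrt(d_model)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then restore the original layout.
    return (weights @ blocks).reshape(seq_len, d_model)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.standard_normal((8, 4))  # 8 time steps, 4 features
    print(fourier_mix(series).shape)
    print(window_attention(series, window=4).shape)
```

With `window` equal to the sequence length, the second function reduces to full attention over the whole sequence, which is the quadratic case that windowing is meant to avoid.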
Jul-2-2023
- Country:
- North America > United States
- California > Orange County > Irvine (0.14)
- Europe > Italy
- Calabria > Catanzaro Province > Catanzaro (0.04)
- Genre:
- Research Report (0.50)
- Technology: