Goto

Collaborating Authors

 Energy



Introducing Spectral Attention for Long-Range Dependency in Time Series Forecasting

Neural Information Processing Systems

Spectral Attention preserves long-period trends through a low-pass filter and facilitates gradient to flow between samples. Spectral Attention can be seamlessly integrated into most sequence models, allowing models with fixed-sized look-back windows to capture long-range dependencies over thousands of steps.



A Practitioner's Guide to Continual Multimodal Pretraining

Neural Information Processing Systems

However, practical model deployment often operates in the gap between these two limit cases, as real-world applications demand adaptation to specific subdomains, tasks or concepts -- spread over the entire, varying life cycle of a model.