Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Neural Information Processing Systems
Transformer-based models, which have recently seen a surge in popularity due to their good performance and applicability to a variety of tasks, have a similar problem.