Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
