DistrAttention: An Efficient and Flexible Self-Attention Mechanism on Modern GPUs
