Flex Attention: A Programming Model for Generating Optimized Attention Kernels
