Trainable Dynamic Mask Sparse Attention