SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
