SEMA: a Scalable and Efficient Mamba like Attention via Token Localization and Averaging
