Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs

Open in new window