Star Attention: Efficient LLM Inference over Long Sequences
