Round Attention: A Novel Round-Level Attention Mechanism to Accelerate LLM Inference
