Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
