READER: Retrieval-Assisted Drafter for Efficient LLM Inference
