CREST: Effectively Compacting a Datastore For Retrieval-Based Speculative Decoding
