Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
