Sirius: Contextual Sparsity with Correction for Efficient LLMs
