ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models
