CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models