SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models
