Efficient Continual Pre-training by Mitigating the Stability Gap
