Spike No More: Stabilizing the Pre-training of Large Language Models