Smoothing DiLoCo with Primal Averaging for Faster Training of LLMs