Training Optimal Large Diffusion Language Models