Gradient Multi-Normalization for Stateless and Scalable LLM Training