Imbalanced Gradients in RL Post-Training of Multi-Task LLMs