Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation