Gain Tuning Is Not What You Need: Reward Gain Adaptation for Constrained Locomotion Learning