Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction