Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction
