Compositional Bias Control in Large Language Models: Preference Learning Fails, Supervision Succeeds
