Compositional Bias Control in Large Language Models: Preference Learning Fails, Supervision Succeeds