Improving Generalization of Alignment with Human Preferences through Group Invariant Learning
