ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions
