Preference-Guided Reflective Sampling for Aligning Language Models
