RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment
