AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
