Token-Level Self-Play with Importance-Aware Guidance for Large Language Models
