Angles Don't Lie: Unlocking Training‑Efficient RL Through the Model's Own Signals
