REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models
