Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning
