Training Language Models to Reason Efficiently