Learning From Correctness Without Prompting Makes LLM Efficient Reasoner
