Learning From Correctness Without Prompting Makes LLM Efficient Reasoner