Learning to Reason via Self-Iterative Process Feedback for Small Language Models
