Enhancing Reasoning Capabilities in SLMs with Reward Guided Dataset Distillation
