Superficial Self-Improved Reasoners Benefit from Model Merging
