SSPO: Self-traced Step-wise Preference Optimization for Process Supervision and Reasoning Compression
