SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
