A Unified Understanding of Offline Data Selection and Online Self-refining Generation for Post-training LLMs
