Towards Active Synthetic Data Generation for Finetuning Language Models

Kessler, Samuel, Xia, Menglin, Diaz, Daniel Madrigal, Han, Dongge, Heshemi, Helia, Rajmohan, Saravan, Ruehle, Victor, Ash, Jordan T.

arXiv.org Artificial Intelligence 

Large Language Models (LLMs) have shown remarkable abilities across a wide variety of reasoning and factual knowledge tasks (Achiam et al., 2023; Bubeck et al., 2023; Katz et al., 2024), but their large size makes inference expensive. With the advent of agentic systems that interact with the external world, LLMs are poised to become even more ubiquitous in science, technology, and society, yet their tremendous inference cost presents a challenge to realizing the full potential of these agents. One way to curb the computational expense of LLM inference is to use small language models (SLMs). With orders of magnitude fewer parameters, SLMs are faster, cheaper, and easier to finetune for specialised skills such as tool use, making them natural candidates for specialization on proprietary data or within agentic systems (Belcak et al., 2025). Training language models typically involves three stages: pre-training on large general-purpose corpora, supervised finetuning (SFT), and reinforcement learning from human feedback (RLHF) or from verifiable rewards (RLVR) (Ouyang et al., 2022).
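For the SFT stage mentioned above, a common implementation detail is that the loss is computed only on response tokens, with prompt positions masked out. A minimal sketch of that label-masking convention, assuming the widely used sentinel value -100 as the ignore index for cross-entropy (the function name and token values here are illustrative, not from the paper):

```python
def sft_labels(token_ids, prompt_len, ignore_index=-100):
    """Build supervised-finetuning labels for one example.

    Positions belonging to the prompt are replaced with ignore_index so
    a standard cross-entropy loss skips them; the model is trained to
    predict only the response tokens.
    """
    return [ignore_index] * prompt_len + list(token_ids[prompt_len:])


# Hypothetical tokenised example: 3 prompt tokens followed by 2 response tokens.
labels = sft_labels([11, 42, 7, 99, 5], prompt_len=3)
# → [-100, -100, -100, 99, 5]
```

In practice a framework's data collator would apply this per example before batching; the point of the sketch is simply that the SFT objective is next-token prediction restricted to the response span.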
