Principled Fine-tuning of LLMs from User-Edits: A Medley of Preference, Supervision, and Reward

Jun-23-2026, 03:21:23 GMT–Neural Information Processing Systems

We study how to fine-tune LLMs using user-edit deployment data, where each data point consists of a context, an agent's response, and the user's edits to that response. This deployment data is naturally generated by users in applications such as LLM-based writing assistants and coding agents. Because user edits arise naturally, they are a desirable source for adapting and personalizing LLMs. This setup unifies several feedback types, namely preferences, supervised labels, and cost, that are typically studied separately in the literature. In this paper, we initiate the theoretical investigation of learning from user edits.
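To make the unification concrete, here is a minimal sketch of how a single user-edit record could be read as three kinds of feedback at once. The names `UserEdit`, `levenshtein`, and `feedback_views` are hypothetical, and using character-level edit distance as the cost is an assumption made for illustration; the abstract does not specify how the paper defines the cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserEdit:
    """One deployment record (hypothetical field names)."""
    context: str   # prompt or document the agent saw
    response: str  # what the agent produced
    edited: str    # the response after the user's edits


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def feedback_views(record: UserEdit) -> dict:
    """Derive three feedback signals from one record (illustrative only)."""
    return {
        # Preference: the edited text is assumed preferred to the original.
        "preference": {
            "context": record.context,
            "chosen": record.edited,
            "rejected": record.response,
        },
        # Supervision: the edited text serves as a target for the context.
        "supervised": {"input": record.context, "target": record.edited},
        # Cost: assumed here to be how much editing the user had to do.
        "cost": levenshtein(record.response, record.edited),
    }
```

Under these assumptions, a record whose response needed no edits has cost 0 and yields a degenerate preference pair, which hints at why the three signals carry different information and may be worth combining.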

large language model, machine learning, natural language


Genre:
- Research Report > Experimental Study (1.00)
- Overview (0.67)

Industry:
- Education (0.68)
- Information Technology (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.93)
