Aligning LLM Agents by Learning Latent Preference from User Edits

Mar-22-2026, 21:45:36 GMT - Neural Information Processing Systems

We study interactive learning of language agents based on user edits made to the agent's output. In a typical setting such as a writing assistant, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent's response to personalize it according to their latent preference, in addition to improving its correctness. This edit feedback is generated naturally, which makes it a suitable candidate for improving the agent's alignment with the user's preference and for reducing the cost of user edits over time. We propose a learning framework, PRELUDE, that infers a description of the user's latent preference from historic edit data and uses it to define a prompt policy that drives future response generation. This avoids fine-tuning the agent, which is costly, challenging to scale with the number of users, and may even degrade its performance on other tasks.
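The loop described above (observe an edit, infer a textual preference, condition the next prompt on it) can be illustrated with a minimal sketch. This is not the paper's implementation: the class `PreferenceAgent`, its methods, and the toy stand-ins `toy_infer` and `toy_generate` are hypothetical names introduced here, and the two callables stand in for what would presumably be LLM calls in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# One past interaction: (context, agent response, user-edited response).
Edit = Tuple[str, str, str]


@dataclass
class PreferenceAgent:
    """Hypothetical sketch: stores edit history and a textual preference."""
    # Assumed stand-ins for LLM calls; neither name comes from the source.
    infer_preference: Callable[[List[Edit]], str]
    generate: Callable[[str], str]
    history: List[Edit] = field(default_factory=list)
    preference: str = ""

    def build_prompt(self, context: str) -> str:
        # Prompt policy: condition generation on the inferred preference
        # instead of fine-tuning the underlying model.
        if not self.preference:
            return context
        return f"User preference: {self.preference}\n\n{context}"

    def respond(self, context: str) -> str:
        return self.generate(self.build_prompt(context))

    def observe_edit(self, context: str, response: str, edited: str) -> None:
        # In this sketch, only a changed response counts as feedback.
        if edited != response:
            self.history.append((context, response, edited))
            self.preference = self.infer_preference(self.history)


def toy_infer(history: List[Edit]) -> str:
    # Toy rule (assumption): edits that always shorten the draft mean "concise".
    shorter = all(len(edited) < len(resp) for _, resp, edited in history)
    return "concise" if shorter else "detailed"


def toy_generate(prompt: str) -> str:
    # Toy generator (assumption): reacts only to the preference header.
    if prompt.startswith("User preference: concise"):
        return "short"
    return "a much longer draft"


agent = PreferenceAgent(toy_infer, toy_generate)
first = agent.respond("Summarize X")            # no preference known yet
agent.observe_edit("Summarize X", first, "short")
second = agent.respond("Summarize Y")           # now conditioned on the preference
```

Because the preference lives in the prompt rather than in model weights, one such object per user would keep personalization separate from the shared model, which is the scaling argument the abstract makes against fine-tuning.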

large language model, machine learning, natural language

Conferences Web Page

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (0.54)
  - Representation & Reasoning (0.42)
  - Natural Language > Large Language Model (0.36)