Inference-Time Personalized Alignment with a Few User Preference Queries

Jun-18-2026, 19:56:23 GMT–Neural Information Processing Systems

We study the problem of aligning a generative model's response with a user's preferences. Recent works have proposed several formulations for personalized alignment; however, they either require a large number of user preference queries or require that the preference be explicitly specified as a text input. In this paper, we propose a novel inference-time personalized alignment method, USERALIGN, that elicits the user's preferences through a few queries posed as pairwise response comparisons. In particular, USERALIGN builds on the theoretical framework of best-arm identification in logistic bandits and selects a personalized response from a fixed pool of the model's generated responses. The key idea is to treat the user's feedback as consistent and noise-free, and to incorporate it into the theoretical framework to identify the best response quickly.
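The selection idea in the abstract can be illustrated with a minimal sketch. This is not the USERALIGN algorithm itself. It is a simplified elimination scheme under stated assumptions: each response in the pool has a feature vector, the user's utility is linear in those features, and answers are consistent and noise-free. The names `select_response`, `features`, and `ask_user` are hypothetical and chosen only for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def select_response(features, ask_user, n_samples=2000, max_queries=10, seed=0):
    """Pick one response from a fixed pool using a few pairwise queries.

    features: one feature vector per candidate response (assumed given).
    ask_user(i, j): returns True if the user prefers response i over j.
    Returns (chosen_index, number_of_queries_asked).
    """
    rng = random.Random(seed)
    dim = len(features[0])
    # Sampled preference vectors stand in for the hypotheses that are
    # still consistent with everything the user has said so far.
    thetas = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_samples)]

    def count_votes():
        # For each surviving hypothesis, record which response it ranks first.
        votes = {}
        for theta in thetas:
            best = max(range(len(features)), key=lambda k: dot(theta, features[k]))
            votes[best] = votes.get(best, 0) + 1
        return votes

    queries = 0
    votes = count_votes()
    while len(votes) > 1 and queries < max_queries:
        # Ask about the two responses most often ranked first.
        i, j = sorted(votes, key=lambda k: (-votes[k], k))[:2]
        prefers_i = ask_user(i, j)
        queries += 1
        winner, loser = (i, j) if prefers_i else (j, i)
        # Noise-free assumption: drop every hypothesis that disagrees.
        kept = [t for t in thetas if dot(t, features[winner]) > dot(t, features[loser])]
        if not kept:
            break
        thetas = kept
        votes = count_votes()

    chosen = max(votes, key=lambda k: (votes[k], -k))
    return chosen, queries
```

Because every answer is trusted, each query removes at least the losing response from contention, so the loop needs at most one query fewer than the pool size. With noisy feedback this hard elimination would be unsafe, and a confidence-based rule would be needed instead.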

large language model, machine learning, useralign

Country:
- Europe (0.28)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Personal Assistant Systems (0.71)
  - Natural Language > Large Language Model (0.48)
  - Machine Learning > Neural Networks
    - Deep Learning (0.48)
