Eliciting User Preferences for Personalized Multi-Objective Decision Making through Comparative Feedback
Neural Information Processing Systems
In this work, we propose a multi-objective decision making framework that accommodates different user preferences over objectives, where preferences are learned via policy comparisons. Our model consists of a known Markov decision process with a vector-valued reward function, with each user having an unknown preference vector that expresses the relative importance of each objective. The goal is to efficiently compute a near-optimal policy for a given user. We consider two user feedback models. We first address the case where a user is provided with two policies and returns their preferred policy as feedback. We then move to a different user feedback model, where a user is instead provided with two small weighted sets of representative trajectories and selects the preferred one. In both cases, we suggest an algorithm that finds a nearly optimal policy for the user using a number of comparison queries that scales quasilinearly in the number of objectives.
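Below is a minimal illustrative sketch, not the paper's algorithm, of the setup the abstract describes: a known MDP with a vector-valued reward, a hidden per-user preference vector that scalarizes the reward, and a comparison oracle that answers which of two policies the user prefers. All sizes, names, and the random MDP are hypothetical assumptions introduced only for illustration.

```python
import numpy as np

# Hypothetical sketch of the model in the abstract (not the authors' method):
# a known MDP with d-dimensional rewards, a hidden user preference vector w,
# and policy-comparison feedback based on the scalarized value w . V^pi.

n_states, n_actions, d, gamma = 4, 2, 3, 0.9
rng = np.random.default_rng(0)

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transition kernel
R = rng.random((n_states, n_actions, d))                          # vector-valued rewards
w_true = rng.dirichlet(np.ones(d))                                 # unknown user preference

def policy_value(pi):
    """Vector-valued value of a deterministic policy pi (one action per state)."""
    P_pi = P[np.arange(n_states), pi]        # (S, S) transitions under pi
    R_pi = R[np.arange(n_states), pi]        # (S, d) rewards under pi
    # Solve (I - gamma * P_pi) V = R_pi, one column per reward dimension.
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
    return V.mean(axis=0)                    # average over a uniform start distribution

def prefers(pi_a, pi_b):
    """Policy-comparison feedback: True if the user prefers pi_a to pi_b."""
    return w_true @ policy_value(pi_a) >= w_true @ policy_value(pi_b)

pi_1 = np.zeros(n_states, dtype=int)
pi_2 = np.ones(n_states, dtype=int)
print("User prefers pi_1 over pi_2:", prefers(pi_1, pi_2))
```

In the paper's second feedback model, the same kind of oracle would instead compare two small weighted sets of representative trajectories rather than full policies; the sketch above only illustrates the policy-comparison case.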
May-28-2025, 16:58:21 GMT