DP-Dueling: Learning from Preference Feedback without Compromising User Privacy

Saha, Aadirupa, Asi, Hilal

arXiv.org Artificial Intelligence 

Research has indicated that it is often more convenient, faster, and cost-effective to gather feedback in a relative manner rather than through absolute ratings [31, 40]. For example, when assessing an individual's preference between two items A and B, it is typically easier for respondents to answer a preference-oriented query such as "Which item do you prefer, A or B?" than to rate items A and B on a scale from 0 to 10. From the perspective of a system designer, leveraging this user preference data can significantly enhance system performance, especially when it can be collected in a relative and online fashion. This applies to many real-world scenarios in which human preferences are gathered online, including survey design, expert reviews, product selection, search engine optimization, recommender systems, crowd-sourcing platforms, training bots, multiplayer game rankings, online retail, and even broader reinforcement learning problems with complex reward structures, where it is often easier to elicit preference feedback than to rely on absolute ratings or rewards.

Because of its broad utility and the simplicity of gathering data through relative feedback, learning from preferences has become highly popular in the machine learning community. It has been studied extensively over the past decade under the name "Dueling Bandits" (DB), an extension of the traditional multi-armed bandit (MAB) setting [4]. In the DB framework, the goal is to identify a set of 'good' options from a fixed decision space using only relative preference feedback.
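To make the pairwise feedback model concrete, the following is a minimal sketch of a dueling-bandit interaction. It is not the paper's DP-Dueling algorithm: the latent arm utilities, the Bradley-Terry-style comparison model, the round-robin query schedule, and the Borda-style winner estimate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent utilities for each arm (item); the learner never sees these.
utilities = np.array([0.1, 0.5, 0.9, 0.3])


def duel(i: int, j: int) -> int:
    """Return 1 if arm i wins the pairwise comparison against arm j, else 0.

    Assumes a Bradley-Terry-style model: the probability that i beats j is a
    logistic function of the utility gap.
    """
    p_i_beats_j = 1.0 / (1.0 + np.exp(-(utilities[i] - utilities[j])))
    return int(rng.random() < p_i_beats_j)


# Naive exploration: repeatedly query random pairs, track empirical win rates,
# and report the arm with the highest average win rate as the estimated winner.
n_arms, n_rounds = len(utilities), 2000
wins = np.zeros((n_arms, n_arms))
plays = np.zeros((n_arms, n_arms))

for _ in range(n_rounds):
    i, j = rng.choice(n_arms, size=2, replace=False)
    outcome = duel(i, j)
    wins[i, j] += outcome
    wins[j, i] += 1 - outcome
    plays[i, j] += 1
    plays[j, i] += 1

win_rate = wins.sum(axis=1) / np.maximum(plays.sum(axis=1), 1)
print("estimated best arm:", int(np.argmax(win_rate)))
```

The only information the learner receives here is the binary outcome of each duel, never a numeric rating, which is precisely the feedback constraint that distinguishes the DB setting from the standard MAB setting.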
