Group Robust Preference Optimization in Reward-free RLHF

Oct-10-2025, 00:29:06 GMT–Neural Information Processing Systems

While these data often come from diverse labelers' groups (e.g., different demographics, ethnicities, company teams, etc.), traditional RLHF approaches

Country:
- Europe > Switzerland
  - Zürich > Zürich (0.04)
- Asia
  - Japan (0.04)
  - India (0.04)
  - China (0.04)
- Africa
  - Nigeria (0.04)
  - Middle East > Egypt (0.04)

Genre:
- Research Report > Experimental Study (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (0.86)
