Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning

Mar-20-2026, 20:19:12 GMT–Neural Information Processing Systems

Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences.

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)