HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
Wang, Zhilin, Zeng, Jiaqi, Delalleau, Olivier, Shin, Hoo-Chang, Soares, Felipe, Bukharin, Alexander, Evans, Ellie, Dong, Yi, Kuchaiev, Oleksii
–arXiv.org Artificial Intelligence
Preference datasets are essential for training general-domain, instruction-following language models with Reinforcement Learning from Human Feedback (RLHF). Each new data release raises expectations for future data collection, so there is a constant need to advance the quality and diversity of openly available preference data. To address this need, we introduce HelpSteer3-Preference, a permissively licensed (CC-BY-4.0), high-quality, human-annotated preference dataset comprising over 40,000 samples. These samples span diverse real-world applications of large language models (LLMs), including tasks relating to STEM, coding, and multilingual scenarios. Using HelpSteer3-Preference, we train Reward Models (RMs) that achieve top performance on RM-Bench (82.4%) and JudgeBench (73.7%). This represents a substantial improvement (~10% absolute) over the previously best-reported results from existing RMs. We also demonstrate that HelpSteer3-Preference can be used to train Generative RMs, and we show how policy models can be aligned with RLHF using our RMs.
Dataset (CC-BY-4.0): https://huggingface.co/datasets/nvidia/HelpSteer3#preference
Models (NVIDIA Open Model): https://huggingface.co/collections/nvidia/reward-models-68377c5955575f71fcc7a2a3
Oct-27-2025
- Country:
- Africa > Mozambique
- Gaza Province > Xai-Xai (0.04)
- Asia > Middle East
- Jordan (0.04)
- Saudi Arabia > Asir Province
- Abha (0.04)
- Europe
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Spain (0.04)
- United Kingdom > England (0.04)
- North America
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Florida > Miami-Dade County
- Miami (0.04)
- Hawaii (0.04)
- Virginia (0.04)
- Pacific Ocean > North Pacific Ocean
- San Francisco Bay > Golden Gate (0.04)
- South America > Colombia
- Meta Department > Villavicencio (0.04)
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Education > Educational Setting
- K-12 Education (0.45)
- Information Technology (1.00)
- Leisure & Entertainment (1.00)
- Media > Music (1.00)
- Technology: