Tool-Augmented Reward Modeling
Li, Lei, Chai, Yekun, Wang, Shuohuan, Sun, Yu, Tian, Hao, Zhang, Ningyu, Wu, Hua
arXiv.org Artificial Intelligence
Reward modeling (a.k.a. preference modeling) is instrumental for aligning large language models with human preferences, particularly within the context of reinforcement learning from human feedback (RLHF). While conventional reward models (RMs) have exhibited remarkable scalability, they often struggle with fundamental functionality such as arithmetic computation, code execution, and factual lookup. In this paper, we propose a tool-augmented preference modeling approach, named Themis, to address these limitations by empowering RMs with access to external environments, including calculators and search engines. This approach not only fosters synergy between tool utilization and reward grading but also enhances interpretive capacity and scoring reliability. Our study delves into the integration of external tools into RMs, enabling them to interact with diverse external sources and construct task-specific tool engagement and reasoning traces in an autoregressive manner. We validate our approach across a wide range of domains, incorporating seven distinct external tools. Our experimental results demonstrate a noteworthy overall improvement of 17.7% across eight tasks in preference ranking. Furthermore, our approach outperforms Gopher 280B by 7.3% on the TruthfulQA task in zero-shot evaluation. In human evaluations, RLHF trained with Themis attains an average win rate of 32% when compared to baselines across four distinct tasks. Additionally, we provide a comprehensive collection of tool-related RM datasets, incorporating data from seven distinct tool APIs, totaling 15,000 instances. We anticipate that this publicly available dataset will facilitate and inspire further research advancements in the field.
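The idea of grading a response by first consulting a tool and recording the interaction as a trace can be illustrated with a toy sketch. This is not the paper's implementation: in the described approach a learned reward model generates the tool call and reasoning autoregressively, whereas here a hard-coded rule stands in for the model. The names `calculator`, `score_with_tool`, and `prefer` are hypothetical, as is the trace format.

```python
import ast
import operator

# Hypothetical toy tool: evaluates +, -, *, / arithmetic without using eval().
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> float:
    """Evaluate a simple arithmetic expression by walking its syntax tree."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)


def score_with_tool(expression: str, answer: float):
    """Call the tool, record the observation, then grade the answer.

    Assumption: a fixed rule replaces the learned reward model here.
    """
    observation = calculator(expression)
    score = 1.0 if abs(observation - answer) < 1e-9 else 0.0
    verdict = "matches" if score else "contradicts"
    trace = [
        f"Action: calculator({expression!r})",
        f"Observation: {observation}",
        f"Rationale: answer {answer} {verdict} the tool result",
    ]
    return trace, score


def prefer(expression: str, answer_a: float, answer_b: float) -> str:
    """Preference ranking: return the label of the higher-scored answer."""
    _, score_a = score_with_tool(expression, answer_a)
    _, score_b = score_with_tool(expression, answer_b)
    return "A" if score_a >= score_b else "B"
```

Calling `prefer("12 * 7 + 5", 89, 88)` would rank answer A above answer B, and the trace returned by `score_with_tool` shows which tool observation led to that score, which is the kind of interpretability the abstract refers to.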
Oct-2-2023
- Country:
- South America > Colombia
- Meta Department > Villavicencio (0.04)
- North America
- Mexico (0.04)
- United States
- Rocky Mountains (0.04)
- Nevada (0.04)
- New York
- Richmond County > New York City (0.04)
- Queens County > New York City (0.04)
- New York County > New York City (0.04)
- Kings County > New York City (0.04)
- Bronx County > New York City (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Hawaii > Honolulu County
- Honolulu (0.04)
- California > Los Angeles County
- Long Beach (0.04)
- Glendale (0.04)
- Canada
- Rocky Mountains (0.04)
- Ontario > Toronto (0.04)
- Alberta > Census Division No. 15
- Improvement District No. 9 > Banff (0.04)
- Europe
- Germany (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Asia
- Middle East
- Jordan (0.04)
- Yemen > Amanat Al Asimah
- Sanaa (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Syria > Aleppo Governorate
- Aleppo (0.04)
- India > Karnataka
- Bengaluru (0.04)
- China > Beijing
- Beijing (0.04)
- Africa > Rwanda
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Government (0.68)
- Health & Medicine (0.46)
- Transportation > Ground (0.46)
- Technology: