Pre-Trained Policy Discriminators are General Reward Models

Jun-23-2026, 03:36:28 GMT–Neural Information Processing Systems

We offer a novel perspective on reward modeling by formulating it as a policy discriminator, which quantifies the difference between two policies to generate a reward signal, guiding the training policy towards a target policy with desired behaviors. Based on this conceptual insight, we propose a scalable pre-training method named POLicy DiscriminAtive LeaRning (POLAR), which trains a reward model (RM) to discern identical policies and discriminate between different ones. Unlike traditional reward modeling methods that rely on absolute preferences, POLAR captures the relative difference between one policy and an arbitrary target policy, a scalable, high-level optimization objective suitable for modeling generic ranking relationships. Leveraging the POLAR pre-training paradigm, we present a series of RMs with parameter scales from 1.8B to 7B. Empirical results show that POLAR substantially outperforms traditional non-pre-trained methods, significantly enhancing RM performance. For instance, compared to SOTA baselines, POLAR-7B improves preference accuracy from 54.8% to 81.0% on STEM tasks and from 57.9% to 85.5% on creative writing tasks. POLAR also shows robust generalization capabilities in RLHF using Reinforcement Fine-tuning (RFT), providing reliable reward signals and markedly enhancing policy performance, improving LLaMa3.1-8B
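
The abstract does not spell out the training objective, so the following is only a minimal sketch of the idea of rewarding "same policy as the target" over "different policy". It assumes a Bradley-Terry style pairwise loss; the function name and the scalar scores are hypothetical stand-ins for the outputs of a reward model, not POLAR's actual interface.

```python
import math


def pairwise_discrimination_loss(score_same: float, score_diff: float) -> float:
    """Assumed Bradley-Terry pairwise loss (illustrative, not taken from the source).

    score_same: hypothetical RM score for a sample drawn from the same policy
                as the reference sample.
    score_diff: hypothetical RM score for a sample drawn from a different policy.
    Returns -log(sigmoid(score_same - score_diff)).
    """
    margin = score_same - score_diff
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow for large |m|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # The loss shrinks as the same-policy sample is scored further above the other one.
    for same, diff in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]:
        print(same, diff, round(pairwise_discrimination_loss(same, diff), 4))
```

Under this assumption, minimizing the loss pushes the RM to score samples from the target policy above samples from any other policy, which is one way to obtain a relative rather than absolute reward signal.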

Country:
- Asia (0.67)
- North America > United States
  - Minnesota (0.27)

Genre:
- Overview (0.92)
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.87)

Industry:
- Education (0.46)
- Media (0.45)
- Leisure & Entertainment (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
