RRHF (1)
Neural Information Processing Systems
RRHF can align a model not only with human preferences but also with any other preferences. As a large language model, Wombat may generate unsafe responses. We also conduct experiments on the IMDB dataset to assess the generation of positive movie reviews. The task expects the model to produce positive, fluent movie review completions from partial review input texts. RRHF-OP-128 follows the bottommost workflow in Figure 2 of the main text.
Feb-8-2026, 23:56:44 GMT
- Country:
- Oceania
- Australia > Tasmania (0.05)
- New Zealand (0.05)
- Industry:
- Leisure & Entertainment (0.56)
- Media > Film (0.56)
- Technology: