Learning Goal-Conditioned Representations for Language Reward Models

Neural Information Processing Systems 

Nevertheless, it is unclear how improved representation learning can benefit reinforcement learning from human feedback on language models.
