Reinforcement Learning from Human Feedback: Whose Culture, Whose Values, Whose Perspectives?

Barman, Kristian González, Lohse, Simon, de Regt, Henk

arXiv.org Artificial Intelligence 

This approach is particularly useful when designing AI systems for tasks where it is difficult to specify a precise reward function, or when it is important to align the model's behaviour with certain human expectations and values. For instance, RLHF has notably improved language models for context-aware text generation (Ziegler et al. 2020) and taught robots to navigate cluttered environments (Henry et al. 2010). RLHF is commonly employed in the later stages of fine-tuning, particularly in the development of prominent Large Language Models (LLMs) such as GPT-3.5 or GPT-4. Initially, these models are trained on vast text corpora to grasp a broad range of language patterns and contexts. This foundational training is supplemented by task-specific fine-tuning, in which the models are adjusted to excel in particular applications, such as understanding and generating dialogues. The refinement process is then further enhanced through RLHF.
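The core of the RLHF stage described above is learning a reward function from human preference judgements rather than specifying one by hand. A minimal sketch of that preference-learning step is given below, assuming a toy linear reward model fitted to hypothetical pairwise comparisons with the Bradley-Terry logistic loss; the feature vectors, function names, and data are illustrative assumptions, not taken from any real RLHF system.

```python
import math

def reward(w, x):
    # Toy linear reward model: r(x) = w . x (hypothetical featurisation).
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(prefs, dim, lr=0.1, epochs=200):
    """Fit reward weights to pairwise human preferences.

    prefs: list of (preferred_features, rejected_features) pairs,
    standing in for annotator comparisons of two model outputs.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for x_pos, x_neg in prefs:
            # Bradley-Terry model: P(preferred) = sigmoid(r(x_pos) - r(x_neg))
            diff = reward(w, x_pos) - reward(w, x_neg)
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of the negative log-likelihood -log p w.r.t. w.
            g = p - 1.0
            for i in range(dim):
                w[i] -= lr * g * (x_pos[i] - x_neg[i])
    return w

# Hypothetical data: annotators prefer outputs scoring high on feature 0.
prefs = [([1.0, 0.2], [0.1, 0.9]),
         ([0.8, 0.5], [0.2, 0.4])]
w = train_reward_model(prefs, dim=2)
```

In a full RLHF pipeline this learned reward would then drive a policy-optimisation step (e.g. PPO) that adjusts the language model itself; the sketch only covers the reward-modelling stage.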
