On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
