On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
Cai, Deng; Li, Huayang; Fu, Tingchen; Li, Siheng; Xu, Weiwen; Li, Shuaiyi; Cao, Bowen; Zhang, Zhisong; Huang, Xinting; Cui, Leyang; Wang, Yan; Liu, Lemao; Watanabe, Taro; Shi, Shuming
arXiv.org Artificial Intelligence
Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting. This interchangeability establishes a triangular framework with six transformation directions, each of which facilitates a variety of applications. Our work offers a holistic view that unifies numerous existing studies and suggests potential research directions. We envision our work as a useful roadmap for future research on LLMs.
Jun-24-2024
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Texas > Travis County
- Austin (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- California > Los Angeles County
- Long Beach (0.04)
- Canada
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- Germany > Berlin (0.04)
- France (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Romania > Sud - Muntenia Development Region
- Giurgiu County > Giurgiu (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Asia
- Singapore (0.04)
- China > Hong Kong (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Genre:
- Overview (0.68)
- Research Report (0.63)
- Technology: