A Survey on Progress in LLM Alignment from the Perspective of Reward Design
Ji, Miaomiao; Wu, Yanqiu; Wu, Zhibin; Wang, Shoujin; Yang, Jian; Dras, Mark; Naseem, Usman
arXiv.org Artificial Intelligence
Reward design plays a pivotal role in aligning large language models (LLMs) with human values, serving as the bridge between feedback signals and model optimization. This survey provides a structured organization of reward modeling and addresses three key aspects: mathematical formulation, construction practices, and interaction with optimization paradigms. Building on this, it develops a macro-level taxonomy that characterizes reward mechanisms along complementary dimensions, thereby offering both conceptual clarity and practical guidance for alignment research. The progression of LLM alignment can be understood as a continuous refinement of reward design strategies, with recent developments highlighting paradigm shifts from reinforcement learning (RL)-based to RL-free optimization and from single-task to multi-objective and complex settings.
Sep-3-2025