Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models

Open in new window