Panacea: Pareto Alignment via Preference Adaptation for LLMs

May-27-2025, 08:03:12 GMT–Neural Information Processing Systems

However, this convention tends to oversimplify the multi-dimensional and heterogeneous nature of human preferences, leading to reduced expressivity and even misalignment. This paper presents Panacea, an approach that reframes alignment as a multi-dimensional preference optimization problem. Panacea trains a single model that adapts online and Pareto-optimally to diverse sets of preferences without further tuning. A major challenge is using a low-dimensional preference vector to guide the model's behavior, even though that behavior is governed by an overwhelmingly large number of parameters. To address this, Panacea uses singular value decomposition (SVD)-based low-rank adaptation, which allows the preference vector to be injected online as singular values.
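The idea of injecting a preference vector as singular values can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `PreferenceSVDAdapter`, the `pref_scale` factor, the random initialization, and the exact layout (learned singular values followed by scaled preference entries) are all assumptions made for illustration. The sketch only shows how one set of adapter factors could yield different effective weights when the preference vector changes at inference time.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def diag(values):
    """Build a square diagonal matrix from a list of values."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class PreferenceSVDAdapter:
    """Hypothetical SVD-style low-rank adapter: W = W0 + U diag(s) V^T.

    Assumption: the diagonal s holds `rank` learned singular values followed
    by the preference vector entries multiplied by `pref_scale`.
    """

    def __init__(self, d_out, d_in, rank, n_prefs, seed=0, pref_scale=1.0):
        rng = random.Random(seed)
        total = rank + n_prefs
        self.w0 = random_matrix(d_out, d_in, rng)   # stands in for frozen base weights
        self.u = random_matrix(d_out, total, rng)   # left factors (would be trained)
        self.v = random_matrix(d_in, total, rng)    # right factors (would be trained)
        self.sigma = [rng.gauss(0.0, 0.1) for _ in range(rank)]  # learned values
        self.pref_scale = pref_scale
        self.n_prefs = n_prefs

    def effective_weight(self, preference):
        """Return W0 + U diag([sigma, pref_scale * preference]) V^T."""
        if len(preference) != self.n_prefs:
            raise ValueError("preference vector has the wrong length")
        singular = self.sigma + [self.pref_scale * p for p in preference]
        delta = matmul(matmul(self.u, diag(singular)), transpose(self.v))
        return [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(self.w0, delta)]


# Same adapter, two different preference vectors supplied at inference time.
adapter = PreferenceSVDAdapter(d_out=4, d_in=3, rank=2, n_prefs=2)
w_first = adapter.effective_weight([1.0, 0.0])   # all weight on dimension 1
w_second = adapter.effective_weight([0.0, 1.0])  # all weight on dimension 2
w_mixed = adapter.effective_weight([0.5, 0.5])   # equal trade-off
```

In this sketch no parameter is updated between the three calls; only the preference entries on the diagonal change, which is the sense in which the vector is "injected online". A real system would apply such an adapter to the layers of a language model and train the factors, which is outside the scope of this example.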

panacea, pareto alignment, preference adaptation

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.58)
  - Representation & Reasoning > Optimization (0.42)