Decoding-Time Language Model Alignment with Multiple Objectives
Yifang Chen
–Neural Information Processing Systems
Aligning language models (LMs) to human preferences has emerged as a critical pursuit, enabling these models to better serve diverse user needs. Existing methods primarily focus on optimizing LMs for a single reward function, limiting their adaptability to varied objectives. Here, we propose multi-objective decoding (MOD), a decoding-time algorithm that outputs the next token from a linear combination of predictions of all base models, for any given weighting over different objectives. We exploit a common form among a family of f-divergence regularized alignment approaches (such as PPO, DPO, and their variants) to identify a closed-form solution by Legendre transform, and derive an efficient decoding strategy. Theoretically, we show why existing approaches can be sub-optimal even in natural settings and obtain optimality guarantees for our method.
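The abstract describes MOD as outputting the next token from a linear combination of the base models' predictions. Under KL-regularized alignment, such a combination reduces to a weighted geometric mean of the base policies' token distributions, i.e. a weighted sum of their log-probabilities followed by renormalization. The sketch below illustrates this idea on a toy vocabulary; the function name and shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mod_next_token_probs(base_logprobs, weights):
    """Combine next-token log-probabilities from several base-aligned
    models into one distribution (illustrative sketch, not the paper's code).

    base_logprobs: array of shape (n_models, vocab_size)
    weights:       array of shape (n_models,), summing to 1
    """
    base_logprobs = np.asarray(base_logprobs)
    weights = np.asarray(weights)
    combined = weights @ base_logprobs   # weighted sum of log-probs
    combined -= combined.max()           # numerical stability before exp
    probs = np.exp(combined)
    return probs / probs.sum()           # renormalize to a distribution

# Toy example: two "models" with mirror-image preferences over 3 tokens.
p1 = np.log([0.7, 0.2, 0.1])
p2 = np.log([0.1, 0.2, 0.7])
probs = mod_next_token_probs([p1, p2], [0.5, 0.5])
```

With equal weights and mirror-image inputs, the combined distribution is symmetric in the first and last tokens, and adjusting the weight vector shifts probability mass toward the objective favored by the larger weight.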
Mar-20-2025, 18:07:07 GMT