MAVIS: Multi-Objective Alignment via Value-Guided Inference-Time Search
Jeremy Carleton, Debajoy Mukherjee, Srinivas Shakkottai, Dileep Kalathil
–arXiv.org Artificial Intelligence
Large Language Models (LLMs) are increasingly deployed across diverse applications that demand balancing multiple, often conflicting, objectives -- such as helpfulness, harmlessness, or humor. Aligning outputs to user-specific preferences in such multi-objective settings typically requires fine-tuning models for each objective or preference configuration, which is computationally expensive and inflexible. We introduce MAVIS -- Multi-Objective Alignment via Value-Guided Inference-Time Search -- a lightweight inference-time alignment framework that enables dynamic control over LLM behavior without modifying the base model's weights. MAVIS trains a set of small value models, each corresponding to a distinct objective. At inference time, these value models are combined using user-specified weights to produce a tilting function that adjusts the base model's output distribution toward desired trade-offs. The value models are trained using a simple iterative algorithm that ensures monotonic improvement of the KL-regularized policy. We show empirically that MAVIS outperforms baselines that fine-tune per-objective models and combine them post hoc, and even approaches the performance of the idealized setting where models are fine-tuned for a user's exact preferences.
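The core mechanism described above — combining per-objective value models with user-specified weights into a tilting function over the base model's next-token distribution — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the dictionary-based interface, and the exact tilting form p_tilted(y) ∝ p_base(y) · exp(Σᵢ wᵢ Vᵢ(y) / β) are assumptions consistent with standard KL-regularized decoding; the abstract does not specify the precise parameterization.

```python
import math

def tilted_distribution(base_logprobs, value_scores, weights, beta=1.0):
    """Reweight a base model's next-token distribution using weighted
    per-objective value scores (hypothetical sketch of value-guided tilting):

        p_tilted(y) ∝ p_base(y) * exp( sum_i w_i * V_i(y) / beta )

    base_logprobs: dict token -> log p_base(token) from the frozen base model
    value_scores:  dict objective -> (dict token -> V_i(token)) from the
                   small per-objective value models
    weights:       dict objective -> user-specified weight w_i
    beta:          KL-regularization temperature (higher = stay closer to base)
    """
    tilted_logits = {}
    for tok, lp in base_logprobs.items():
        tilt = sum(weights[obj] * value_scores[obj][tok] for obj in weights)
        tilted_logits[tok] = lp + tilt / beta

    # Normalize with log-sum-exp for numerical stability.
    m = max(tilted_logits.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in tilted_logits.values()))
    return {tok: math.exp(v - log_z) for tok, v in tilted_logits.items()}

# Toy usage: two candidate tokens, one "helpfulness" value model.
base = {"a": math.log(0.5), "b": math.log(0.5)}
values = {"helpful": {"a": 1.0, "b": 0.0}}

uniform = tilted_distribution(base, values, {"helpful": 0.0})
steered = tilted_distribution(base, values, {"helpful": 2.0})
```

With weight 0 the base distribution is recovered unchanged; raising the weight shifts probability mass toward tokens the value model scores highly, without touching the base model's parameters — which is what lets a single frozen model serve different preference configurations at inference time.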
Aug-21-2025