Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

Feb-17-2026, 14:16:58 GMT – Neural Information Processing Systems

Project lead, main contributor, correspondence to alexandre.rame@isir.upmc.fr. Equal experimental contribution, order determined at random. Further information and resources related to this project can be found on this website.

arxiv preprint, large language model, machine learning

Country:
- North America > United States (0.14)
- Africa > Malawi (0.05)
- Europe > France
  - Île-de-France > Paris > Paris (0.04)

Genre:
- Research Report > New Finding (0.92)

Industry:
- Media (0.93)

Technology:
- Information Technology
  - Communications (0.93)
  - Game Theory (0.82)
  - Artificial Intelligence
    - Vision (1.00)
    - Representation & Reasoning > Optimization (0.67)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.92)
    - Machine Learning
      - Reinforcement Learning (1.00)
      - Neural Networks > Deep Learning (1.00)
      - Statistical Learning (0.92)
