Iterative Foundation Model Fine-Tuning on Multiple Rewards

Jun-14-2026, 07:20:38 GMT–Neural Information Processing Systems

Fine-tuning foundation models has emerged as a powerful approach for generating objects with specific desired properties. Reinforcement learning (RL) provides an effective framework for this purpose, enabling models to generate outputs that maximize a given reward function. However, in many applications, such as text generation and drug discovery, optimizing a single reward signal can be suboptimal because multiple evaluation criteria are often necessary. This paper proposes a novel RL-based method for fine-tuning foundation models using multiple reward signals.

machine learning, proceedings, reinforcement learning

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.52)