Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
Lee, Younghwan, Luu, Tung M., Lee, Donghoon, Yoo, Chang D.
arXiv.org Artificial Intelligence
Younghwan Lee (Electrical Engineering, KAIST, Daejeon, South Korea; youngh2@kaist.ac.kr), Chang D. Yoo (Electrical Engineering, KAIST, Daejeon, South Korea; cdyoo@kaist.ac.kr)

Abstract — In offline reinforcement learning (RL), learning from fixed datasets presents a promising solution for domains where real-time interaction with the environment is expensive or risky. However, designing dense reward signals for offline datasets requires significant human effort and domain expertise. Reinforcement learning from human feedback (RLHF) has emerged as an alternative, but it remains costly due to the human-in-the-loop process, prompting interest in automated reward generation models. To address this, we propose Reward Generation via Large Vision-Language Models (RG-VLM), which leverages the reasoning capabilities of LVLMs to generate rewards from offline data without human involvement.
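The abstract describes relabeling an unrewarded offline dataset by querying a vision-language model. A minimal sketch of that pipeline is below; note that `query_vlm` is a hypothetical stand-in (the paper's actual prompting and parsing are not given here), replaced by a toy heuristic so the example runs end to end.

```python
def query_vlm(frame, task_description):
    """Hypothetical stand-in for an LVLM call that rates task progress in [0, 1].

    A real implementation would send the observation (e.g. an image) and a
    natural-language task prompt to a vision-language model and parse its
    numeric answer. Here the frame is treated as a progress scalar directly,
    clipped to [0, 1], purely so the sketch is executable.
    """
    return max(0.0, min(1.0, frame))


def relabel_rewards(trajectory, task_description):
    """Attach VLM-generated rewards to an offline trajectory of (obs, action) pairs."""
    relabeled = []
    for frame, action in trajectory:
        reward = query_vlm(frame, task_description)
        relabeled.append((frame, action, reward))
    return relabeled


# Toy trajectory: observations are fake "progress" scalars, actions are labels.
traj = [(0.1, "a0"), (0.5, "a1"), (1.2, "a2")]
labeled = relabel_rewards(traj, "stack the red block on the blue block")
print([r for _, _, r in labeled])  # [0.1, 0.5, 1.0]
```

The relabeled dataset of (observation, action, reward) tuples could then be passed to any standard offline RL algorithm; the sketch only illustrates the reward-generation step, not the downstream training.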
Apr-15-2025