VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences
