VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences