Vision-Language Models as Success Detectors