Visual Language Models as Zero-Shot Deepfake Detectors