Exploring Vision Language Models for Multimodal and Multilingual Stance Detection