An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM