Enhancing Agentic Autonomous Scientific Discovery with Vision-Language Model Capabilities