VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents