Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering