Instruction-tuned Self-Questioning Framework for Multimodal Reasoning