RIV: Recursive Introspection Mask Diffusion Vision Language Model