OThink-MR1: Stimulating multimodal generalized reasoning capabilities through dynamic reinforcement learning
