SAIL-RL: Guiding MLLMs in When and How to Think via Dual-Reward RL Tuning
