RO-Bench: Large-scale robustness evaluation of MLLMs with text-driven counterfactual videos