Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
