Don't Just Chase "Highlighted Tokens" in MLLMs: Revisiting Visual Holistic Context Retention
