Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
