Object-Centric Vision Token Pruning for Vision Language Models
