SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodal LLMs
Neural Information Processing Systems
Multimodal Large Language Models (MLLMs) typically process a large number of visual tokens, which incurs considerable computational overhead even though many of these tokens are redundant. Existing visual token pruning methods primarily select the most salient tokens based on attention scores, which leaves the selected set of tokens semantically incomplete.
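To make the saliency-only baseline concrete, here is a minimal sketch of top-k pruning by attention score. It illustrates the existing approach the abstract criticizes, not SCOPE itself; the function name `prune_by_saliency` and the toy scores are assumptions made for illustration only.

```python
def prune_by_saliency(attn_scores, keep):
    """Return the indices of the `keep` highest-scoring visual tokens.

    `attn_scores` is a hypothetical list with one attention score per
    visual token. Indices are returned in their original order.
    """
    if not 0 <= keep <= len(attn_scores):
        raise ValueError("keep must be between 0 and the number of tokens")
    # Rank token indices by score, highest first (ties keep the lower index).
    ranked = sorted(range(len(attn_scores)), key=lambda i: -attn_scores[i])
    # Keep the top `keep` tokens and restore their positional order.
    return sorted(ranked[:keep])


# Toy scores for 8 visual tokens (made-up values).
scores = [0.02, 0.30, 0.28, 0.01, 0.05, 0.27, 0.03, 0.04]
kept = prune_by_saliency(scores, keep=3)
print(kept)  # [1, 2, 5]
```

A selection rule like this looks only at the scores. If the highest-scoring tokens happen to cover similar image regions, the remaining regions are dropped entirely, which is one way the selected tokens can end up semantically incomplete.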
Jun-14-2026, 07:01:13 GMT