CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference
