Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding
