State Space Prompting via Gathering and Spreading Spatio-Temporal Information for Video Understanding
