LinVT: Empower Your Image-level Large Language Model to Understand Videos
