MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
