OmniVL: One Foundation Model for Image-Language and Video-Language Tasks

Feb-7-2026, 23:04:40 GMT–Neural Information Processing Systems

This paper presents OmniVL, a new foundation model to support both image-language and video-language tasks using one universal architecture.

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language (1.00)
  - Representation & Reasoning (0.68)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
