Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts

Zhang, Zhaoyang, Shen, Yantao, Shi, Kunyu, Cai, Zhaowei, Fang, Jun, Deng, Siqi, Yang, Hao, Modolo, Davide, Tu, Zhuowen, Soatto, Stefano

May-11-2023, arXiv.org Artificial Intelligence

We present a sequence-to-sequence vision-language model whose parameters are jointly trained on all tasks (all for one) and fully shared among multiple tasks (one for all), resulting in a single model that we name Musketeer. The integration of knowledge across heterogeneous tasks is enabled by a novel feature called the Task Explanation Prompt (TEP). TEP reduces interference among tasks, allowing the model to focus on their shared structure. With a single model, Musketeer achieves results comparable to or better than strong baselines trained on single tasks, almost uniformly across multiple tasks.

large language model, machine learning, natural language

Country:
- Asia
  - China > Hong Kong (0.04)
  - Middle East > Israel
    - Tel Aviv District > Tel Aviv (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Natural Language
      - Large Language Model (0.94)
      - Text Processing (0.67)
    - Machine Learning > Neural Networks
      - Deep Learning (0.68)
