PaliGemma: A versatile 3B VLM for transfer

Beyer, Lucas, Steiner, Andreas, Pinto, André Susano, Kolesnikov, Alexander, Wang, Xiao, Salz, Daniel, Neumann, Maxim, Alabdulmohsin, Ibrahim, Tschannen, Michael, Bugliarello, Emanuele, Unterthiner, Thomas, Keysers, Daniel, Koppula, Skanda, Liu, Fangyu, Grycner, Adam, Gritsenko, Alexey, Houlsby, Neil, Kumar, Manoj, Rong, Keran, Eisenschlos, Julian, Kabra, Rishabh, Bauer, Matthias, Bošnjak, Matko, Chen, Xi, Minderer, Matthias, Voigtlaender, Paul, Bica, Ioana, Balazevic, Ivana, Puigcerver, Joan, Papalampidi, Pinelopi, Henaff, Olivier, Xiong, Xi, Soricut, Radu, Harmsen, Jeremiah, Zhai, Xiaohua

Jul-10-2024–arXiv.org Artificial Intelligence

PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong performance on a wide variety of open-world tasks. We evaluate PaliGemma on almost 40 diverse tasks including standard VLM benchmarks, but also more specialized tasks such as remote-sensing and segmentation.

paligemma, resolution, val, (17 more...)

arXiv.org Artificial Intelligence

Jul-10-2024

arXiv.org PDF

Add feedback

Country:
- South America > Colombia
  - Meta Department > Villavicencio (0.04)
- North America
  - United States
    - Maryland > Baltimore (0.04)
    - Tennessee (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
    - Oregon > Multnomah County
      - Portland (0.04)
    - New Mexico > Bernalillo County
      - Albuquerque (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - Washington > King County
      - Seattle (0.04)
    - New York > New York County
      - New York City (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Germany (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - Italy
    - Tuscany > Florence (0.04)
    - Calabria > Catanzaro Province
      - Catanzaro (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia > Middle East
  - UAE (0.04)
  - Qatar > Ad-Dawhah
    - Doha (0.04)
- Africa > Rwanda
  - Kigali > Kigali (0.04)

Genre:
- Research Report > New Finding (0.45)

Industry:
- Education (0.45)
- Energy (0.34)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.67)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found