OPI at SemEval 2023 Task 1: Image-Text Embeddings and Multimodal Information Retrieval for Visual Word Sense Disambiguation

Apr-14-2023–arXiv.org Artificial Intelligence

The goal of visual word sense disambiguation is to find the image that best matches the provided description of a word's meaning. It is a challenging problem that requires approaches combining language and image understanding. In this paper, we present our submission to the SemEval 2023 visual word sense disambiguation shared task. The proposed system integrates multimodal embeddings, learning-to-rank methods, and knowledge-based approaches. We build a classifier based on the CLIP model, whose results are enriched with additional information retrieved from Wikipedia and lexical databases. Our solution was ranked third in the multilingual task and won the Persian track, one of the three language subtasks.
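
As a rough illustration of the embedding-based ranking step described above, the sketch below orders candidate images by cosine similarity to a text embedding. It is a minimal example under stated assumptions, not the authors' system: the three-dimensional vectors are hand-made stand-ins for CLIP embeddings, the file names and the `cosine` and `rank_images` helpers are hypothetical, and the learning-to-rank and Wikipedia or lexical-database enrichment stages are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_images(text_emb, image_embs):
    """Return image names sorted from most to least similar to the text."""
    scores = {name: cosine(text_emb, emb) for name, emb in image_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors (assumption): in a real pipeline these would come from a
# multimodal encoder such as CLIP, applied to the phrase and to each image.
text_emb = [0.9, 0.1, 0.0]  # stand-in for a phrase like "bank of a river"
image_embs = {
    "river_bank.jpg": [0.8, 0.2, 0.1],
    "bank_building.jpg": [0.1, 0.9, 0.2],
    "piggy_bank.jpg": [0.0, 0.3, 0.9],
}

ranking = rank_images(text_emb, image_embs)
print(ranking)
```

The first element of `ranking` would be the system's answer for the phrase; additional signals could in principle be merged into the scores before sorting, though how the paper does so is not specified here.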

artificial intelligence, information retrieval, natural language

Country:
- North America > Canada
  - Ontario > Toronto (0.04)
- Europe
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Poland > Masovia Province
    - Warsaw (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Finland > Uusimaa
    - Helsinki (0.04)
- Asia
  - India > Maharashtra
    - Mumbai (0.04)
  - China > Beijing
    - Beijing (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision > Image Understanding (0.34)
  - Natural Language
    - Information Retrieval (0.41)
    - Text Processing (0.31)
