AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation
Jheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende, Krishna Srinivasan, Miriam Redi, Stéphane Clinchant, Jimmy Lin
arXiv.org Artificial Intelligence
This paper presents the AToMiC (Authoring Tools for Multimedia Content) dataset, designed to advance research in image/text cross-modal retrieval. While vision-language pretrained transformers have led to significant improvements in retrieval effectiveness, existing research has relied on image-caption datasets that feature only simplistic image-text relationships and underspecified user models of retrieval tasks. To close the gap between these oversimplified settings and real-world applications in multimedia content creation, we introduce a new approach to building retrieval test collections. We leverage the hierarchical structure and diverse domains of texts, styles, and types of images, as well as the large-scale image-document associations embedded in Wikipedia. We formulate two tasks based on a realistic user model and validate our dataset through retrieval experiments with baseline models. AToMiC offers a testbed for scalable, diverse, and reproducible multimedia retrieval research. Finally, the dataset provides the basis for a dedicated track at the 2023 Text REtrieval Conference (TREC), and is publicly available at https://github.com/TREC-AToMiC/AToMiC.
Apr 4, 2023
- Country:
- Europe (0.46)
- North America (0.28)
- Genre:
- Research Report > New Finding (0.93)
- Industry:
- Information Technology (0.46)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning > Neural Networks (0.46)
- Natural Language (1.00)
- Vision (1.00)
- Communications > Social Media (1.00)
- Human Computer Interaction (0.88)
- Sensing and Signal Processing > Image Processing (0.93)