Do I look like a `cat.n.01` to you? A Taxonomy Image Generation Benchmark

Moskvoretskii, Viktor, Lobanova, Alina, Neminova, Ekaterina, Biemann, Chris, Panchenko, Alexander, Nikishina, Irina

Mar-13-2025–arXiv.org Artificial Intelligence

This paper explores the feasibility of using text-to-image models in a zero-shot setup to generate images for taxonomy concepts. While text-based methods for taxonomy enrichment are well-established, the potential of the visual dimension remains unexplored. To address this, we propose a comprehensive benchmark for Taxonomy Image Generation that assesses models' abilities to understand taxonomy concepts and generate relevant, high-quality images. The benchmark includes common-sense and randomly sampled WordNet concepts, alongside the LLM generated predictions. The 12 models are evaluated using 9 novel taxonomy-related text-to-image metrics and human feedback. Moreover, we pioneer the use of pairwise evaluation with GPT-4 feedback for image generation. Experimental results show that the ranking of models differs significantly from standard T2I tasks. Playground-v2 and FLUX consistently outperform across metrics and subsets and the retrieval-based approach performs poorly. These findings highlight the potential for automating the curation of structured data resources.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

Mar-13-2025

arXiv.org PDF

Add feedback

Country:
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- North America
  - United States
    - Washington > King County
      - Seattle (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Switzerland (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
- Asia
  - Thailand > Bangkok
    - Bangkok (0.04)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)

Genre:
- Research Report
  - New Finding (0.88)
  - Experimental Study (0.68)

Industry:
- Leisure & Entertainment (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.35)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found