A Systematic Review of Open Datasets Used in Text-to-Image (T2I) Gen AI Model Safety

Rouf, Rakeen, Bavalatti, Trupti, Ahmed, Osama, Potdar, Dhaval, Jawed, Faraz

Feb-22-2025–arXiv.org Artificial Intelligence

This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For the definitive version, see 10.1109/ACCESS.2025.3539933.

Disclaimer: This research involves topics that may include disturbing results. Any explicit content has been redacted, and potentially disturbing results are presented in a neutral and anonymized manner to minimize emotional distress to readers.

Abstract: Novel research aimed at text-to-image (T2I) generative AI safety often relies on publicly available datasets for training and evaluation, making the quality and composition of these datasets crucial. This paper presents a comprehensive review of the key datasets used in T2I research, detailing their collection methods, compositions, the semantic and syntactic diversity of their prompts, and the quality, coverage, and distribution of harm types in the datasets. By highlighting the strengths and limitations of the datasets, this study enables researchers to find the most ...

category, dataset, diversity

Country:
- North America
  - Canada (0.04)
  - United States
    - Virginia (0.04)
    - North Carolina > Durham County
      - Durham (0.04)
- Europe > Austria
  - Vienna (0.14)
- Asia
  - Pakistan > Punjab
    - Lahore Division > Lahore (0.04)
  - India
    - Rajasthan > Jaipur (0.04)
    - Maharashtra > Mumbai (0.04)
- Africa > Sudan
  - Khartoum State > Khartoum (0.04)
  - Khartoum (0.04)

Genre:
- Overview (1.00)
- Research Report > New Finding (0.67)

Industry:
- Information Technology (0.87)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)
- Health & Medicine (0.68)
- Law > Statutes (0.46)
- Government > Regional Government (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.93)
    - Generation (0.88)
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (0.88)
