JourneyDB: A Benchmark for Generative Image Understanding
Neural Information Processing Systems
Although recent advances in vision-language models have transformed multi-modal comprehension, it remains unclear how well these models understand generated images. Compared with real data, synthetic images exhibit greater diversity in both content and style, which makes them significantly harder for models to fully grasp. To address this challenge, we introduce JourneyDB, a comprehensive dataset of generated images for multi-modal visual understanding. Our meticulously curated dataset comprises 4 million distinct, high-quality generated images, each paired with the text prompt used to create it. We also introduce an external subset containing results from another 22 text-to-image generative models, which makes JourneyDB a comprehensive benchmark for evaluating the comprehension of generated images.
Feb-11-2025, 06:19:00 GMT
- Country:
- Asia
- China (0.28)
- Middle East (0.28)
- Industry:
- Leisure & Entertainment (0.93)
- Media (0.67)
- Technology: