Hallucination as an Upper Bound: A New Perspective on Text-to-Image Evaluation
Seyed Amir Kasaei, Mohammad Hossein Rohban
arXiv.org Artificial Intelligence
In language and vision-language models, hallucination is broadly understood as content generated from a model's prior knowledge or biases rather than from the given input. While this phenomenon has been studied in those domains, it has not been clearly framed for text-to-image (T2I) generative models. Existing evaluations mainly focus on alignment, checking whether prompt-specified elements appear, but overlook what the model generates beyond the prompt. We argue for defining hallucination in T2I as bias-driven deviations and propose a taxonomy with three categories: attribute, relation, and object hallucinations. This framing introduces an upper bound for evaluation and surfaces hidden biases, providing a foundation for richer assessment of T2I models.
Nov-11-2025
- Genre:
- Research Report (0.40)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (0.74)
- Natural Language (1.00)
- Vision (0.97)