A-Bench: Are LMMs Masters at Evaluating AI-generated Images?

Zicheng Zhang, Haoning Wu, Chunyi Li, Yingjie Zhou, Wei Sun, Xiongkuo Min, Zijian Chen, Xiaohong Liu, Weisi Lin, Guangtao Zhai

Jun-5-2024–arXiv.org Artificial Intelligence

Accurately and efficiently assessing AI-generated images (AIGIs) remains a critical challenge for generative models. Because user studies are costly and time-consuming, many researchers have turned to large multi-modal models (LMMs) as AIGI evaluators, although the precision and validity of these evaluators remain questionable. Furthermore, traditional benchmarks test LMMs mostly on naturally captured content rather than AIGIs, leaving a noticeable gap for AIGIs. We therefore introduce A-Bench, a benchmark designed to diagnose whether LMMs are masters at evaluating AIGIs. A-Bench is organized under two key principles: 1) It emphasizes both high-level semantic understanding and low-level visual quality perception to address the intricate demands of AIGIs. 2) It uses various generative models for AIGI creation and various LMMs for evaluation, which ensures a comprehensive validation scope. In total, 2,864 AIGIs from 16 text-to-image models are sampled, each paired with question-answers annotated by human experts, and tested across 18 leading LMMs. We hope that A-Bench will significantly enhance the evaluation process and promote the generation quality for AIGIs. The benchmark is available at https://github.com/Q-Future/A-Bench.
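To make the evaluation setup concrete, the following is a minimal sketch of how an LMM could be scored on expert-annotated question-answer pairs, with results split by the two principles (high-level semantics and low-level quality). The abstract does not specify the data format or the metric, so everything here is an assumption for illustration: the `Item` fields, the category labels, the multiple-choice form, the exact-match accuracy metric, and the `evaluator` callable standing in for an LMM are all hypothetical and not taken from the A-Bench repository.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Item:
    """One benchmark item (hypothetical schema, not the A-Bench format)."""
    image_path: str     # path to an AI-generated image
    question: str       # question about the image
    choices: List[str]  # candidate answers (multiple choice is an assumption)
    answer: str         # expert-annotated correct choice
    category: str       # e.g. "semantic" or "quality" (hypothetical labels)


def accuracy_by_category(
    items: List[Item],
    evaluator: Callable[[str, str, List[str]], str],
) -> Dict[str, float]:
    """Return the per-category fraction of items the evaluator answers correctly."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for item in items:
        prediction = evaluator(item.image_path, item.question, item.choices)
        total[item.category] += 1
        # Case-insensitive exact match against the annotated answer.
        if prediction.strip().lower() == item.answer.strip().lower():
            correct[item.category] += 1
    return {category: correct[category] / total[category] for category in total}
```

In practice `evaluator` would wrap a call to an actual LMM; any function that takes an image path, a question, and the choices and returns one choice as a string fits this sketch.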

large language model, machine learning, natural language


Country:
- Asia > China > Shanghai > Shanghai (0.04)

Genre:
- Research Report (0.82)
- Questionnaire & Opinion Survey (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.71)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
