AITopics | benchmarking generative model

benchmarking generative model

Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming

Neural Information Processing Systems | May-27-2025, 04:33:27 GMT

Generative models have demonstrated human-level proficiency in various benchmarks across domains like programming, natural sciences, and general knowledge. Despite these promising results on competitive benchmarks, they still struggle with seemingly simple problem-solving tasks typically carried out by elementary-level students. How do state-of-the-art models perform on standardized programming-related tests designed to assess computational thinking and problem-solving skills at schools? In this paper, we curate a novel benchmark involving computational thinking tests grounded in elementary visual programming domains. Our initial results show that state-of-the-art models like GPT-4o and Llama3 barely match the performance of an average school student.

benchmarking generative model, large language model, machine learning, (8 more...)

Genre: Research Report (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

The ArtBench Dataset: Benchmarking Generative Models with Artworks - Technology Org

#artificialintelligence | Jun-26-2022, 15:45:19 GMT

Deep generative models can synthesize diverse and high-fidelity images. Computational understanding of art is attracting growing attention because of its importance for art history, computational creativity, and human-computer interaction. New research proposes using artworks to benchmark generative AI models. The ArtBench dataset comprises 60,000 images annotated with 10 artistic styles, such as Baroque and Surrealism. The images are high quality, with clean and balanced labels, and can be easily incorporated into commonly used deep learning frameworks.

artbench dataset, artbench-10, benchmarking generative model, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.59)
