Evaluating Multimodal Generative AI with Korean Educational Standards

Feb-21-2025, arXiv.org Artificial Intelligence

This paper presents the Korean National Educational Test Benchmark (KoNET), a new benchmark designed to evaluate multimodal generative AI systems using Korean national educational tests. KoNET comprises four exams: the Korean Elementary General Educational Development Test (KoEGED), its Middle (KoMGED) and High (KoHGED) school counterparts, and the Korean College Scholastic Ability Test (KoCSAT). These exams are known for their rigorous standards and diverse questions, which allows a comprehensive analysis of AI performance across educational levels. By focusing on Korean, KoNET provides insight into model performance in less-explored languages. We assess a range of models (open-source, open-access, and closed APIs) by examining question difficulty, subject diversity, and human error rates. The code and dataset builder will be fully open-sourced at https://github.com/naver-ai/KoNET.
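
As a rough illustration of what scoring a model across the four exams could look like, the sketch below computes per-exam accuracy from a list of answered questions. It is a minimal sketch under stated assumptions: the record layout (`exam`, `answer`, `prediction`), the function name `accuracy_by_exam`, and the toy data are all hypothetical and are not taken from the KoNET code or dataset builder.

```python
from collections import defaultdict

# Exam identifiers taken from the abstract.
EXAMS = ("KoEGED", "KoMGED", "KoHGED", "KoCSAT")


def accuracy_by_exam(records):
    """Return {exam: fraction of correctly answered questions}.

    Each record is a hypothetical dict with keys "exam", "answer", and
    "prediction"; this layout is assumed, not KoNET's actual schema.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["exam"]] += 1
        if rec["prediction"] == rec["answer"]:
            correct[rec["exam"]] += 1
    # Exams with no questions are omitted rather than reported as 0.
    return {exam: correct[exam] / total[exam] for exam in EXAMS if total[exam]}


# Toy records, invented purely for illustration.
toy_records = [
    {"exam": "KoEGED", "answer": 2, "prediction": 2},
    {"exam": "KoEGED", "answer": 4, "prediction": 4},
    {"exam": "KoCSAT", "answer": 1, "prediction": 3},
    {"exam": "KoCSAT", "answer": 5, "prediction": 5},
]

scores = accuracy_by_exam(toy_records)
print(scores)
```

Reporting accuracy per exam rather than as one pooled number keeps the educational levels separate, which is what would allow a comparison across levels of the kind the abstract describes.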

Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia
  - South Korea (0.04)
  - East Asia (0.04)
  - British Indian Ocean Territory > Diego Garcia (0.04)
  - Thailand > Bangkok
    - Bangkok (0.04)
  - Middle East
    - Israel (0.04)
    - Saudi Arabia > Asir Province
      - Abha (0.04)

Genre:
- Research Report (0.82)

Industry:
- Education
  - Educational Setting (1.00)
  - Curriculum > Subject-Specific Education (0.46)
  - Assessment & Standards > Educational Standards (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.95)
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (0.71)
