MaXIFE: Multilingual and Cross-lingual Instruction Following Evaluation
Yile Liu, Ziwei Ma, Xiu Jiang, Jinglu Hu, Jing Chang, Liang Li
–arXiv.org Artificial Intelligence
With the rapid adoption of large language models (LLMs) in natural language processing, the ability to follow instructions has emerged as a key metric for evaluating their practical utility. However, existing evaluation methods often focus on single-language scenarios, overlooking the challenges and differences present in multilingual and cross-lingual contexts. To address this gap, we introduce MaXIFE: a comprehensive evaluation benchmark designed to assess instruction-following capabilities across 23 different languages with 1667 verifiable instruction tasks. MaXIFE integrates both Rule-Based Evaluation and Model-Based Evaluation, ensuring a balance of efficiency and accuracy. We applied MaXIFE to evaluate several leading commercial LLMs, establishing baseline results for future comparisons. By providing a standardized tool for multilingual instruction-following evaluation, MaXIFE aims to advance research and development in natural language processing.
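The abstract's "Rule-Based Evaluation" of verifiable instruction tasks can be illustrated with a minimal sketch. The constraint types and function names below are hypothetical examples (not taken from the MaXIFE paper): each instruction carries programmatically checkable constraints, and a response's score is the fraction of constraints it satisfies.

```python
import json

def check_keyword(response: str, keyword: str) -> bool:
    """Constraint: the response must mention a required keyword."""
    return keyword.lower() in response.lower()

def check_min_words(response: str, min_words: int) -> bool:
    """Constraint: the response must contain at least min_words words."""
    return len(response.split()) >= min_words

def check_valid_json(response: str) -> bool:
    """Constraint: the response must parse as valid JSON."""
    try:
        json.loads(response)
        return True
    except ValueError:
        return False

def rule_based_score(response: str, checks) -> float:
    """Fraction of verifiable constraints the response satisfies."""
    results = [check(response) for check in checks]
    return sum(results) / len(results)

# Example: an instruction requiring the word "benchmark" and >= 5 words.
checks = [
    lambda r: check_keyword(r, "benchmark"),
    lambda r: check_min_words(r, 5),
]
score = rule_based_score("MaXIFE is a multilingual benchmark for LLMs.", checks)
```

Such checks are cheap and deterministic, which is why benchmarks pair them with model-based judging for instructions whose compliance cannot be verified by simple rules.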
Jun-4-2025