MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

Jun-17-2026, 17:56:35 GMT–Neural Information Processing Systems

We introduce MMAR, a new benchmark designed to evaluate the deep reasoning capabilities of Audio-Language Models (ALMs) across massive multi-disciplinary tasks. MMAR comprises 1,000 meticulously curated audio-question-answer triplets, collected from real-world internet videos and refined through iterative error correction and quality checks. Unlike existing benchmarks, which are limited to a specific domain of sound, music, or speech, MMAR covers a broad spectrum of real-world audio scenarios, including mixed-modality combinations of sound, music, and speech. Each question in MMAR is hierarchically categorized into one of four reasoning layers (Signal, Perception, Semantic, and Cultural), with additional sub-categories within each layer to reflect task diversity and complexity. To foster further research in audio reasoning, we annotate every question with a Chain-of-Thought (CoT) rationale.
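
As a rough illustration of the structure the abstract describes, the sketch below models one audio-question-answer triplet together with its reasoning layer, sub-category, and CoT rationale, and computes accuracy per layer. The class name `Item`, all field names, and the exact-match scoring in `accuracy_by_layer` are assumptions made for illustration only; they are not taken from the MMAR release, whose actual schema and scoring protocol may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    """The four reasoning layers named in the abstract."""
    SIGNAL = "Signal"
    PERCEPTION = "Perception"
    SEMANTIC = "Semantic"
    CULTURAL = "Cultural"


@dataclass(frozen=True)
class Item:
    """One audio-question-answer triplet (hypothetical field names)."""
    audio_path: str    # path to the audio clip
    question: str      # question about the clip
    answer: str        # reference answer
    layer: Layer       # top-level reasoning layer
    sub_category: str  # finer category within the layer
    rationale: str     # Chain-of-Thought annotation


def accuracy_by_layer(items, predictions):
    """Return {layer: accuracy}, using case-insensitive exact match.

    Exact match is an assumed scoring rule, chosen for simplicity.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        total[item.layer] += 1
        if pred.strip().lower() == item.answer.strip().lower():
            correct[item.layer] += 1
    return {layer: correct[layer] / total[layer] for layer in total}
```

Reporting accuracy per layer rather than as a single number makes it easier to see whether a model fails at low-level signal reasoning or at higher-level semantic and cultural reasoning.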

benchmark, large language model, machine learning

Country:
- Asia > China (0.28)
- North America > United States (0.28)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Security & Privacy (0.92)
- Education (0.68)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Speech > Speech Recognition (0.93)
    - Cognitive Science > Problem Solving (0.66)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.70)
      - Text Processing (0.67)
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Performance Analysis > Accuracy (0.68)
