OmniBench: Towards The Future of Universal Omni-Language Models
Neural Information Processing Systems
Recent advancements in multimodal large language models (MLLMs) have aimed to integrate and interpret data across diverse modalities. However, the capacity of these models to concurrently process and reason about multiple modalities remains underexplored, partly due to the lack of comprehensive modality-wise benchmarks. We introduce OmniBench, a novel benchmark designed to rigorously evaluate models' ability to recognize, interpret, and reason across visual, acoustic, and textual inputs simultaneously. We define language models capable of such tri-modal processing as omni-language models (OLMs). OmniBench is distinguished by high-quality human annotations, which ensure that accurate responses require integrated understanding and reasoning across all three modalities.
Jun-15-2026, 22:30:19 GMT
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Security & Privacy (1.00)
- Education (0.93)
- Technology: