OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations
Kassab, Christina, Morin, Sacha, Büchner, Martin, Mattamala, Matías, Gupta, Kumaraditya, Valada, Abhinav, Paull, Liam, Fallon, Maurice
arXiv.org Artificial Intelligence
3D scene understanding has been transformed by open-vocabulary language models that enable interaction via natural language. However, the evaluation of these representations is limited to closed-set semantics that do not capture the richness of language. This work presents OpenLex3D, a dedicated benchmark to evaluate 3D open-vocabulary scene representations. OpenLex3D provides entirely new label annotations for 23 scenes from Replica, ScanNet++, and HM3D, which capture real-world linguistic variability by introducing synonymous object categories and additional nuanced descriptions. By introducing an open-set 3D semantic segmentation task and an object retrieval task, we provide insights into feature precision, segmentation, and downstream capabilities. We evaluate various existing 3D open-vocabulary methods on OpenLex3D, showcasing failure cases and avenues for improvement. The benchmark is publicly available at: https://openlex3d.github.io/.
Mar-25-2025
- Country:
  - Europe (0.28)
- Genre:
  - Research Report (0.50)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language > Text Processing (0.93)
    - Representation & Reasoning (1.00)
    - Vision (1.00)