Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

Mar-21-2026, 13:15:18 GMT–Neural Information Processing Systems

Complex 3D scene understanding has gained increasing attention, and scene encoding strategies built on top of visual foundation models play a crucial role in its progress. However, the optimal scene encoding strategies for various scenarios remain unclear, particularly compared to their image-based counterparts. To address this issue, we present the first comprehensive study that probes various visual encoding models for 3D scene understanding, identifying the strengths and limitations of each model across different scenarios.
