Multi-modal Situated Reasoning in 3D Scenes
–Neural Information Processing Systems
Comprehensive evaluations on MSQA and MSNN highlight the limitations of existing vision-language models and underscore the importance of handling multi-modal interleaved inputs and situation modeling.
Feb-18-2026, 20:14:01 GMT
- Technology: