What's Left? Concept Grounding with Logic-Enhanced Foundation Models

Dec-26-2025, 04:57:00 GMT–Neural Information Processing Systems

Recent works such as VisProg and ViperGPT have smartly composed foundation models for visual reasoning, using large language models (LLMs) to produce programs that can be executed by pre-trained vision-language models. However, they operate in limited domains, such as 2D images, and do not fully exploit the generalization of language: abstract concepts like "left" can also be grounded in 3D, temporal, and action data, as in moving to your left. This limited generalization stems from these inference-only methods' inability to learn or adapt pre-trained models to a new domain.
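As a rough illustration of the idea that one language-level program can be reused across domains while only the grounding of a concept changes, here is a minimal toy sketch. It is not the paper's implementation: the tuple-based program format, the `execute` function, the `CONCEPTS` table, and the hand-written `left_of` rule are all hypothetical stand-ins. In particular, the fixed rule takes the place of a grounding module that a learning-based system would train per domain.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy object: a name plus a position vector (x coordinate first).
Obj = Tuple[str, Tuple[float, ...]]


def left_of(a: Obj, b: Obj) -> bool:
    """Toy grounding of 'left': compare x coordinates (assumption, not learned)."""
    return a[1][0] < b[1][0]


# Hypothetical concept table mapping a word to its grounding function.
CONCEPTS: Dict[str, Callable[[Obj, Obj], bool]] = {"left": left_of}


def execute(program: Tuple[str, str, str], scene: List[Obj]) -> List[str]:
    """Run a one-step program of the form (op, concept, anchor_name) on a scene."""
    op, concept, anchor_name = program
    if op != "filter_relation":
        raise ValueError(f"unsupported op: {op}")
    anchor = next(o for o in scene if o[0] == anchor_name)
    relation = CONCEPTS[concept]
    # Keep every other object that stands in the relation to the anchor.
    return [o[0] for o in scene if o[0] != anchor_name and relation(o, anchor)]


# Stand-in for a program an LLM might produce for "what is left of the table?".
program = ("filter_relation", "left", "table")

scene_2d = [("cup", (1.0, 2.0)), ("table", (3.0, 2.0)), ("lamp", (5.0, 1.0))]
scene_3d = [("chair", (0.5, 0.0, 2.0)), ("table", (2.0, 0.0, 1.0)), ("sofa", (4.0, 0.0, 3.0))]

result_2d = execute(program, scene_2d)
result_3d = execute(program, scene_3d)
```

The same `program` runs unchanged on the 2D and 3D scenes; only the data passed to the grounding function differs. Swapping the fixed rule for a trainable module is where adaptation to a new domain would enter.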

concept grounding, logic-enhanced foundation model, name change

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)