Enhancing LLM Reasoning via Vision-Augmented Prompting
Neural Information Processing Systems
Verbal and visual-spatial information processing are two critical subsystems that activate different brain regions and often collaborate in cognitive reasoning. Despite the rapid advancement of LLM-based reasoning, mainstream frameworks such as Chain-of-Thought (CoT) and its variants focus primarily on the verbal dimension, which limits their ability to tackle reasoning problems that involve visual and spatial clues. To bridge this gap, we propose a novel dual-modality reasoning framework called Vision-Augmented Prompting (VAP).
May-29-2025, 01:01:56 GMT
- Country:
- Europe > Austria
- Vienna (0.14)
- North America > United States (0.28)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Workflow (1.00)
- Industry:
- Information Technology > Security & Privacy (0.68)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science (1.00)
- Machine Learning > Neural Networks
- Deep Learning (1.00)
- Natural Language > Large Language Model (1.00)
- Representation & Reasoning (1.00)
- Vision (1.00)