MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Jun-12-2026, 12:20:48 GMT–Neural Information Processing Systems

Chain-of-Thought (CoT) has widely enhanced mathematical reasoning in Large Language Models (LLMs), but it still remains challenging for extending it to multimodal domains. Existing works either adopt a similar textual reasoning for image input, or seek to interleave visual signals into mathematical CoT. However, they face three key limitations for math problem-solving: .

artificial intelligence, large language model, natural language, (9 more...)

Neural Information Processing Systems

Jun-12-2026, 12:20:48 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)