Implicit-Knowledge Visual Question Answering with Structured Reasoning Traces
Zhihao Wen, Wenkang Wei, Yuan Fang, Xingtong Yu, Hui Zhang, Weicheng Zhu, Xin Zhang
arXiv.org Artificial Intelligence
Knowledge-based Visual Question Answering (KVQA) requires models to ground entities in images and reason over factual knowledge. Recent work has introduced its implicit-knowledge variant, IK-KVQA, where a multimodal large language model (MLLM) is the sole knowledge source and answers are produced without external retrieval. Existing IK-KVQA approaches, however, are typically trained with answer-only supervision: reasoning remains implicit, justifications are often weak or inconsistent, and generalization after standard supervised fine-tuning (SFT) can be brittle. We propose MODELNAME, a framework that equips IK-KVQA with dual-path structured reasoning traces (symbolic relation paths over text and vision together with path-grounded natural-language explanations) to provide a stronger inductive bias than generic answer-only supervision. These traces act as modality-aware scaffolds that guide the model toward relevant entities and attributes, offering more structure than generic chain-of-thought supervision while not constraining reasoning to any single fixed path. Using a single open-source MLLM, MODELNAME constructs and selects traces to build an offline trace-enriched dataset and then performs structure-aware self-distillation; no external retrievers, verifiers, or curated knowledge bases are used, and inference is a single autoregressive pass. Across benchmarks, MODELNAME consistently improves both answer accuracy and the transparency of intermediate reasoning, achieving up to 11.3% higher answer accuracy on OK-VQA over the strongest baseline.
Nov-18-2025