WISE: Weak-Supervision-Guided Step-by-Step Explanations for Multimodal LLMs in Image Classification

Jiang, Yiwen, Mehta, Deval, Yan, Siyuan, Shen, Yaling, Wang, Zimu, Ge, Zongyuan

Sep-23-2025–arXiv.org Artificial Intelligence

Multimodal Large Language Models (MLLMs) have shown promise in visual-textual reasoning, with Multimodal Chain-of-Thought (MCoT) prompting significantly enhancing interpretability. However, existing MCoT methods rely on rationale-rich datasets and largely focus on inter-object reasoning, overlooking the intra-object understanding crucial for image classification. To address this gap, we propose WISE, a Weak-supervision-guided Step-by-step Explanation method that augments any image classification dataset with MCoTs by reformulating the concept-based representations from Concept Bottleneck Models (CBMs) into concise, interpretable reasoning chains under weak supervision. Experiments across ten datasets show that our generated MCoTs not only improve interpretability by 37% but also lead to gains in classification accuracy when used to fine-tune MLLMs. Our work bridges concept-based interpretability and generative MCoT reasoning, providing a generalizable framework for enhancing MLLMs in fine-grained visual understanding.
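
To make the idea of turning concept-based representations into a reasoning chain concrete, here is a minimal sketch. It is a hypothetical illustration, not the WISE implementation: it assumes a CBM has already produced a score for each named concept, and it assumes the only supervision available is the class label from the dataset. The function names, the top-k selection rule, and the sentence templates are all assumptions introduced for this example.

```python
from typing import Dict, List, Tuple


def top_concepts(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring concepts, highest first (hypothetical selection rule)."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


def build_reasoning_chain(scores: Dict[str, float], label: str, k: int = 3) -> str:
    """Format the selected concepts as numbered steps ending in the class label.

    `scores` stands in for CBM concept activations and `label` for the
    dataset's class annotation; both are assumed inputs for this sketch.
    """
    steps = [
        f"Step {i}: The image shows {concept}."
        for i, (concept, _) in enumerate(top_concepts(scores, k), start=1)
    ]
    steps.append(f"Conclusion: These observations indicate the class '{label}'.")
    return "\n".join(steps)


if __name__ == "__main__":
    # Made-up concept scores for a single image, for illustration only.
    example_scores = {
        "red crown": 0.9,
        "white belly": 0.4,
        "long tail": 0.7,
        "webbed feet": 0.1,
    }
    print(build_reasoning_chain(example_scores, "woodpecker", k=2))
```

A text chain of this kind could then be paired with its image as a fine-tuning target for an MLLM; how WISE actually selects concepts and phrases each step is described in the paper itself, not here.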

large language model, machine learning, natural language

Country:
- Europe > Austria (0.28)
- North America
  - Canada (0.46)
  - United States (0.28)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine > Therapeutic Area (0.46)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Natural Language > Large Language Model (1.00)
    - Vision > Image Understanding (0.92)
    - Representation & Reasoning > Uncertainty
      - Bayesian Inference (0.68)
    - Machine Learning
      - Performance Analysis > Accuracy (0.67)
      - Learning Graphical Models > Directed Networks
        - Bayesian Learning (1.00)
