Explaining multimodal LLMs via intra-modal token interactions
