Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate
