Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability