Explaining latent representations of generative models with large multimodal models