Strong and Simple Baselines for Multimodal Utterance Embeddings
