Representation Potentials of Foundation Models for Multimodal Alignment: A Survey
