Reconstruction-Driven Multimodal Representation Learning for Automated Media Understanding
