MultiMoDN--Multimodal, Multi-Task, Interpretable Modular Networks

Neural Information Processing Systems 

Performing multiple real-world prediction tasks in a single model often requires a particularly diverse feature space. Multimodal (MM) models aim to extract the synergistic predictive potential of multiple data types to create a shared feature space with aligned semantic meaning across inputs of drastically varying sizes (i.e.