Embodied Multimodal Multitask Learning