Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input