Multi-modal Anchor Gated Transformer with Knowledge Distillation for Emotion Recognition in Conversation