Attention Driven Fusion for Multi-Modal Emotion Recognition