Deep Learning of Segment-Level Feature Representation for Speech Emotion Recognition in Conversations