The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents