EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions