MinMo: A Multimodal Large Language Model for Seamless Voice Interaction