MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks