Multimodal speech synthesis architecture for unsupervised speaker adaptation