StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis