A Spectral Energy Distance for Parallel Speech Synthesis