Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech