On the Provable Advantage of Unsupervised Pretraining