AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining