Learning Hierarchical Features with Joint Latent Space Energy-Based Prior