Bi-level Doubly Variational Learning for Energy-based Latent Variable Models