Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization