Training Chain-of-Thought via Latent-Variable Inference Du Phan Matthew D. Hoffman

Neural Information Processing Systems 

One can also improve LLMs' performance on a specific task by