Learning to Decode the Surface Code with a Recurrent, Transformer-Based Neural Network