Transformers on Markov data: Constant depth suffices
