Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
