Uncovering mesa-optimization algorithms in Transformers
