On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability

Neural Information Processing Systems 

When do mesa-optimization algorithms emerge in autoregressively trained transformers?
