The Transient Nature of Emergent In-Context Learning in Transformers

Aaditya K. Singh, Gatsby Unit, UCL
Stephanie C.Y. Chan

Neural Information Processing Systems 

Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL) despite not being explicitly trained for it.