Specialization after Generalization: Towards Understanding Test-Time Training in Foundation Models
