How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
