Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers?
