An Inverse Scaling Law for CLIP Training
