Rethinking generative image pretraining: How far are we from scaling up next-pixel prediction?
