Scaling Up Parameter Generation: A Recurrent Diffusion Approach

Neural Information Processing Systems 

Parameter generation has long struggled to match the scale of today's large vision and language models, curbing its broader utility. In this paper, we introduce Recurrent Diffusion for Large-Scale Parameter Generation (RPG), a novel framework that generates full neural network parameters (up to hundreds of millions) on a single GPU.