More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models

Neural Information Processing Systems 

Generative depth estimation methods leverage the rich visual priors stored in pretrained text-to-image diffusion models, demonstrating astonishing zero-shot capability.