Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers
Neural Information Processing Systems
Current generative depth estimation models fine-tune Stable Diffusion and achieve impressive performance. However, they require a VAE to compress depth maps into the latent space, which inevitably introduces flying pixels at edges and details.
Jun-14-2026, 08:16:59 GMT
- Technology:
- Information Technology > Artificial Intelligence
- Vision (0.49)
- Machine Learning (0.40)