Zero-to-Hero: Enhancing Zero-Shot Novel View Synthesis via Attention Map Filtering

May-26-2025, 21:09:03 GMT–Neural Information Processing Systems

Generating realistic images from arbitrary views based on a single source image remains a significant challenge in computer vision, with broad applications ranging from e-commerce to immersive virtual experiences. Recent advancements in diffusion models, particularly the Zero-1-to-3 model, have been widely adopted for generating plausible views, videos, and 3D models. In this work, we propose Zero-to-Hero, a novel test-time approach that enhances view synthesis by manipulating attention maps during the denoising process of Zero-1-to-3. By drawing an analogy between the denoising process and stochastic gradient descent (SGD), we implement a filtering mechanism that aggregates attention maps, enhancing generation reliability and authenticity. This process improves geometric consistency without requiring retraining or significant computational resources.
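The abstract does not state how the attention maps are aggregated, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple running average over the attention maps produced at successive denoising steps, loosely analogous to how averaging smooths noisy gradient estimates in SGD. The names `aggregate_attention`, `normalize_rows`, and `momentum` are hypothetical and do not come from the paper or from Zero-1-to-3.

```python
from typing import List

Matrix = List[List[float]]


def normalize_rows(m: Matrix) -> Matrix:
    """Rescale each row to sum to 1 so it remains a valid attention distribution."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total for v in row] if total > 0 else list(row))
    return out


def aggregate_attention(maps: List[Matrix], momentum: float = 0.5) -> Matrix:
    """Blend attention maps from successive steps with a running average.

    Assumption: an exponential moving average stands in for the paper's
    filtering mechanism, which the abstract does not specify.
    """
    if not maps:
        raise ValueError("need at least one attention map")
    running = [list(row) for row in maps[0]]
    for current in maps[1:]:
        running = [
            [momentum * r + (1.0 - momentum) * c for r, c in zip(r_row, c_row)]
            for r_row, c_row in zip(running, current)
        ]
    return normalize_rows(running)


# Toy 2x2 attention maps from three hypothetical denoising steps.
noisy_maps = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.4, 0.6]],
    [[0.7, 0.3], [0.0, 1.0]],
]
filtered = aggregate_attention(noisy_maps, momentum=0.5)
```

A higher `momentum` gives more weight to earlier maps and yields a smoother result; a lower value tracks the most recent step more closely. Because the sketch only post-processes attention maps, it involves no retraining, which matches the test-time framing in the abstract.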

large language model, machine learning, natural language



Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.40)
  - Machine Learning > Statistical Learning
    - Gradient Descent (0.62)