Preserving Product Fidelity in Large Scale Image Recontextualization with Diffusion Models
Ishaan Malhi, Praneet Dutta, Ellie Talius, Sally Ma, Brendan Driscoll, Krista Holden, Garima Pruthi, Arunachalam Narayanaswamy
arXiv.org Artificial Intelligence
Figure 1: Given a few input images of a real-world product, our system generates images that not only maintain high fidelity to the original product but also recontextualize it in novel settings beyond simple background changes: showcasing it from a new perspective, adding object occlusions, or creating different, realistic lighting conditions.

We present a framework for high-fidelity product image recontextualization using text-to-image diffusion models and a novel data augmentation pipeline. The pipeline leverages image-to-video diffusion, inpainting/outpainting, and negative examples to create synthetic training data, addressing the limitations of real-world data collection for this task. Our method improves the quality and diversity of generated images by disentangling product representations and enhancing the model's understanding of product characteristics. Evaluation on the ABO dataset and a private product dataset, using automated metrics and human assessment, demonstrates the effectiveness of our framework in generating realistic and compelling product visualizations, with implications for applications such as e-commerce and virtual product showcasing.
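The abstract does not include implementation details, so the following is only a minimal illustrative sketch of what such a synthetic-data augmentation pipeline could look like. All names here (TrainingExample, generate_frames, outpaint, the model objects) are hypothetical placeholders standing in for the components the abstract mentions (image-to-video diffusion, in/outpainting, and negative examples); none of them come from the paper.

```python
# Illustrative sketch only: the paper does not publish its pipeline code.
# All helper names and method signatures below are hypothetical placeholders.

from dataclasses import dataclass
from typing import List

@dataclass
class TrainingExample:
    image: object          # augmented product view (e.g., a PIL image)
    prompt: str            # recontextualization prompt used to generate it
    is_negative: bool      # True for samples that do not preserve product identity

def build_synthetic_dataset(product_images, prompts,
                            video_model, inpaint_model) -> List[TrainingExample]:
    """Create synthetic training data from a few photos of one product."""
    examples: List[TrainingExample] = []
    for img in product_images:
        # 1) Image-to-video diffusion: sample frames to obtain novel
        #    viewpoints and lighting of the same product.
        frames = video_model.generate_frames(img, num_frames=8)

        # 2) In/outpainting: place the product (or its novel views) into
        #    new backgrounds and introduce partial occlusions.
        for frame, prompt in zip(frames, prompts):
            recontextualized = inpaint_model.outpaint(frame, prompt=prompt)
            examples.append(TrainingExample(recontextualized, prompt, False))

        # 3) Negatives: plausible-looking images that break product identity,
        #    used to sharpen the fidelity objective during fine-tuning.
        negative = inpaint_model.outpaint(
            img, prompt="a different but similar-looking product")
        examples.append(TrainingExample(negative, "negative sample", True))
    return examples
```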
Mar-10-2025