Where's Waldo: Diffusion Features For Personalized Segmentation and Retrieval
Neural Information Processing Systems
Personalized retrieval and segmentation aim to locate specific instances within a dataset based on an input image and a short description of the reference instance. While supervised methods are effective, they require extensive labeled data for training. Recently, self-supervised foundation models have been applied to these tasks, showing results comparable to supervised methods. However, these models have a significant flaw: they struggle to locate the desired instance when other instances of the same class are present. In this paper, we explore text-to-image diffusion models for these tasks. Specifically, we propose a novel approach called PDM (Personalized Diffusion Features Matching), which leverages intermediate features of pre-trained text-to-image models for personalization tasks without any additional training. PDM demonstrates superior performance on popular retrieval and segmentation benchmarks, outperforming even supervised methods.
Correspondence to: Dvir Samuel.
Mar-27-2025, 13:07:05 GMT