Self-supervised Multi-view Disentanglement for Expansion of Visual Collections
Nihal Jain, Praneetha Vaddamanu, Paridhi Maheshwari, Vishwa Vinay, Kuldeep Kulkarni
arXiv.org Artificial Intelligence
Image search engines enable the retrieval of images relevant to a query image. In this work, we consider the setting where a query for similar images is derived from a collection of images. For visual search, the similarity measurements may be made along multiple axes, or views, such as style and color. We assume access to a set of feature extractors, each of which computes representations for a specific view. Our objective is to design a retrieval algorithm that effectively combines similarities computed over representations from multiple views. To this end, we propose a self-supervised learning method for extracting disentangled view-specific representations for images such that the inter-view overlap is minimized. We show how this allows us to compute the intent of a collection as a distribution over views. We show how effective retrieval can be performed by prioritizing candidate expansion images that match the intent of a query collection. Finally, we present a new querying mechanism for image search enabled by composing multiple collections and perform retrieval under this setting using the techniques presented in this paper.
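The abstract does not say how the intent distribution or the final ranking is computed, so the following is only a minimal sketch under stated assumptions. It assumes each image already has one vector per view (here the hypothetical views `color` and `style`), estimates intent as a softmax over how consistent the collection is within each view, and ranks candidates by intent-weighted cosine similarity. The function names, the coherence measure, and the `temperature` parameter are illustrative choices, not the authors' method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def collection_intent(collection, temperature=0.1):
    """Assumed intent estimate: softmax over per-view mean pairwise similarity.

    `collection` is a list of at least two images, each a dict mapping a
    view name to that view's feature vector.
    """
    coherence = {}
    for view in sorted(collection[0]):
        vecs = [img[view] for img in collection]
        pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
        coherence[view] = sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
    peak = max(coherence.values())  # subtract the max for numerical stability
    weights = {v: math.exp((c - peak) / temperature) for v, c in coherence.items()}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}

def score(candidate, collection, intent):
    """Intent-weighted mean similarity of one candidate to the collection."""
    return sum(
        intent[view]
        * sum(cosine(candidate[view], img[view]) for img in collection)
        / len(collection)
        for view in intent
    )

def expand(collection, candidates, k=1):
    """Return the names of the k candidates that best match the collection's intent."""
    intent = collection_intent(collection)
    ranked = sorted(
        candidates,
        key=lambda name: score(candidates[name], collection, intent),
        reverse=True,
    )
    return ranked[:k]

# Toy data: the collection agrees on color but not on style.
collection = [
    {"color": [1.0, 0.0], "style": [1.0, 0.0]},
    {"color": [1.0, 0.1], "style": [0.0, 1.0]},
]
candidates = {
    "same_color": {"color": [1.0, 0.05], "style": [-1.0, 0.0]},
    "same_style": {"color": [0.0, 1.0], "style": [1.0, 1.0]},
}
intent = collection_intent(collection)
top = expand(collection, candidates, k=1)
```

In this toy setup the collection is consistent in color and inconsistent in style, so nearly all of the intent mass falls on `color` and the candidate that matches on color is ranked first. Composing several collections, as the abstract mentions, could be approximated by combining their intent distributions, though that step is not specified here.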
Feb-4-2023
- Country:
  - Oceania > Australia (0.04)
  - North America
    - United States
      - Wisconsin > Dane County
        - Madison (0.04)
      - Pennsylvania > Allegheny County
        - Pittsburgh (0.14)
      - New York > New York County
        - New York City (0.04)
      - California
    - Canada > Quebec
      - Montreal (0.04)
  - Asia
    - Middle East > Israel
      - Haifa District > Haifa (0.04)
    - India > Karnataka
      - Bengaluru (0.04)
  - Africa > Central African Republic
    - Ombella-M'Poko > Bimbo (0.04)
- Genre:
  - Research Report (0.64)
- Technology: