Self-supervised Multi-view Disentanglement for Expansion of Visual Collections

Jain, Nihal; Vaddamanu, Praneetha; Maheshwari, Paridhi; Vinay, Vishwa; Kulkarni, Kuldeep

Feb-4-2023–arXiv.org Artificial Intelligence

Image search engines enable the retrieval of images relevant to a query image. In this work, we consider the setting where a query for similar images is derived from a collection of images. For visual search, the similarity measurements may be made along multiple axes, or views, such as style and color. We assume access to a set of feature extractors, each of which computes representations for a specific view. Our objective is to design a retrieval algorithm that effectively combines similarities computed over representations from multiple views. To this end, we propose a self-supervised learning method for extracting disentangled view-specific representations for images such that the inter-view overlap is minimized. We show how this allows us to compute the intent of a collection as a distribution over views. We show how effective retrieval can be performed by prioritizing candidate expansion images that match the intent of a query collection. Finally, we present a new querying mechanism for image search enabled by composing multiple collections and perform retrieval under this setting using the techniques presented in this paper.
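As a rough illustration of the retrieval idea described above, the sketch below computes a collection's intent as a distribution over views and then ranks candidates by intent-weighted similarity. It is not the authors' method: the self-supervised disentanglement step is not reproduced, and per-view vectors are simply assumed to be given. The function names `collection_intent` and `rank_candidates`, the pairwise-cosine cohesion heuristic, the softmax temperature, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def collection_intent(collection, temperature=0.1):
    """Hypothetical intent estimate: a softmax over per-view cohesion.

    `collection` is a list of images, each a dict mapping a view name
    (e.g. "color", "style") to that view's feature vector. A view in which
    the images agree with each other receives more weight.
    """
    views = list(collection[0].keys())
    cohesion = {}
    for view in views:
        sims = [
            cosine(collection[i][view], collection[j][view])
            for i in range(len(collection))
            for j in range(i + 1, len(collection))
        ]
        cohesion[view] = sum(sims) / len(sims) if sims else 0.0
    # Numerically stable softmax over the cohesion scores.
    peak = max(cohesion.values())
    exps = {v: math.exp((c - peak) / temperature) for v, c in cohesion.items()}
    total = sum(exps.values())
    return {v: e / total for v, e in exps.items()}


def rank_candidates(collection, candidates, temperature=0.1):
    """Return candidate indices sorted by intent-weighted mean similarity."""
    intent = collection_intent(collection, temperature)
    scored = []
    for idx, cand in enumerate(candidates):
        score = 0.0
        for view, weight in intent.items():
            mean_sim = sum(cosine(cand[view], img[view]) for img in collection)
            mean_sim /= len(collection)
            score += weight * mean_sim
        scored.append((score, idx))
    scored.sort(key=lambda pair: -pair[0])
    return [idx for _, idx in scored]


if __name__ == "__main__":
    # Toy query collection: images agree on color but not on style.
    query = [
        {"color": [1.0, 0.0], "style": [1.0, 0.0]},
        {"color": [1.0, 0.0], "style": [0.0, 1.0]},
    ]
    pool = [
        {"color": [0.0, 1.0], "style": [1.0, 0.0]},   # matches style of one image only
        {"color": [1.0, 0.0], "style": [-1.0, 0.0]},  # matches the shared color
    ]
    print(collection_intent(query))
    print(rank_candidates(query, pool))
```

In this toy setup the query images share a color vector but differ in style, so the estimated intent concentrates on the color view and the color-matching candidate is ranked first. A lower `temperature` makes the intent distribution sharper; a higher one moves it toward uniform weighting across views.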

information retrieval, machine learning, pattern recognition

Country:
- Oceania > Australia (0.04)
- North America
  - United States
    - Wisconsin > Dane County
      - Madison (0.04)
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.14)
    - New York > New York County
      - New York City (0.04)
    - California
      - San Diego County > San Diego (0.04)
      - Santa Clara County
        - Stanford (0.04)
        - Palo Alto (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Asia
  - Middle East > Israel
    - Haifa District > Haifa (0.04)
  - India > Karnataka
    - Bengaluru (0.04)
- Africa > Central African Republic
  - Ombella-M'Poko > Bimbo (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Natural Language > Information Retrieval (0.48)
    - Machine Learning
      - Pattern Recognition > Image Matching (0.54)
      - Inductive Learning (0.54)
      - Statistical Learning (0.46)
