Zero-Shot Retrieval for Scalable Visual Search in a Two-Sided Marketplace

Rusli, Andre, Ishimoto, Shoma, Akiyama, Sho, Singh, Aman Kumar

Aug-11-2025–arXiv.org Artificial Intelligence

Visual search offers an intuitive way for customers to explore diverse product catalogs, particularly in consumer-to-consumer (C2C) marketplaces where listings are often unstructured and visually driven. This paper presents a scalable visual search system deployed in Mercari's C2C marketplace, where end-users act as buyers and sellers. We evaluate recent vision-language models for zero-shot image retrieval and compare their performance with an existing fine-tuned baseline. The system integrates real-time inference and background indexing workflows, supported by a unified embedding pipeline optimized through dimensionality reduction. Offline evaluation using user interaction logs shows that the multilingual SigLIP model outperforms other models across multiple retrieval metrics, achieving a 13.3% increase in nDCG@5 over the baseline. A one-week online A/B test in production further confirms real-world impact, with the treatment group showing substantial gains in engagement and conversion, up to a 40.9% increase in transaction rate via image search. Our findings highlight that recent zero-shot models can serve as a strong and practical baseline for production use, which enables teams to deploy effective visual search systems with minimal overhead, while retaining the flexibility to fine-tune based on future data or domain-specific needs.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

Aug-11-2025

arXiv.org PDF

Add feedback

Country:
- North America (0.48)
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)

Genre:
- Research Report > Experimental Study (0.69)

Industry:
- Information Technology (0.48)

Technology:
- Information Technology
  - Information Management > Search (1.00)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language > Large Language Model (0.94)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found