GSID: Generative Semantic Indexing for E-Commerce Product Understanding

Yang, Haiyang, Xie, Qinye, Zhang, Qingheng, Chen, Liyu, Zou, Huike, Lian, Chengbao, Han, Shuguang, Huang, Fei, Chen, Jufeng, Zheng, Bo

arXiv.org Artificial Intelligence 

Structured representation of product information is a major bottleneck for the efficiency of e-commerce platforms, especially in second-hand ecommerce platforms. Currently, most product information are organized based on manually curated product categories and attributes, which often fail to adequately cover long-tail products and do not align well with buyer preference. To address these problems, we propose \textbf{G}enerative \textbf{S}emantic \textbf{I}n\textbf{D}exings (GSID), a data-driven approach to generate product structured representations. GSID consists of two key components: (1) Pre-training on unstructured product metadata to learn in-domain semantic embeddings, and (2) Generating more effective semantic codes tailored for downstream product-centric applications. Extensive experiments are conducted to validate the effectiveness of GSID, and it has been successfully deployed on the real-world e-commerce platform, achieving promising results on product understanding and other downstream tasks.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found