e-commerce platform
Improving Visual Recommendation on E-commerce Platforms Using Vision-Language Models
Yada, Yuki, Akiyama, Sho, Watanabe, Ryo, Ueno, Yuta, Shido, Yusuke, Rusli, Andre
On large-scale e-commerce platforms with tens of millions of active monthly users, recommending visually similar products is essential for enabling users to efficiently discover items that align with their preferences. This study presents the application of a vision-language model (VLM) -- which has demonstrated strong performance in image recognition and image-text retrieval tasks -- to product recommendations on Mercari, a major consumer-to-consumer marketplace used by more than 20 million monthly users in Japan. Specifically, we fine-tuned SigLIP, a VLM employing a sigmoid-based contrastive loss, using one million product image-title pairs from Mercari collected over a three-month period, and developed an image encoder for generating item embeddings used in the recommendation system. Our evaluation comprised an offline analysis of historical interaction logs and an online A/B test in a production environment. In offline analysis, the model achieved a 9.1% improvement in nDCG@5 compared with the baseline. In the online A/B test, the click-through rate improved by 50% and the conversion rate improved by 14% compared with the existing model. These results demonstrate the effectiveness of VLM-based encoders for e-commerce product recommendations and provide practical insights into the development of visual similarity-based recommendation systems.
- Asia > Japan (0.24)
- Europe > Czechia > Prague (0.06)
- North America > United States > New York > New York County > New York City (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.89)
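The offline metric in the abstract above, nDCG@5, can be illustrated with a minimal sketch. It assumes binary or graded relevance labels taken from interaction logs and computes the ideal ranking from the same candidate list; the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, and this is not the authors' evaluation code.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k ranked items."""
    # Rank positions are 1-based, so the discount is log2(position + 1).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k normalized by the DCG@k of the ideally ordered list."""
    # Assumption: the ideal ordering is taken over the same candidate list.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant item among the candidates
    return dcg_at_k(relevances, k) / ideal


# Hypothetical example: the only clicked item is ranked second out of five.
score = ndcg_at_k([0, 1, 0, 0, 0], k=5)
```

A relative gain such as the reported 9.1% would then be the ratio of the mean nDCG@5 of the two models over the same set of logged sessions, minus one.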
LLP: LLM-based Product Pricing in E-commerce
Wang, Hairu, You, Sheng, Zhang, Qiheng, Xie, Xike, Han, Shuguang, Wu, Yuchen, Huang, Fei, Chen, Jufeng
Unlike sellers on Business-to-Consumer e-commerce platforms (e.g., Amazon), inexperienced individual sellers on Consumer-to-Consumer platforms (e.g., eBay) often face significant challenges in setting prices for their second-hand products efficiently. Therefore, numerous methods have been proposed for automating price prediction. However, most of them are based on static regression models, which suffer from poor generalization performance and fail to capture market dynamics (e.g., the price of a used iPhone decreases over time). Inspired by recent breakthroughs in Large Language Models (LLMs), we introduce LLP, the first LLM-based generative framework for second-hand product pricing. LLP first retrieves similar products to better align with dynamic market changes. Afterwards, it leverages the LLMs' nuanced understanding of key pricing information in free-form text to generate accurate price suggestions. To strengthen the LLMs' domain reasoning over retrieved products, we apply a two-stage optimization, supervised fine-tuning (SFT) followed by group relative policy optimization (GRPO), on a dataset built via bidirectional reasoning. Moreover, LLP employs a confidence-based filtering mechanism to reject unreliable price suggestions. Extensive experiments demonstrate that LLP substantially surpasses existing methods while generalizing well to unseen categories. We have successfully deployed LLP on Xianyu, China's largest second-hand e-commerce platform, where it significantly outperforms the previous pricing method. Under the same 30% product coverage, it raises the static adoption rate (SAR) from 40% to 72%, and maintains a strong SAR of 47% even at 90% recall.
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Zhejiang Province > Hangzhou (0.05)
- North America > United States > Texas > Travis County > Austin (0.04)
- Consumer Products & Services (1.00)
- Information Technology > Services > e-Commerce Services (0.92)
- Banking & Finance > Trading (0.88)
GSID: Generative Semantic Indexing for E-Commerce Product Understanding
Yang, Haiyang, Xie, Qinye, Zhang, Qingheng, Chen, Liyu, Zou, Huike, Lian, Chengbao, Han, Shuguang, Huang, Fei, Chen, Jufeng, Zheng, Bo
Structured representation of product information is a major bottleneck for the efficiency of e-commerce platforms, especially second-hand e-commerce platforms. Currently, most product information is organized based on manually curated product categories and attributes, which often fail to adequately cover long-tail products and do not align well with buyer preferences. To address these problems, we propose Generative Semantic InDexings (GSID), a data-driven approach to generating structured product representations. GSID consists of two key components: (1) pre-training on unstructured product metadata to learn in-domain semantic embeddings, and (2) generating more effective semantic codes tailored for downstream product-centric applications. Extensive experiments validate the effectiveness of GSID, and it has been successfully deployed on a real-world e-commerce platform, achieving promising results on product understanding and other downstream tasks.
MEVITA: Open-Source Bipedal Robot Assembled from E-Commerce Components via Sheet Metal Welding
Kawaharazuka, Kento, Sawaguchi, Shogo, Iwata, Ayumu, Yoneda, Keita, Suzuki, Temma, Okada, Kei
Various bipedal robots have been developed to date, and in recent years, there has been a growing trend toward releasing these robots as open-source platforms. This shift is fostering an environment in which anyone can freely develop bipedal robots and share their knowledge, rather than relying solely on commercial products. However, most existing open-source bipedal robots are designed to be fabricated using 3D printers, which limits their scalability in size and often results in fragile structures. On the other hand, some metal-based bipedal robots have been developed, but they typically involve a large number of components, making assembly difficult, and in some cases, the parts themselves are not readily available through e-commerce platforms. To address these issues, we developed MEVITA, an open-source bipedal robot that can be built entirely from components available via e-commerce. Aiming for the minimal viable configuration for a bipedal robot, we utilized sheet metal welding to integrate complex geometries into single parts, thereby significantly reducing the number of components and enabling easy assembly for anyone. Through reinforcement learning in simulation and Sim-to-Real transfer, we demonstrated robust walking behaviors across various environments, confirming the effectiveness of our approach. All hardware, software, and training environments can be obtained from https://github.com/haraduka/mevita .
- Education (0.88)
- Information Technology > Services > e-Commerce Services (0.82)
Generative Modeling with Multi-Instance Reward Learning for E-commerce Creative Optimization
Gu, Qiaolei, Li, Yu, Zeng, DingYi, Wang, Lu, Pang, Ming, Peng, Changping, Lin, Zhangang, Law, Ching, Shao, Jingping
In e-commerce advertising, selecting the most compelling combination of creative elements -- such as titles, images, and highlights -- is critical for capturing user attention and driving conversions. However, existing methods often evaluate creative components individually, failing to navigate the exponentially large search space of possible combinations. To address this challenge, we propose a novel framework named GenCO that integrates generative modeling with multi-instance reward learning. Our unified two-stage architecture first employs a generative model to efficiently produce a diverse set of creative combinations. This generative process is optimized with reinforcement learning, enabling the model to effectively explore and refine its selections. Next, to overcome the challenge of sparse user feedback, a multi-instance learning model attributes combination-level rewards, such as clicks, to the individual creative elements. This allows the reward model to provide a more accurate feedback signal, which in turn guides the generative model toward creating more effective combinations. Deployed on a leading e-commerce platform, our approach has significantly increased advertising revenue, demonstrating its practical value. Additionally, we are releasing a large-scale industrial dataset to facilitate further research in this important domain.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.69)
- Information Technology > Data Science > Data Mining > Big Data (0.68)
SQLord: A Robust Enterprise Text-to-SQL Solution via Reverse Data Generation and Workflow Decomposition
Cheng, Song, Cheng, Qiannan, Jin, Linbo, Yi, Lei, Zhang, Guannan
Transforming natural language into SQL queries (NL2SQL) is crucial for data-driven business applications. Existing frameworks, trained on open-source datasets, struggle with complex business logic and lack domain-specific data for fine-tuning. Additionally, evaluation methods often require annotated data and executable database environments, which are scarce in real-world scenarios. To address these challenges, we propose SQLord, an enterprise-level NL2SQL framework. First, SQLord introduces a data reverse generation approach to convert raw SQL statements into annotated data for supervised fine-tuning (SFT). Second, it proposes a decomposition method for complex queries using an automated workflow generator. Additionally, SQLord features a comprehensive GPT-Judge evaluation framework, including Execution Evaluation (EXE), Query-SQL Evaluation (QSE), and SQL-SQL Evaluation (SSE), tailored to diverse scenarios. In offline tests, SQLord significantly outperforms state-of-the-art baselines, and its online accuracy consistently exceeds 90%, highlighting SQLord's advantages and effectiveness in complex real-world scenarios. SQLord has been successfully applied across multiple scenarios on the world's largest B2B e-commerce platform.
- Oceania > Australia > New South Wales > Sydney (0.06)
- Asia > China > Zhejiang Province > Hangzhou (0.05)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
- Information Technology > Databases (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Identifying Offline Metrics that Predict Online Impact: A Pragmatic Strategy for Real-World Recommender Systems
A critical challenge in recommender systems is to establish reliable relationships between offline and online metrics that predict real-world performance. Motivated by recent advances in Pareto front approximation, we introduce a pragmatic strategy for identifying offline metrics that align with online impact. A key advantage of this approach is its ability to simultaneously serve multiple test groups, each with distinct offline performance metrics, in an online experiment controlled by a single model. The method is model-agnostic for systems with a neural network backbone, enabling broad applicability across architectures and domains. We validate the strategy through a large-scale online experiment in the field of session-based recommender systems on the OTTO e-commerce platform. The online experiment identifies significant alignments between offline metrics and real-world click-through rate, post-click conversion rate, and units sold. Our strategy provides industry practitioners with a valuable tool for understanding offline-to-online metric relationships and making informed, data-driven decisions.
Real-time and personalized product recommendations for large e-commerce platforms
Tolloso, Matteo, Bacciu, Davide, Mokarizadeh, Shahab, Varesi, Marco
We present a methodology to provide real-time and personalized product recommendations for large e-commerce platforms, specifically focusing on fashion retail. Our approach aims to achieve accurate and scalable recommendations with minimal response times, ensuring user satisfaction, leveraging Graph Neural Networks and parsimonious learning methodologies. Extensive experimentation with datasets from one of the largest e-commerce platforms demonstrates the effectiveness of our approach in forecasting purchase sequences and handling multi-interaction scenarios, achieving efficient personalized recommendations under real-world constraints.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Data Science > Data Mining (0.94)
JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning
Han, Ai, Hu, Junxing, Wei, Pu, Zhang, Zhiqian, Guo, Yuhang, Lu, Jiawei, Zhang, Zicheng
Multi-agent reinforcement learning (MARL) has emerged as a prominent paradigm for increasingly complex tasks. However, joint evolution across heterogeneous agents remains challenging due to cooperative inefficiency and training instability. In this paper, we propose the joint evolution dynamics for MARL called JoyAgents-R1, which first applies Group Relative Policy Optimization (GRPO) to the joint training of heterogeneous multi-agents. By iteratively refining agents' large language models (LLMs) and memories, the method achieves holistic equilibrium with optimal decision-making and memory capabilities. Specifically, JoyAgents-R1 first implements node-wise Monte Carlo sampling on the behavior of each agent across entire reasoning trajectories to enhance GRPO sampling efficiency while maintaining policy diversity. Then, our marginal benefit-driven selection strategy identifies top-$K$ sampling groups with maximal reward fluctuations, enabling targeted agent model updates that improve training stability and maximize joint benefits through cost-effective parameter adjustments. Meanwhile, JoyAgents-R1 introduces an adaptive memory evolution mechanism that repurposes GRPO rewards as cost-free supervisory signals to eliminate repetitive reasoning and accelerate convergence. Experiments across general and domain-specific scenarios demonstrate that JoyAgents-R1 achieves performance comparable to that of larger LLMs while built on smaller open-source models.
You Are What You Bought: Generating Customer Personas for E-commerce Applications
Shi, Yimin, Fei, Yang, Zhang, Shiqi, Wang, Haixun, Xiao, Xiaokui
In e-commerce, user representations are essential for various applications. Existing methods often use deep learning techniques to convert customer behaviors into implicit embeddings. However, these embeddings are difficult to understand and integrate with external knowledge, limiting the effectiveness of applications such as customer segmentation, search navigation, and product recommendations. To address this, our paper introduces the concept of the customer persona. Condensed from a customer's numerous purchasing histories, a customer persona provides a multi-faceted and human-readable characterization of specific purchase behaviors and preferences, such as Busy Parents or Bargain Hunters. This work then focuses on representing each customer by multiple personas from a predefined set, achieving readable and informative explicit user representations. To this end, we propose an effective and efficient solution GPLR. To ensure effectiveness, GPLR leverages pre-trained LLMs to infer personas for customers. To reduce overhead, GPLR applies LLM-based labeling to only a fraction of users and utilizes a random walk technique to predict personas for the remaining customers. We further propose RevAff, which provides an absolute error $ε$ guarantee while improving the time complexity of the exact solution by a factor of at least $O(\frac{ε\cdot|E|N}{|E|+N\log N})$, where $N$ represents the number of customers and products, and $E$ represents the interactions between them. We evaluate the performance of our persona-based representation in terms of accuracy and robustness for recommendation and customer segmentation tasks using three real-world e-commerce datasets. Most notably, we find that integrating customer persona representations improves the state-of-the-art graph convolution-based recommendation model by up to 12% in terms of NDCG@K and F1-Score@K.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Italy (0.05)
- Asia > Singapore > Central Region > Singapore (0.05)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
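The idea of labeling only a fraction of customers with an LLM and predicting personas for the rest via random walks can be sketched as follows. This is a simplified two-step walk (customer to product to customer) on the purchase graph, not the GPLR or RevAff algorithm from the abstract above; the function name `propagate_personas`, its parameters, and the toy data are all hypothetical.

```python
from collections import defaultdict


def propagate_personas(purchases, seed_labels, top_n=1):
    """Score personas for unlabeled customers with a two-step random walk.

    purchases:   dict mapping customer -> set of purchased products
    seed_labels: dict mapping customer -> set of personas (assumed LLM-labeled)
    """
    # Invert the purchase graph: product -> customers who bought it.
    buyers = defaultdict(set)
    for customer, products in purchases.items():
        for product in products:
            buyers[product].add(customer)

    predicted = {}
    for customer, products in purchases.items():
        if customer in seed_labels:
            continue  # already labeled, nothing to predict
        scores = defaultdict(float)
        for product in products:
            step1 = 1.0 / len(products)  # customer -> product
            for other in buyers[product]:
                step2 = 1.0 / len(buyers[product])  # product -> customer
                for persona in seed_labels.get(other, ()):
                    scores[persona] += step1 * step2
        # Highest walk probability first; ties broken alphabetically.
        ranked = sorted(scores, key=lambda p: (-scores[p], p))
        predicted[customer] = ranked[:top_n]
    return predicted


# Hypothetical toy data: customers "a" and "d" carry seed personas.
purchases = {
    "a": {"diapers", "coffee"},
    "b": {"diapers"},
    "c": {"coffee", "coupons"},
    "d": {"coupons"},
}
seeds = {"a": {"Busy Parents"}, "d": {"Bargain Hunters"}}
result = propagate_personas(purchases, seeds, top_n=2)
```

Each persona's score is the probability that a two-step walk from the unlabeled customer ends at a seed customer carrying that persona, which is why customers sharing more purchases with a labeled customer inherit that label more strongly.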