ZhichunRoad at Amazon KDD Cup 2022: MultiTask Pre-Training for E-Commerce Product Search
Xuange Cui, Wei Xiong, Songlin Wang
In this paper, we propose a robust multilingual model to improve the quality of search results. Our model not only leverages a processed, class-balanced dataset, but also benefits from multitask pre-training that leads to more general representations. In the pre-training stage, we adopt a masked language modeling (MLM) task, a classification task, and a contrastive learning task to achieve considerable performance. In the fine-tuning stage, we use confident learning, the exponential moving average method (EMA), adversarial training (FGM), and the regularized dropout strategy (R-Drop) to improve the model's generalization and robustness. Moreover, we use a multi-granular semantic unit to mine the textual metadata of queries and products, enhancing the model's representations. Our approach obtained competitive results, ranking in the top 8 in three tasks. We release the source code and pre-trained models associated with this work.
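Of the fine-tuning regularizers the abstract names, R-Drop is easy to illustrate: the same batch is forwarded twice so dropout draws two independent masks, and a symmetric KL penalty pulls the two predictive distributions together. The sketch below is a minimal illustration, not the authors' released implementation; the Hugging Face-style model interface (a forward pass returning an object with a .logits field) and the weighting coefficient alpha are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def r_drop_loss(model, inputs, labels, alpha=4.0):
    """R-Drop sketch: two stochastic forward passes, cross-entropy on
    both, plus a symmetric KL consistency penalty between them.
    `alpha` is a hypothetical weighting, chosen for illustration."""
    logits1 = model(**inputs).logits  # first pass, one dropout mask
    logits2 = model(**inputs).logits  # second pass, a new dropout mask

    # Task loss: cross-entropy averaged over the two passes.
    ce = 0.5 * (F.cross_entropy(logits1, labels)
                + F.cross_entropy(logits2, labels))

    # Consistency loss: symmetric KL between the two distributions.
    log_p = F.log_softmax(logits1, dim=-1)
    log_q = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(log_p, log_q, reduction="batchmean", log_target=True)
                + F.kl_div(log_q, log_p, reduction="batchmean", log_target=True))

    return ce + alpha * kl
```

The two forward passes differ only in their dropout masks, so the KL term penalizes the model's sensitivity to dropout noise, which is the regularization effect R-Drop relies on.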
arXiv.org Artificial Intelligence
Jan-31-2023
- Country:
- Asia > China (0.15)
- North America > United States (0.16)
- Genre:
- Research Report (0.53)
- Industry:
- Information Technology > Services > e-Commerce Services (0.52)