Compass-v3: Scaling Domain-Specific LLMs for Multilingual E-Commerce in Southeast Asia
arXiv.org Artificial Intelligence
Large language models (LLMs) excel in general-domain applications, yet their performance often degrades in specialized tasks requiring domain-specific knowledge. E-commerce is particularly challenging, as its data are noisy, heterogeneous, multilingual, and highly dynamic. We present Compass-v3, a vertical-domain Mixture-of-Experts (MoE) model with 245B total parameters and 71B active per token, designed for Southeast Asian e-commerce. Compass-v3 adopts fewer but larger experts, combined with hardware-efficient optimizations, such as intra-node expert parallelism and a customized memcpy operator, to maximize GPU utilization. The model is trained on 12T tokens of curated multilingual corpora and large-scale synthetic e-commerce instructions using a mixed-training strategy. To enhance alignment, we propose Optimal-Transport Direct Preference Optimization (OTPO), which captures token-level distinctions and improves instruction adherence in commerce-specific scenarios. Extensive evaluations demonstrate that Compass-v3 delivers state-of-the-art e-commerce performance, surpassing DeepSeek-V3.1, the GPT-4 series, and Qwen3-235B. Moreover, Compass-v3 demonstrates strong multilingual capability across low-resource Southeast Asian languages (Indonesian, Thai, Filipino, Vietnamese, Malay, Tagalog) and Portuguese while sustaining competitive performance on general benchmarks. It has already been widely deployed on Shopee's industrial-scale e-commerce platform, where it is gradually replacing OpenAI traffic and now accounts for over 70% of total LLM usage, highlighting its dual strengths in specialized commerce expertise and broad linguistic competence.
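The gap between total and active parameters comes from sparse expert routing: each token is sent to only a few experts, so only those experts' weights take part in the computation. The following is a minimal sketch of generic top-k routing in plain Python, not the Compass-v3 router. The abstract does not give the expert count, the value of k, or the gating function, so the four toy experts, the softmax gate, and the names `route_top_k` and `moe_forward` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized over the selected experts so they sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only.

    Experts that are not selected are never called, which is why the
    active parameter count is smaller than the total.
    """
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Hypothetical toy setup: four scalar "experts" that scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for one token

selected = route_top_k(logits, k=2)
y = moe_forward(1.0, experts, logits, k=2)

# Figures quoted in the abstract: 71B active out of 245B total parameters.
active_fraction = 71 / 245
print(selected, y, f"{active_fraction:.1%}")
```

With the abstract's figures, roughly 29% of the parameters are active for any given token. That share is consistent with the stated choice of fewer but larger experts, although the abstract does not say how the 71B is split between experts and shared layers.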
Sep-12-2025
- Country:
  - Asia
    - Malaysia (0.04)
    - Philippines (0.04)
    - Singapore (0.04)
    - Southeast Asia (0.43)
    - Thailand (0.04)
    - Vietnam (0.04)
  - Europe
    - France (0.04)
    - Italy > Calabria
      - Catanzaro Province > Catanzaro (0.04)
  - North America > United States (0.04)
  - South America > Brazil (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Information Technology > Services > e-Commerce Services (1.00)
- Technology: