EMMA-X: An EM-like Multilingual Pre-training Algorithm for Cross-lingual Representation Learning
Neural Information Processing Systems
Expressing universal semantics common to all languages helps in understanding the meanings of complex and culture-specific sentences. The research theme underlying this scenario focuses on learning universal representations across languages using massive parallel corpora. However, because parallel data is sparse and scarce, learning authentic "universals" for any two languages remains a major challenge. In this paper, we propose EMMA-X: an EM-like Multilingual pre-training Algorithm, to learn (X)Cross-lingual universals with the aid of abundant multilingual non-parallel data.
Apr-25-2026, 18:37:27 GMT
- Country:
- Europe (1.00)
- North America > United States
- Minnesota (0.28)
- Industry:
- Government (0.67)
- Law (0.46)
- Technology: