Mapping Transformer Leveraged Embeddings for Cross-Lingual Document Representation

Tashu, Tsegaye Misikir, Kontos, Eduard-Raul, Sabatelli, Matthia, Valdenegro-Toro, Matias

arXiv.org Artificial Intelligence 

The rapid expansion of online information from diverse sources and the growing multilingual nature of the web underscore the increasing importance of information retrieval (IR) and recommender systems (RS). Today's web is no longer limited to a single language; it increasingly spans many languages, mirroring the multilingual abilities of its global users Steichen et al. [2014], Tashu et al. [2023]. This diversity highlights the urgent need for cross-lingual recommender systems. Traditional recommender systems often prioritize content in a single language, sidelining a wealth of multilingual documents that may hold valuable insights. This gap has led to the emergence of cross-language information access, where recommender systems suggest items in different languages based on user queries Lops et al. [2010], Narducci et al. [2016], Salamon et al. [2021]. Machine Learning and Deep Learning, which have significantly impacted language representation and processing, are pivotal to enhancing information retrieval and recommender systems, especially in the realm of document recommendation.

The result presented in this work is based on Eduard-Raul Kontos's bachelor project while he was at the University of Groningen.