Multilingual Entity Linking Using Dense Retrieval
–arXiv.org Artificial Intelligence
Entity linking (EL) is the computational process of connecting textual mentions to corresponding entities. Like many areas of natural language processing, the EL field has greatly benefited from deep learning, leading to significant performance improvements. However, present-day approaches are expensive to train and rely on diverse data sources, complicating their reproducibility. In this thesis, we develop multiple systems that are fast to train, demonstrating that competitive entity linking can be achieved without a large GPU cluster. Moreover, we train on a publicly available dataset, ensuring reproducibility and accessibility. We evaluate our models on nine languages, giving an accurate overview of their strengths. Furthermore, we offer a detailed analysis of training hyperparameters for bi-encoders, a popular approach in EL, to guide their informed selection. Overall, our work shows that building competitive neural-network-based EL systems that operate in multiple languages is possible even with limited resources, thus making EL more approachable.
May-13-2024
- Genre:
- Research Report > New Finding (0.47)