DeepMEL: A Multi-Agent Collaboration Framework for Multimodal Entity Linking
Wang, Fang, Yan, Tianwei, Yang, Zonghao, Hu, Minghao, Zhang, Jun, Luo, Zhunchen, Bai, Xiaoying
arXiv.org Artificial Intelligence
Entity linking is a fundamental task in knowledge graph (KG) construction (Hofer et al., 2024) that aims to link mentions to their corresponding entities in a target knowledge base (KB). It is widely applied in downstream natural language processing (NLP) tasks, such as question answering systems (Sequeda et al., 2024) and intelligent recommendation systems (Chaudhari et al., 2017). Recently, the explosive growth of multimodal data on the Internet has raised new challenges: the quality of online information is often inconsistent, many mentions are ambiguous, and contextual information is frequently incomplete. Under such conditions, relying on a single modality (such as pure text) is often insufficient to resolve reference ambiguity accurately (Gan et al., 2021). Integrating textual and visual modalities can significantly improve the precision and efficiency of disambiguation (Gella et al., 2017). Consequently, multimodal entity linking, which combines textual and visual information to link real-world mentions to their corresponding entities in a multimodal knowledge graph (MMKG), has become a critical research task. For example, as shown in Figure 1, the mention "Apple" may be difficult to disambiguate from text alone, since it could refer to various entities, such as Apple Inc. or the apple (fruit). When both textual and visual information are considered, however, the mention can be linked accurately to the entity "apple (fruit of the apple tree)." Currently, multimodal entity linking models are primarily based on deep learning frameworks, using cross-attention mechanisms (Lu and Elhamifar, 2024) and visual feature encoding techniques (Mokssit et al., 2023) to fuse textual mentions with visual information.
Aug-25-2025
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology (0.87)
- Technology: