Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Chen, Zhuo; Zhang, Yichi; Fang, Yin; Geng, Yuxia; Guo, Lingbing; Chen, Xiang; Li, Qian; Zhang, Wen; Chen, Jiaoyan; Zhu, Yushan; Li, Jiaqi; Liu, Xiaoze; Pan, Jeff Z.; Zhang, Ningyu; Chen, Huajun

Feb-9-2024–arXiv.org Artificial Intelligence

Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the semantic web community's exploration into multi-modal dimensions unlocking new avenues for innovation. In this survey, we carefully review over 300 articles, focusing on KG-aware research in two principal aspects: KG-driven Multi-Modal (KG4MM) learning, where KGs support multi-modal tasks, and Multi-Modal Knowledge Graph (MM4KG), which extends KG studies into the MMKG realm. We begin by defining KGs and MMKGs, then explore their construction progress. Our review includes two primary task categories: KG-aware multi-modal learning tasks, such as Image Classification and Visual Question Answering, and intrinsic MMKG tasks like Multi-modal Knowledge Graph Completion and Entity Alignment, highlighting specific research trajectories. For most of these tasks, we provide definitions and evaluation benchmarks, and outline essential insights for conducting relevant research. Finally, we discuss current challenges and identify emerging trends, such as progress in Large Language Modeling and Multi-modal Pre-training strategies. This survey aims to serve as a comprehensive reference for researchers already involved in or considering delving into KG and multi-modal learning research, offering insights into the evolving landscape of MMKG research and supporting future work.

multi-modal data, multi-modal learning, relation extraction, (16 more...)

Country:
- Africa (0.04)
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - California
    - San Diego County > San Diego (0.04)
    - Los Angeles County > Long Beach (0.04)
- Europe
  - Spain > Aragón (0.04)
  - Poland (0.04)
  - United Kingdom > England
    - Oxfordshire > Oxford (0.13)
  - France > Bourgogne-Franche-Comté
    - Doubs > Besançon (0.04)
- Asia
  - Singapore (0.04)
  - Middle East > Republic of Türkiye
    - Karaman Province > Karaman (0.04)
  - China > Zhejiang Province
    - Hangzhou (0.04)

Genre:
- Research Report (1.00)
- Overview (1.00)

Industry:
- Leisure & Entertainment > Sports (1.00)
- Information Technology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Education (1.00)
- Media (0.92)

Technology:
- Information Technology
  - Communications > Web
    - Semantic Web (1.00)
  - Artificial Intelligence
    - Cognitive Science > Problem Solving (1.00)
    - Representation & Reasoning
      - Semantic Networks (1.00)
      - Ontologies (1.00)
      - Information Fusion (1.00)
      - Expert Systems (1.00)
    - Natural Language
      - Text Processing (1.00)
      - Question Answering (1.00)
      - Large Language Model (1.00)
      - Chatbot (1.00)
      - Information Retrieval (0.93)
    - Machine Learning
      - Statistical Learning (1.00)
      - Neural Networks > Deep Learning (1.00)