A survey on knowledge-enhanced multimodal learning

Feb-15-2023, 09:22:17 GMT–AIHub

Multimodal learning is a field of increasing interest in the research community, as it is more closely aligned to the way a human perceives the world: a combination of visual information, language, sounds, and other senses provides complementary insights regarding the world state. Significant advancements in unimodal learning, such as the advent of transformers, boosted the capabilities of multimodal approaches, not only in terms of task-specific performance but also regarding the ability to develop multi-task models. Nevertheless, even such powerful multimodal approaches present shortcomings when it comes to reasoning beyond before-seen knowledge, even if that knowledge refers to simple everyday situations such as "in very cold temperatures the water freezes". This is where external knowledge sources can contribute to enhance model performance by providing such pieces of missing information. The term "knowledge-enhanced" refers to any model utilizing external (or even internal) knowledge sources to extend their predictive capabilities beyond the knowledge that can be extracted from datasets learned during the training phase.

dataset, information, knowledge, (15 more...)

AIHub

Feb-15-2023, 09:22:17 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks (1.00)
  - Cognitive Science > Problem Solving (0.89)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found