MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning

Neural Information Processing Systems 

Multimodal learning has achieved considerable success in recent years. However, current multimodal fusion methods rely on the attention mechanism of Transformers to learn the underlying correlations among multimodal features only implicitly.