AITopics | joint representation

Neural Information Processing Systems http://nips.cc/

arxiv preprint arxiv, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Generalizable Multi-Linear Attention Network

Neural Information Processing Systems | Apr-25-2026, 18:55:21 GMT

The majority of existing multimodal sequential learning methods focus on how to obtain powerful individual representations and neglect to effectively capture the multimodal joint representation. Bilinear attention network (BAN) is a commonly used integration method, which leverages tensor operations to associate the features of different modalities. However, BAN scales poorly to additional modalities, since the computational complexity of the attention map increases exponentially with the number of modalities. To address this, we propose a new method called generalizable multi-linear attention network (MAN), which can associate more modalities with acceptable complexity through hierarchical approximation decomposition. Specifically, since softmax attention kernels cannot be decomposed directly into linear operations, we adopt the addition random features (ARF) mechanism to approximate the non-linear softmax functions, supported by theoretical analysis. Furthermore, we introduce local sequential constraints, which can be conveniently combined with ARF, as positional information. We conduct extensive experiments on several datasets for the corresponding tasks; the results show that MAN achieves competitive results compared with baseline methods, showcasing the effectiveness of our contributions.
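The abstract above says the softmax kernel can be approximated with random features so that it factors into linear operations. As a rough illustration only, the sketch below uses the generic positive random feature estimator for exp(q · k). It is not the paper's ARF mechanism, whose details the abstract does not give. The function names, the feature count, and the example vectors are all assumptions made for this sketch.

```python
import math
import random


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def sample_omegas(num_features, dim, seed=0):
    """Draw num_features random directions from a standard normal (assumed setup)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]


def positive_random_features(x, omegas):
    """Map x to phi(x) such that phi(q) . phi(k) is an unbiased estimate of exp(q . k).

    Uses the identity E_w[exp(w.q - |q|^2/2) * exp(w.k - |k|^2/2)] = exp(q.k)
    for w drawn from a standard normal distribution.
    """
    half_sq_norm = dot(x, x) / 2.0
    scale = 1.0 / math.sqrt(len(omegas))
    return [scale * math.exp(dot(w, x) - half_sq_norm) for w in omegas]


# Hypothetical query and key vectors, chosen only for the demonstration.
q = [0.3, -0.2, 0.1]
k = [0.1, 0.4, -0.3]
omegas = sample_omegas(num_features=20000, dim=3)

exact = math.exp(dot(q, k))
phi_q = positive_random_features(q, omegas)
phi_k = positive_random_features(k, omegas)
approx = dot(phi_q, phi_k)

print(f"exact kernel  : {exact:.4f}")
print(f"random feature: {approx:.4f}")
```

Because the kernel is replaced by an inner product of feature maps, the query-side and key-side features can be computed separately and combined with ordinary linear algebra, which is the property the abstract relies on for handling more modalities.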

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.48)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Multimodal Residual Learning for Visual QA

Neural Information Processing Systems | Mar-17-2026, 10:02:11 GMT

Deep neural networks continue to advance the state of the art in image recognition tasks with various methods. However, applications of these methods to multimodality remain limited. We present Multimodal Residual Networks (MRN) for the multimodal residual learning of visual question-answering, which extends the idea of deep residual learning. Unlike deep residual learning, MRN effectively learns the joint representation from visual and language information. The main idea is to use element-wise multiplication for the joint residual mappings, exploiting the residual learning of the attentional models in recent studies. Various alternative models introduced by multimodality are explored based on our study. We achieve state-of-the-art results on the Visual QA dataset for both Open-Ended and Multiple-Choice tasks. Moreover, we introduce a novel method to visualize the attention effect of the joint representations for each learning block using the back-propagation algorithm, even though the visual features are collapsed without spatial information.
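The core idea stated above, a joint residual mapping formed by element-wise multiplication and added back through a shortcut, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' architecture: the tanh non-linearity, the single projection per modality, the identity shortcut on the question vector, the dimensions, and the random weights are all hypothetical choices made for the example.

```python
import math
import random


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def tanh_vec(x):
    """Element-wise tanh (an assumed choice of non-linearity)."""
    return [math.tanh(v) for v in x]


def residual_block(q, v, w_q, w_v):
    """One joint residual block: q + tanh(W_q q) * tanh(W_v v), element-wise."""
    q_proj = tanh_vec(linear(w_q, q))
    v_proj = tanh_vec(linear(w_v, v))
    # Joint residual mapping: element-wise product of the two modalities.
    joint = [a * b for a, b in zip(q_proj, v_proj)]
    # Shortcut connection: add the question embedding back.
    return [qi + ji for qi, ji in zip(q, joint)]


def random_matrix(rows, cols, rng):
    """Small random weights, for illustration only (nothing is trained here)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
question = [0.2, -0.1, 0.4, 0.0]            # hypothetical question embedding
visual = [0.5, 0.3, -0.2, 0.1, 0.7, -0.6]   # hypothetical visual feature vector
w_q = random_matrix(4, 4, rng)
w_v = random_matrix(4, 6, rng)

out = residual_block(question, visual, w_q, w_v)
print(out)
```

Stacking several such blocks, each taking the previous block's output in place of `question`, would give a deeper joint representation; how many blocks to use and how the shortcut is parameterized are design choices this sketch leaves open.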

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Genre: Research Report (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

50d6dbc809b0dc96f7f1090810537acc-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 18:17:09 GMT

artificial intelligence, machine learning, setup, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

50d6dbc809b0dc96f7f1090810537acc-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 18:17:06 GMT

artificial intelligence, attractor, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.30)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection

Neural Information Processing Systems | Feb-11-2026, 01:33:19 GMT

We then propose Generalized Focal Loss (GFL) that generalizes Focal Loss from its discrete form to the continuous version for successful optimization.

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Generalizable Multi-Linear Attention Network

Neural Information Processing Systems | Feb-8-2026, 13:26:05 GMT

The majority of existing multimodal sequential learning methods focus on how to obtain powerful individual representations and neglect to effectively capture the multimodal joint representation. Bilinear attention network (BAN) is a commonly used integration method, which leverages tensor operations to associate the features of different modalities.

artificial intelligence, machine learning, transformer, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Multimodal Residual Learning for Visual QA

Neural Information Processing Systems | Nov-21-2025, 15:06:48 GMT

Deep neural networks continue to advance the state of the art in image recognition tasks with various methods. However, applications of these methods to multimodality remain limited. We present Multimodal Residual Networks (MRN) for the multimodal residual learning of visual question-answering, which extends the idea of deep residual learning. Unlike deep residual learning, MRN effectively learns the joint representation from visual and language information. The main idea is to use element-wise multiplication for the joint residual mappings, exploiting the residual learning of the attentional models in recent studies. Various alternative models introduced by multimodality are explored based on our study. We achieve state-of-the-art results on the Visual QA dataset for both Open-Ended and Multiple-Choice tasks. Moreover, we introduce a novel method to visualize the attention effect of the joint representations for each learning block using the back-propagation algorithm, even though the visual features are collapsed without spatial information.

multimodal residual learning, name change, residual learning, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

Bilinear Attention Networks

Jin-Hwa Kim, Jaehyun Jun, Byoung-Tak Zhang

Neural Information Processing Systems | Nov-20-2025, 18:32:02 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Multi-Modal Continual Learning via Cross-Modality Adapters and Representation Alignment with Knowledge Preservation

Evelyn Chee, Wynne Hsu, Mong Li Lee

arXiv.org Artificial Intelligence | Nov-11-2025

Continual learning is essential for adapting models to new tasks while retaining previously acquired knowledge. While existing approaches predominantly focus on uni-modal data, multi-modal learning offers substantial benefits by utilizing diverse sensory inputs, akin to human perception. However, multi-modal continual learning presents additional challenges, as the model must effectively integrate new information from various modalities while preventing catastrophic forgetting. In this work, we propose a pre-trained model-based framework for multi-modal continual learning. Our framework includes a novel cross-modality adapter with a mixture-of-experts structure to facilitate effective integration of multi-modal information across tasks. We also introduce a representation alignment loss that fosters learning of robust multi-modal representations, and regularize relationships between learned representations to preserve knowledge from previous tasks. Experiments on several multi-modal datasets demonstrate that our approach consistently outperforms baselines in both class-incremental and domain-incremental learning, achieving higher accuracy and reduced forgetting.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.3233/FAIA250918

arXiv: 2511.06723

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
