AITopics | strong modality


strong modality


Balanced Multimodal Learning via Mutual Information

Xie, Rongrong, Sanguinetti, Guido

arXiv.org Artificial Intelligence, Nov-4-2025

Multimodal learning aims to integrate complementary signals from diverse data types, yet in practice one modality often dominates training when information content, data quality, or sample size are imbalanced. This modality imbalance suppresses the benefits of integration and is especially problematic in biomedical applications such as multi-omics disease subtyping, where cohorts are small and assays vary in noise and coverage. Foundational syntheses emphasize fusion, alignment, and coordination as core challenges, but principled mechanisms that explicitly counter modality imbalance while preserving useful cross-modal structure remain limited [Baltrušaitis et al., 2018]. We propose a balanced multimodal framework for multi-omics classification that combines three ideas: (i) graph-based encoders that exploit cross-sample structure; (ii) cross-modal knowledge transfer to strengthen weaker modalities; and (iii) a multitask-style optimization procedure that adaptively reweights unimodal and multimodal losses based on performance signals and cross-modal dependence. Concretely, we employ a revised graph convolutional encoder in which node features may derive from a single modality, while edges are constructed from a fused similarity network across modalities. We then pretrain weaker modalities via knowledge distillation from a stronger teacher to transfer predictive structure without overfitting [Hinton et al., 2015, Furlanello et al., 2018]. Finally, we train the joint model with dynamic loss balancing so that no single modality dictates the gradients, leveraging advances in multitask optimization [Chen et al., 2018, Kendall et al., 2018].
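The distillation step mentioned in the abstract can be illustrated with the generic soft-target loss in the style of Hinton et al. [2015]. This is a minimal sketch, not the authors' implementation: the function names, the `temperature` and `alpha` parameters, and their default values are assumptions made for illustration, and the abstract does not say how the teacher and student terms are actually combined.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Mix a soft-target KL term with hard-label cross-entropy.

    student_logits: outputs of the weaker-modality model (assumed name).
    teacher_logits: outputs of the stronger-modality model (assumed name).
    alpha and temperature are illustrative hyperparameters.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student_soft) if pt > 0)
    # Standard cross-entropy against the true class index.
    cross_entropy = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-target gradient scale comparable
    # across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the weighted cross-entropy remains, which is a quick sanity check for the sketch.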

artificial intelligence, machine learning, modality, (14 more...)

arXiv:2511.00987

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.69)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)


Rebalanced Vision-Language Retrieval Considering Structure-Aware Distillation

Yang, Yang, Xi, Wenjuan, Zhou, Luping, Tang, Jinhui

arXiv.org Artificial Intelligence, Dec-14-2024

Vision-language retrieval aims to search for similar instances in one modality based on queries from another modality. The primary objective is to learn cross-modal matching representations in a latent common space. The assumption underlying cross-modal matching is modal balance, where each modality contains sufficient information to represent the others. However, noise interference and modality insufficiency often lead to modal imbalance, making it a common phenomenon in practice. The impact of imbalance on retrieval performance remains an open question. In this paper, we first demonstrate that ultimate cross-modal matching is generally sub-optimal for cross-modal retrieval when imbalanced modalities exist. The structure of instances in the common space is inherently affected by imbalanced modalities, which poses a challenge to cross-modal similarity measurement. To address this issue, we emphasize the importance of meaningful structure-preserved matching. Accordingly, we propose a simple yet effective method to rebalance cross-modal matching by learning structure-preserved matching representations. Specifically, we design a novel multi-granularity cross-modal matching that incorporates structure-aware distillation alongside the cross-modal matching loss. While the cross-modal matching loss constrains instance-level matching, the structure-aware distillation further regularizes the geometric consistency between learned matching representations and intra-modal representations through the developed relational matching. Extensive experiments on different datasets affirm the superior cross-modal retrieval performance of our approach, which also enhances single-modal retrieval capabilities compared to the baseline models.
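One way to picture the combination of an instance-level matching loss with a structure-preserving term is sketched below. This is an illustration under stated assumptions, not the paper's method: the InfoNCE-style matching loss, the squared difference between pairwise cosine-similarity matrices as the "relational" term, and the names `matching_loss`, `structure_loss`, `total_loss`, `temperature`, and `weight` are all hypothetical choices made here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(rows, cols):
    """Pairwise cosine similarities between two lists of vectors."""
    return [[cosine(r, c) for c in cols] for r in rows]


def matching_loss(image_emb, text_emb, temperature=0.1):
    """Instance-level term: the i-th image should match the i-th text."""
    sims = similarity_matrix(image_emb, text_emb)
    loss = 0.0
    for i, row in enumerate(sims):
        scaled = [s / temperature for s in row]
        peak = max(scaled)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
        loss += log_norm - scaled[i]
    return loss / len(sims)


def structure_loss(matching_emb, intra_emb):
    """Relational term: the pairwise geometry of the learned embeddings
    should agree with that of the intra-modal (reference) embeddings."""
    learned = similarity_matrix(matching_emb, matching_emb)
    reference = similarity_matrix(intra_emb, intra_emb)
    n = len(learned)
    return sum((learned[i][j] - reference[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


def total_loss(image_emb, text_emb, image_intra, text_intra, weight=1.0):
    """Matching loss plus a weighted structure term for each modality."""
    structure = (structure_loss(image_emb, image_intra)
                 + structure_loss(text_emb, text_intra))
    return matching_loss(image_emb, text_emb) + weight * structure
```

In this sketch the structure term is zero whenever the learned embeddings have the same pairwise cosine geometry as the reference embeddings, so it penalizes only distortions of the intra-modal neighborhood structure.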

artificial intelligence, machine learning, natural language, (19 more...)

arXiv:2412.10761

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
Asia > China > Jiangsu Province > Nanjing (0.05)
North America > United States (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Auxiliary Information Regularized Machine for Multiple Modality Feature Learning

Yang, Yang (Nanjing University) | Ye, Han-Jia (Nanjing University) | Zhan, De-Chuan (Nanjing University) | Jiang, Yuan (Nanjing University)

AAAI Conferences, Jul-15-2015

In real world applications, data often come with multiple modalities. Previous works assumed that each modality contains sufficient information for the target and can be treated with equal importance. However, different modalities are often of varying importance in real tasks, e.g., the facial feature is a weak modality and the fingerprint feature is a strong modality in ID recognition. In this paper, we point out that different modalities should be treated with different strategies and propose the Auxiliary information Regularized Machine (ARM), which works by extracting the most discriminative feature subspace of the weak modality while regularizing the …

It is notable that strong modal features can lead to better performance but are more expensive, and therefore a group of serialized feature extraction methods were proposed. These methods extract weak modal features first, and then extract stronger modal features gradually to improve performance and reduce the overall cost as well. Marcialis et al. [2010] proposed a serial fusion technique for multiple biometric modal features through extracting gait information and face information step by step; Zhang et al. [2014] addressed serialized multi-modal learning techniques in a semi-supervised learning scenario. These methods handle strong and weak modalities independently while leaving the unsatisfactory performance on the weak modality unexplained.
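The serialized feature extraction idea described above (cheap weak-modality features first, expensive strong-modality features only when needed) can be sketched as a simple cascade. This is a hedged illustration, not the method of Marcialis et al. or Zhang et al.: the confidence threshold rule, the function `serial_predict`, and the toy models are all assumptions introduced here.

```python
def serial_predict(weak_features, acquire_strong, weak_model, strong_model,
                   threshold=0.8):
    """Predict from weak features; acquire strong features only if unsure.

    acquire_strong is a zero-argument callable, so the expensive
    strong-modality extraction runs only when it is actually needed.
    Returns (label, used_strong).
    """
    label, confidence = weak_model(weak_features)
    if confidence >= threshold:
        return label, False
    strong_features = acquire_strong()
    label, _ = strong_model(weak_features, strong_features)
    return label, True


# Toy stand-ins (hypothetical): a scalar score per modality.
def toy_weak_model(x):
    return ("accept" if x >= 0 else "reject", min(1.0, abs(x)))


def toy_strong_model(weak, strong):
    return ("accept" if weak + strong >= 0 else "reject", 1.0)


confident = serial_predict(0.9, lambda: -2.0, toy_weak_model, toy_strong_model)
uncertain = serial_predict(0.1, lambda: -2.0, toy_weak_model, toy_strong_model)
```

Here `confident` is decided from the weak feature alone, while `uncertain` falls below the threshold and triggers the strong-modality call, which is where the cost saving in such serial schemes comes from.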

modality, strong modality, weak modality, (14 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Texas (0.05)
Asia > China > Beijing > Beijing (0.05)
North America > Canada > Quebec (0.04)
(10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
