AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Can multimodal representation learning by alignment preserve modality-specific information?

Thoreau, Romain, Levillain, Jessie, Derksen, Dawa

arXiv.org Artificial Intelligence Sep-23-2025

Combining multimodal data is a key issue in a wide range of machine learning tasks, including many remote sensing problems. In Earth observation, early multimodal data fusion methods were based on specific neural network architectures and supervised learning. Since then, the scarcity of labeled data has motivated self-supervised learning techniques. State-of-the-art multimodal representation learning techniques leverage the spatial alignment between satellite data from different modalities acquired over the same geographic area in order to foster a semantic alignment in the latent space. In this paper, we investigate how these methods can preserve task-relevant information that is not shared across modalities. First, we show, under simplifying assumptions, when alignment strategies fundamentally lead to an information loss. Then, we support our theoretical insight through numerical experiments in more realistic settings. With this theoretical and empirical evidence, we hope to support new developments in contrastive learning for the combination of multimodal satellite data. Our code and data are publicly available at https://github.com/Romain3Ch216/alg_maclean_25.
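As an illustration of what "alignment in the latent space" can mean in practice, here is a minimal sketch of an InfoNCE-style contrastive loss between paired embeddings from two modalities. The abstract does not state which objective the authors study, so the choice of InfoNCE, the function name `info_nce`, and the `temperature` value are all assumptions made for this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(z_a, z_b, temperature=0.1):
    """Mean InfoNCE loss; z_a[i] and z_b[i] are assumed to embed the same location.

    Hypothetical helper for illustration only, not the paper's objective.
    """
    z_a = [normalize(v) for v in z_a]
    z_b = [normalize(v) for v in z_b]
    total = 0.0
    for i, anchor in enumerate(z_a):
        # Cosine similarity of the anchor with every candidate from the other modality.
        logits = [dot(anchor, other) / temperature for other in z_b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair (index i) as the positive.
        total += log_denominator - logits[i]
    return total / len(z_a)


# Two toy "locations", each embedded by two modalities.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

The loss is low only when matching pairs are closer than mismatched ones, which rewards the encoders for keeping what the two modalities share. That incentive is the setting in which the abstract asks whether modality-specific information survives.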

artificial intelligence, information, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.17943

Genre: Research Report (0.83)

Industry: Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.54)

Leveraging Audio-Visual Data to Reduce the Multilingual Gap in Self-Supervised Speech Models

Blandón, María Andrea Cruz, Aldeneh, Zakaria, Chi, Jie, de Seyssel, Maureen

arXiv.org Artificial Intelligence Sep-23-2025

Self-supervised learning (SSL) has made significant advances in speech representation learning. Models like wav2vec 2.0 and HuBERT have achieved state-of-the-art results in tasks such as speech recognition, particularly in monolingual settings. However, multilingual SSL models tend to underperform their monolingual counterparts on each individual language, especially in multilingual scenarios with few languages, such as the bilingual setting. In this work, we investigate a novel approach to reduce this performance gap by introducing limited visual grounding into bilingual speech SSL models. Our results show that visual grounding benefits both monolingual and bilingual models, with especially pronounced gains for the latter, reducing the multilingual performance gap on zero-shot phonetic discrimination from 31.5% for audio-only models to 8.04% with grounding.
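To put the two reported gap figures side by side, the short calculation below derives the absolute and relative reduction from the numbers quoted in the abstract. The abstract does not say how the gap itself is defined, so the function name `gap_reduction` and the reading of both figures as directly comparable percentages are assumptions.

```python
def gap_reduction(gap_before, gap_after):
    """Return (absolute reduction, reduction as a fraction of the original gap)."""
    absolute = gap_before - gap_after
    relative = absolute / gap_before
    return absolute, relative


# Figures quoted in the abstract: 31.5% (audio-only) and 8.04% (with grounding).
absolute, relative = gap_reduction(31.5, 8.04)
print(f"{absolute:.2f} points, {relative:.1%} of the original gap")
```

Under that reading, grounding removes roughly three quarters of the audio-only gap.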

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.17523

Country:

Asia (0.46)
Europe (0.28)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.90)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.35)

Self-supervised learning of imaging and clinical signatures using a multimodal joint-embedding predictive architecture

Li, Thomas Z., Krishnan, Aravind R., Zuo, Lianrui, Still, John M., Sandler, Kim L., Maldonado, Fabien, Lasko, Thomas A., Landman, Bennett A.

arXiv.org Artificial Intelligence Sep-22-2025

The development of multimodal models for pulmonary nodule diagnosis is limited by the scarcity of labeled data and the tendency for these models to overfit on the training distribution. In this work, we leverage self-supervised learning from longitudinal and multimodal archives to address these challenges. We curate an unlabeled set of patients with CT scans and linked electronic health records from our home institution to power joint embedding predictive architecture (JEPA) pretraining. After supervised finetuning, we show that our approach outperforms an unregularized multimodal model and an imaging-only model in an internal cohort (ours: 0.91, multimodal: 0.88, imaging-only: 0.73 AUC), but underperforms in an external cohort (ours: 0.72, imaging-only: 0.75 AUC). We develop a synthetic environment that characterizes the context in which JEPA may underperform. This work introduces an approach that leverages unlabeled multimodal medical archives to improve predictive models and demonstrates its advantages and limitations in pulmonary nodule diagnosis.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.1547

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Optimal Learning from Label Proportions with General Loss Functions

Applebaum, Lorne, Dick, Travis, Gentile, Claudio, Kaplan, Haim, Koren, Tomer

arXiv.org Artificial Intelligence Sep-19-2025

Motivated by problems in online advertising, we address the task of Learning from Label Proportions (LLP). In this partially-supervised setting, training data consists of groups of examples, termed bags, for which we only observe the average label value. The main goal, however, remains the design of a predictor for the labels of individual examples. We introduce a novel and versatile low-variance de-biasing methodology to learn from aggregate label information, significantly advancing the state of the art in LLP. Our approach exhibits remarkable flexibility, seamlessly accommodating a broad spectrum of practically relevant loss functions across both binary and multi-class classification settings. By carefully combining our estimators with standard techniques, we substantially improve sample complexity guarantees for a large class of losses of practical relevance. We also empirically validate the efficacy of our proposed approach across a diverse array of benchmark datasets, demonstrating compelling empirical advantages over standard baselines.
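The bag setting described above can be made concrete with a small sketch: labels are visible only as per-bag averages, and a predictor for individual examples is scored against those averages. This is a naive proportion-matching baseline, not the de-biased estimator the paper introduces. The names `make_bags` and `bag_proportion_loss`, the toy data, and the squared-error criterion are all assumptions for illustration.

```python
def make_bags(examples, labels, bag_size):
    """Group examples into bags and keep only each bag's average label."""
    bags = []
    for start in range(0, len(examples), bag_size):
        xs = examples[start:start + bag_size]
        ys = labels[start:start + bag_size]
        bags.append((xs, sum(ys) / len(ys)))
    return bags


def bag_proportion_loss(bags, predict):
    """Mean squared gap between a bag's average prediction and its average label."""
    total = 0.0
    for xs, label_mean in bags:
        pred_mean = sum(predict(x) for x in xs) / len(xs)
        total += (pred_mean - label_mean) ** 2
    return total / len(bags)


# Toy one-dimensional data; the individual labels are hidden once bags are formed.
examples = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3]
labels = [0, 1, 1, 0, 1, 0]
bags = make_bags(examples, labels, bag_size=3)

threshold_loss = bag_proportion_loss(bags, lambda x: 1.0 if x > 0.5 else 0.0)
constant_loss = bag_proportion_loss(bags, lambda x: 0.0)
print(threshold_loss, constant_loss)
```

A predictor that matches every bag average is not necessarily correct on individual examples, which is one reason the instance-level goal stated in the abstract calls for more careful estimators than this baseline.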

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.15145

Country: North America (0.15)

Genre: Research Report (0.81)

Industry: Information Technology > Services (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

BabyHuBERT: Multilingual Self-Supervised Learning for Segmenting Speakers in Child-Centered Long-Form Recordings

Charlot, Théo, Kunze, Tarek, Poli, Maxime, Cristia, Alejandrina, Dupoux, Emmanuel, Lavechin, Marvin

arXiv.org Artificial Intelligence Sep-19-2025

Child-centered long-form recordings are essential for studying early language development, but existing speech models trained on clean adult data perform poorly due to acoustic and linguistic differences. We introduce BabyHuBERT, the first self-supervised speech representation model trained on 13,000 hours of multilingual child-centered long-form recordings spanning over 40 languages. We evaluate BabyHuBERT on speaker segmentation, identifying when target children speak versus female adults, male adults, or other children -- a fundamental preprocessing step for analyzing naturalistic language experiences. BabyHuBERT achieves F1-scores from 52.1% to 74.4% across six diverse datasets, consistently outperforming W2V2-LL4300 (trained on English long-forms) and standard HuBERT (trained on clean adult speech). Notable improvements include 13.2 absolute F1 points over HuBERT on Vanuatu and 15.9 points on Solomon Islands corpora, demonstrating effectiveness on underrepresented languages. By sharing code and models, BabyHuBERT serves as a foundation model for child speech research, enabling fine-tuning on diverse downstream tasks.

artificial intelligence, long-form recording, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.15001

Country:

Europe (0.69)
North America > United States (0.29)
Oceania > Solomon Islands (0.25)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.41)

Learning to Retrieve for Environmental Knowledge Discovery: An Augmentation-Adaptive Self-Supervised Learning Framework

Luo, Shiyuan, Yu, Runlong, Qiu, Chonghao, Ghosh, Rahul, Ladwig, Robert, Hanson, Paul C., Xie, Yiqun, Jia, Xiaowei

arXiv.org Artificial Intelligence Sep-19-2025

The discovery of environmental knowledge depends on labeled task-specific data, but is often constrained by the high cost of data collection. Existing machine learning approaches usually struggle to generalize in data-sparse or atypical conditions. To this end, we propose an Augmentation-Adaptive Self-Supervised Learning (A$^2$SL) framework, which retrieves relevant observational samples to enhance modeling of the target ecosystem. Specifically, we introduce a multi-level pairwise learning loss to train a scenario encoder that captures varying degrees of similarity among scenarios. These learned similarities drive a retrieval mechanism that supplements a target scenario with relevant data from different locations or time periods. Furthermore, to better handle variable scenarios, particularly under atypical or extreme conditions where traditional models struggle, we design an augmentation-adaptive mechanism that selectively enhances these scenarios through targeted data augmentation. Using freshwater ecosystems as a case study, we evaluate A$^2$SL in modeling water temperature and dissolved oxygen dynamics in real-world lakes. Experimental results show that A$^2$SL significantly improves predictive accuracy and enhances robustness in data-scarce and atypical scenarios. Although this study focuses on freshwater ecosystems, the A$^2$SL framework offers a broadly applicable solution in various scientific domains.

artificial intelligence, inductive learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.14563

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.88)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Self-supervised learning on gene expression data

Dradjat, Kevin, Hamidi, Massinissa, Bartet, Pierre, Hanczar, Blaise

arXiv.org Artificial Intelligence Sep-18-2025

Predicting phenotypes from gene expression data is a crucial task in biomedical research, enabling insights into disease mechanisms, drug responses, and personalized medicine. Traditional machine learning and deep learning rely on supervised learning, which requires large quantities of labeled data that are costly and time-consuming to obtain in the case of gene expression data. Self-supervised learning has recently emerged as a promising approach to overcome these limitations by extracting information directly from the structure of unlabeled data. In this study, we investigate the application of state-of-the-art self-supervised learning methods to bulk gene expression data for phenotype prediction. We selected three self-supervised methods, based on different approaches, to assess their ability to exploit the inherent structure of the data and to generate high-quality representations that can be used for downstream predictive tasks. Using several publicly available gene expression datasets, we demonstrate how the selected methods can effectively capture complex information and improve phenotype prediction accuracy. The results show that self-supervised learning methods can outperform traditional supervised models while also offering a significant advantage by reducing the dependency on annotated data. We provide a comprehensive analysis of the performance of each method, highlighting their strengths and limitations. We also provide recommendations for using these methods depending on the case under study. Finally, we outline future research directions to enhance the application of self-supervised learning in the field of gene expression data analysis. This study is the first work that deals with bulk RNA-Seq data and self-supervised learning.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.13912

Country: Europe > France (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D

Panagopoulou, Artemis, Xue, Le, Zhou, Honglu, Savarese, Silvio, Xu, Ran, Xiong, Caiming, Callison-Burch, Chris, Yatskar, Mark, Niebles, Juan Carlos

arXiv.org Artificial Intelligence Sep-17-2025

Real-world decision-making often begins with identifying which modality contains the most relevant information for a given query. While recent multimodal models have made impressive progress in processing diverse inputs, it remains unclear whether they can reason contrastively across multiple modalities to select the one that best satisfies a natural language prompt. We argue this capability is foundational, especially in retrieval-augmented and decision-time contexts, where systems must evaluate multiple signals and identify which one conveys the relevant information. To evaluate this skill, we introduce Contra4, a dataset for contrastive cross-modal reasoning across four modalities: image, audio, video, and 3D. Each example presents a natural language question alongside multiple candidate modality instances, and the model must select the one that semantically aligns with the prompt. Contra4 combines human-annotated captions with a mixture-of-models round-trip-consistency filter to ensure high-quality supervision, resulting in 174k training examples and a manually verified test set of 2.3k samples. While task-specific fine-tuning improves performance by 56% relative to baseline, state-of-the-art models still achieve only 56% absolute accuracy overall and 42% in four-modality settings, underscoring a significant limitation in current multimodal models.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.01275

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.93)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Deceptive Risk Minimization: Out-of-Distribution Generalization by Deceiving Distribution Shift Detectors

Majumdar, Anirudha

arXiv.org Artificial Intelligence Sep-16-2025

This paper proposes deception as a mechanism for out-of-distribution (OOD) generalization: by learning data representations that make training data appear independent and identically distributed (iid) to an observer, we can identify stable features that eliminate spurious correlations and generalize to unseen domains. We refer to this principle as deceptive risk minimization (DRM) and instantiate it with a practical differentiable objective that simultaneously learns features that eliminate distribution shifts from the perspective of a detector based on conformal martingales while minimizing a task-specific loss. In contrast to domain adaptation or prior invariant representation learning methods, DRM does not require access to test data or a partitioning of training data into a finite number of data-generating domains. We demonstrate the efficacy of DRM in numerical experiments with concept shift and in a simulated imitation learning setting with covariate shift in the environments in which a robot is deployed.

artificial intelligence, distribution shift, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.12081

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Learning from Uncertain Similarity and Unlabeled Data

Wei, Meng, Li, Zhongnian, Ying, Peng, Xu, Xinzheng

arXiv.org Artificial Intelligence Sep-16-2025

Existing similarity-based weakly supervised learning approaches often rely on precise similarity annotations between data pairs, which may inadvertently expose sensitive label information and raise privacy risks. To mitigate this issue, we propose Uncertain Similarity and Unlabeled Learning (USimUL), a novel framework in which each similarity pair is embedded with an uncertainty component to reduce label leakage. We propose an unbiased risk estimator that learns from uncertain similarity and unlabeled data, and we prove theoretically that the estimator achieves statistically optimal parametric convergence rates. Extensive experiments on both benchmark and real-world datasets show that our method achieves superior classification performance compared to conventional similarity-based approaches. Our source code is available at the anonymous link: https://anonymous.4open.science/r/USimUL-B337

artificial intelligence, inductive learning, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2509.11984

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
