AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning

Abdullah, Abdulhady Abas, Karim, Sarkhel H. Taher, Ahmed, Sara Azad, Tariq, Kanar R., Rashid, Tarik A.

arXiv.org Artificial Intelligence, Apr-29-2025

Speaker diarization, a core problem in speech processing, partitions an audio stream according to who is speaking. Although models for high-resource languages have progressed, low-resource languages such as Kurdish pose specific difficulties: very few annotated datasets are available, the language has multiple dialects, and speakers code-switch frequently. This study addresses these challenges by training the Wav2Vec 2.0 self-supervised learning (SSL) model on a Kurdish dataset prepared for this purpose. Transfer learning made it possible to adapt multilingual representations learned from other languages to the phonetic and acoustic features of Kurdish speech. Compared to the baseline algorithm, the overall Diarization Error Rate (DER) was reduced by 7.2% and cluster purity increased by 13%. These results show that improving state-of-the-art models can enhance performance for under-resourced languages. Applications include transcription services for Kurdish-language media programs, as well as speaker segmentation in multilingual call centers, teleconferencing, and videoconferencing systems. The work thus demonstrates that self-supervised and transfer learning techniques can improve speaker diarization for Kurdish and other low-resource languages with diverse features. The approach provides a base for building effective diarization systems in other understudied languages, which remains essential for equity in speech technology.
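For readers unfamiliar with the two metrics named in this abstract, the following is a minimal sketch of how they are commonly computed. It uses the standard definition of DER (missed speech plus false alarm plus speaker confusion, divided by total reference speech time) and a frame-level cluster purity. The function names and the toy numbers are hypothetical and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
from collections import Counter, defaultdict


def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER = (missed + false alarm + speaker confusion) / total reference speech.

    All arguments are durations in the same unit (for example, seconds).
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def cluster_purity(cluster_ids, speaker_ids):
    """Fraction of frames that belong to the majority speaker of their cluster."""
    if len(cluster_ids) != len(speaker_ids) or not cluster_ids:
        raise ValueError("inputs must be non-empty and of equal length")
    speakers_per_cluster = defaultdict(Counter)
    for cluster, speaker in zip(cluster_ids, speaker_ids):
        speakers_per_cluster[cluster][speaker] += 1
    majority_total = sum(max(c.values()) for c in speakers_per_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical durations in seconds, for illustration only.
der = diarization_error_rate(missed=3.0, false_alarm=2.0, confusion=5.0,
                             total_speech=100.0)

# Six frames: cluster 0 is mostly speaker "a", cluster 1 is all speaker "b",
# so 5 of the 6 frames agree with their cluster's majority speaker.
purity = cluster_purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "b"])
```

Lower DER and higher purity are better; the two are reported together because purity only measures how homogeneous the clusters are, not whether speech was detected at all.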

diarization, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2504.18582

Country: Asia > Middle East > Iraq > Kurdistan Region (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine (0.68)
Education (0.67)
Media (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation

Kerdreux, Thomas, Tuel, Alexandre, Febvre, Quentin, Mouche, Alexis, Chapron, Bertrand

arXiv.org Artificial Intelligence, Apr-29-2025

Self-supervised learning (SSL) has enabled the development of vision foundation models for Earth Observation (EO), demonstrating strong transferability across diverse remote sensing tasks. While prior work has focused on network architectures and training strategies, the role of dataset curation, especially in balancing and diversifying pre-training datasets, remains underexplored. In EO, this challenge is amplified by the redundancy and heavy-tailed distributions common in satellite imagery, which can lead to biased representations and inefficient training. In this work, we propose a dynamic dataset pruning strategy designed to improve SSL pre-training by maximizing dataset diversity and balance. Our method iteratively refines the training set without requiring a pre-existing feature extractor, making it well-suited for domains where curated datasets are limited or unavailable. We demonstrate our approach on the Sentinel-1 Wave Mode (WV) Synthetic Aperture Radar (SAR) archive, a challenging dataset dominated by ocean observations. We train models from scratch on the entire Sentinel-1 WV archive spanning 10 years. Across three downstream tasks, our results show that dynamic pruning improves both computational efficiency and representation quality, leading to stronger transferability. We also release the weights of OceanSAR-1, the first model in the OceanSAR family, a series of foundation models for ocean observation and analysis using SAR imagery, at github.com/galeio-research/OceanSAR-models/.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.06962

Country:

Europe > France (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)

SSLR: A Semi-Supervised Learning Method for Isolated Sign Language Recognition

Algafri, Hasan, Luqman, Hamzah, Alyami, Sarah, Laradji, Issam

arXiv.org Artificial Intelligence, Apr-24-2025

Sign language is the primary communication language for people with disabling hearing loss. Sign language recognition (SLR) systems aim to recognize sign gestures and translate them into spoken language. One of the main challenges in SLR is the scarcity of annotated datasets. To address this issue, we propose a semi-supervised learning (SSL) approach for SLR (SSLR), employing a pseudo-label method to annotate unlabeled samples. The sign gestures are represented using pose information that encodes the signer's skeletal joint points. This information is used as input for the Transformer backbone model utilized in the proposed approach. To demonstrate the learning capabilities of SSL across various labeled data sizes, several experiments were conducted using different percentages of labeled data with varying numbers of classes. The performance of the SSL approach was compared with a fully supervised learning-based model on the WLASL-100 dataset. In many cases, the SSL model outperformed the supervised learning-based model while using less labeled data.
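The pseudo-label step mentioned in this abstract can be illustrated with a short sketch. The abstract does not say how pseudo-labels are selected, so the confidence threshold below is an assumption (a common choice in semi-supervised learning), and `select_pseudo_labels` is a hypothetical helper rather than the authors' code.

```python
import numpy as np


def select_pseudo_labels(probs, threshold=0.9):
    """Keep unlabeled samples whose top predicted class probability is confident.

    probs: array of shape (n_samples, n_classes) with predicted probabilities.
    Returns the indices of the kept samples and their pseudo-labels.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)      # probability of the top class
    labels = probs.argmax(axis=1)       # index of the top class
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]


# Hypothetical model outputs for three unlabeled samples and three classes.
probs = [
    [0.95, 0.03, 0.02],   # confident: pseudo-labeled as class 0
    [0.40, 0.35, 0.25],   # uncertain: left unlabeled
    [0.05, 0.92, 0.03],   # confident: pseudo-labeled as class 1
]
kept_indices, pseudo_labels = select_pseudo_labels(probs, threshold=0.9)
```

The kept samples would then be added to the labeled pool for the next training round; a higher threshold trades fewer pseudo-labels for less label noise.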

artificial intelligence, language recognition, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2504.1664

Genre: Research Report (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.92)
Health & Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

A Self-supervised Learning Method for Raman Spectroscopy based on Masked Autoencoders

Ren, Pengju, Zhou, Ri-gui, Li, Yaochong

arXiv.org Artificial Intelligence, Apr-24-2025

Raman spectroscopy serves as a powerful and reliable tool for analyzing the chemical information of substances. The integration of Raman spectroscopy with deep learning methods enables rapid qualitative and quantitative analysis of materials. Most existing approaches adopt supervised learning methods. Although supervised learning has achieved satisfactory accuracy in spectral analysis, it is still constrained by costly and limited well-annotated spectral datasets for training. When spectral annotation is challenging or the amount of annotated data is insufficient, the performance of supervised learning in spectral material identification declines. In order to address the challenge of feature extraction from unannotated spectra, we propose a self-supervised learning paradigm for Raman Spectroscopy based on a Masked AutoEncoder, termed SMAE. SMAE does not require any spectral annotations during pre-training. By randomly masking and then reconstructing the spectral information, the model learns essential spectral features. The reconstructed spectra exhibit certain denoising properties, improving the signal-to-noise ratio (SNR) by more than twofold. Utilizing the network weights obtained from masked pre-training, SMAE achieves clustering accuracy of over 80% for 30 classes of isolated bacteria in a pathogenic bacterial dataset, demonstrating significant improvements compared to classical unsupervised methods and other state-of-the-art deep clustering methods. After fine-tuning the network with a limited amount of annotated data, SMAE achieves an identification accuracy of 83.90% on the test set, presenting competitive performance against the supervised ResNet (83.40%).
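The masking-and-reconstruction idea described in this abstract can be sketched as follows. This is only an illustration of the pretext task on a 1-D spectrum: the patch size, mask ratio, zero-filling of masked points, and the helper names are assumptions, and the encoder and decoder networks that would produce the reconstruction are omitted.

```python
import numpy as np


def random_mask(spectrum, mask_ratio=0.5, patch_size=10, seed=0):
    """Zero out randomly chosen contiguous patches of a 1-D spectrum.

    Returns the masked spectrum and a boolean array marking masked points.
    Points beyond the last full patch are never masked.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    n_patches = len(spectrum) // patch_size
    n_masked = int(round(mask_ratio * n_patches))
    rng = np.random.default_rng(seed)
    masked_patches = rng.choice(n_patches, size=n_masked, replace=False)
    mask = np.zeros(len(spectrum), dtype=bool)
    for p in masked_patches:
        mask[p * patch_size:(p + 1) * patch_size] = True
    return np.where(mask, 0.0, spectrum), mask


def masked_mse(reconstruction, target, mask):
    """Mean squared error computed only on the masked points."""
    diff = (np.asarray(reconstruction, dtype=float)
            - np.asarray(target, dtype=float))[mask]
    return float(np.mean(diff ** 2))


# Hypothetical spectrum: a single Gaussian peak on 100 wavenumber bins.
bins = np.arange(100)
spectrum = np.exp(-0.5 * ((bins - 40) / 5.0) ** 2)
masked_spectrum, mask = random_mask(spectrum, mask_ratio=0.5, patch_size=10)

# A model would be trained to minimize this loss; a perfect reconstruction
# gives zero, while returning the masked input does not.
loss_perfect = masked_mse(spectrum, spectrum, mask)
loss_trivial = masked_mse(masked_spectrum, spectrum, mask)
```

Scoring the loss only on masked points forces the model to infer hidden regions from their context, which is what lets pre-training proceed without any spectral annotations.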

artificial intelligence, inductive learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2504.1613

Country:

North America (0.46)
Asia > China (0.15)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

MAGIC: Near-Optimal Data Attribution for Deep Learning

Ilyas, Andrew, Engstrom, Logan

arXiv.org Machine Learning, Apr-23-2025

A fundamental problem when building machine learning systems is to predict counterfactuals about model behavior. For example, scaling laws [KMH+20; Has21; MRB+23] aim to predict the performance of systems trained with more data and more compute than is currently available; interpretability techniques [KWG+18] predict how models behave under counterfactual inputs. Analogously, in this work we study predictive data attribution (or datamodeling [IPE+22]), where the goal is to predict how a model would behave if it had been trained on a different dataset. This well-studied problem encompasses, e.g., estimating the effect (on the resulting trained model's predictions) of modifying a training example [KL17], removing a group of training examples [KAT+19; BNL+22; PGI+23], or adding entire training data sources [LSZ+24]. Predictive data attribution in large-scale settings is challenging: it requires simulating training a model on a different dataset without actually training [GWP+23; IGE+24]. In "classical" settings (when learning corresponds to minimizing a convex loss), statistical tools like the influence function [Ham47] allow us to accurately and efficiently estimate how different training data choices change trained model predictions [RM18; KAT+19; GSL+19]. However, in the non-convex settings that are ubiquitous in natural domains like language/vision, current methods are less effective. Indeed, the best existing methods produce estimates that typically (a) only moderately correlate with the ground truth [BPF21; BNL+22; PGI+23] and (b) incur large absolute error [BNL+22].

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Machine Learning

2504.1643

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Describe Anything: Detailed Localized Image and Video Captioning

Lian, Long, Ding, Yifan, Ge, Yunhao, Liu, Sifei, Mao, Hanzi, Li, Boyi, Pavone, Marco, Liu, Ming-Yu, Darrell, Trevor, Yala, Adam, Cui, Yin

arXiv.org Artificial Intelligence, Apr-23-2025

Generating detailed and accurate descriptions for specific regions in images and videos remains a fundamental challenge for vision-language models. We introduce the Describe Anything Model (DAM), a model designed for detailed localized captioning (DLC). DAM preserves both local details and global context through two key innovations: a focal prompt, which ensures high-resolution encoding of targeted regions, and a localized vision backbone, which integrates precise localization with its broader context. To tackle the scarcity of high-quality DLC data, we propose a Semi-supervised learning (SSL)-based Data Pipeline (DLC-SDP). DLC-SDP starts with existing segmentation datasets and expands to unlabeled web images using SSL. We introduce DLC-Bench, a benchmark designed to evaluate DLC without relying on reference captions. DAM sets a new state of the art on 7 benchmarks spanning keyword-level, phrase-level, and detailed multi-sentence localized image and video captioning.

caption, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2504.16072

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Invariant Learning with Annotation-free Environments

Le, Phuong Quynh, Seifert, Christin, Schlötterer, Jörg

arXiv.org Artificial Intelligence, Apr-23-2025

Invariant learning is a promising approach to improve domain generalization compared to Empirical Risk Minimization (ERM). However, most invariant learning methods rely on the assumption that training examples are pre-partitioned into different known environments. We instead infer environments without the need for additional annotations, motivated by observations of the properties within the representation space of a trained ERM model. We show the preliminary effectiveness of our approach on the ColoredMNIST benchmark, achieving performance comparable to methods requiring explicit environment labels and on par with an annotation-free method that poses strong restrictions on the ERM reference model.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.15686

Genre: Research Report (0.70)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D

Arnaud, Sergio, McVay, Paul, Martin, Ada, Majumdar, Arjun, Jatavallabhula, Krishna Murthy, Thomas, Phillip, Partsey, Ruslan, Dugas, Daniel, Gejji, Abha, Sax, Alexander, Berges, Vincent-Pierre, Henaff, Mikael, Jain, Ayush, Cao, Ang, Prasad, Ishita, Kalakrishnan, Mrinal, Rabbat, Michael, Ballas, Nicolas, Assran, Mido, Maksymets, Oleksandr, Rajeswaran, Aravind, Meier, Franziska

arXiv.org Artificial Intelligence, Apr-22-2025

We present LOCATE 3D, a model for localizing objects in 3D scenes from referring expressions like "the small coffee table between the sofa and the lamp." LOCATE 3D sets a new state-of-the-art on standard referential grounding benchmarks and showcases robust generalization capabilities. Notably, LOCATE 3D operates directly on sensor observation streams (posed RGB-D frames), enabling real-world deployment on robots and AR devices. Key to our approach is 3D-JEPA, a novel self-supervised learning (SSL) algorithm applicable to sensor point clouds. It takes as input a 3D pointcloud featurized using 2D foundation models (CLIP, DINO). Subsequently, masked prediction in latent space is employed as a pretext task to aid the self-supervised learning of contextualized pointcloud features. Once trained, the 3D-JEPA encoder is finetuned alongside a language-conditioned decoder to jointly predict 3D masks and bounding boxes. Additionally, we introduce LOCATE 3D DATASET, a new dataset for 3D referential grounding, spanning multiple capture setups with over 130K annotations. This enables a systematic study of generalization capabilities as well as a stronger model.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2504.14151

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Appliances & Durable Goods (0.34)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)

Spectral Algorithms under Covariate Shift

Fan, Jun, Guo, Zheng-Chu, Shi, Lei

arXiv.org Machine Learning, Apr-17-2025

Spectral algorithms leverage spectral regularization techniques to analyze and process data, providing a flexible framework for addressing supervised learning problems. To deepen our understanding of their performance in real-world scenarios where the distributions of training and test data may differ, we conduct a rigorous investigation into the convergence behavior of spectral algorithms under distribution shifts, specifically within the framework of reproducing kernel Hilbert spaces. Our study focuses on the case of covariate shift. In this scenario, the marginal distributions of the input data differ between the training and test datasets, while the conditional distribution of the output given the input remains unchanged. Under this setting, we analyze the generalization error of spectral algorithms and show that they achieve minimax optimality when the density ratios between the training and test distributions are uniformly bounded. However, we also identify a critical limitation: when the density ratios are unbounded, the spectral algorithms may become suboptimal. To address this limitation, we propose a weighted spectral algorithm that incorporates density ratio information into the learning process. Our theoretical analysis shows that this weighted approach achieves optimal capacity-independent convergence rates. Furthermore, by introducing a weight clipping technique, we demonstrate that the convergence rates of the weighted spectral algorithm can approach the optimal capacity-dependent convergence rates arbitrarily closely. This improvement resolves the suboptimality issue in unbounded density ratio scenarios and advances the state-of-the-art by refining existing theoretical results.
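As a rough illustration of the weighted approach with weight clipping described in this abstract, the sketch below uses kernel ridge regression, which is one member of the spectral-algorithm family (Tikhonov regularization). It is not the paper's algorithm: the Gaussian kernel, the toy distributions, the clipping level, and all function names are assumptions chosen to keep the example small. Minimizing the weighted empirical squared loss plus an RKHS-norm penalty leads to the linear system solved in `weighted_krr_fit`.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between two 1-D arrays of inputs."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def weighted_krr_fit(x, y, weights, lam=1e-2, clip=None, bandwidth=1.0):
    """Solve (W K + n * lam * I) alpha = W y, where W = diag(weights).

    If clip is given, each weight is truncated to min(weight, clip).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    if clip is not None:
        w = np.minimum(w, clip)
    n = len(y)
    K = gaussian_kernel(x, x, bandwidth)
    return np.linalg.solve(w[:, None] * K + n * lam * np.eye(n), w * y)


def krr_predict(x_train, alpha, x_new, bandwidth=1.0):
    """Evaluate the fitted function at new inputs."""
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha


# Hypothetical toy setup: training inputs ~ N(0, 1), test inputs ~ N(1, 1).
# The test/train density ratio is then exp(x - 0.5), which is unbounded in x.
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=50)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=50)
ratio = np.exp(x_train - 0.5)

# Clipping caps the influence of the few points with very large ratios.
alpha = weighted_krr_fit(x_train, y_train, ratio, lam=1e-2, clip=5.0)
predictions = krr_predict(x_train, alpha, np.array([0.0, 1.0, 2.0]))
```

With all weights equal to one, the system reduces to ordinary kernel ridge regression, so the weighting changes only how strongly each training point's residual counts toward the fit.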

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Machine Learning

2504.12625

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)

An Empirically Grounded Identifiability Theory Will Accelerate Self-Supervised Learning Research

Reizinger, Patrik, Balestriero, Randall, Klindt, David, Brendel, Wieland

arXiv.org Machine Learning, Apr-17-2025

Self-Supervised Learning (SSL) powers many current AI systems. As research interest and investment grow, the SSL design space continues to expand. The Platonic view of SSL, following the Platonic Representation Hypothesis (PRH), suggests that despite different methods and engineering approaches, all representations converge to the same Platonic ideal. However, this phenomenon lacks precise theoretical explanation. By synthesizing evidence from Identifiability Theory (IT), we show that the PRH can emerge in SSL. However, current IT cannot explain SSL's empirical success. To bridge the gap between theory and practice, we propose expanding IT into what we term Singular Identifiability Theory (SITh), a broader theoretical framework encompassing the entire SSL pipeline. SITh would allow deeper insights into the implicit data assumptions in SSL and advance the field towards learning more interpretable and generalizable representations. We highlight three critical directions for future research: 1) training dynamics and convergence properties of SSL; 2) the impact of finite samples, batch size, and data diversity; and 3) the role of inductive biases in architecture, augmentations, initialization schemes, and optimizers.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2504.13101

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Rhode Island (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)
