AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Describe Anything: Detailed Localized Image and Video Captioning

Lian, Long, Ding, Yifan, Ge, Yunhao, Liu, Sifei, Mao, Hanzi, Li, Boyi, Pavone, Marco, Liu, Ming-Yu, Darrell, Trevor, Yala, Adam, Cui, Yin

arXiv.org Artificial Intelligence Apr-23-2025

Generating detailed and accurate descriptions for specific regions in images and videos remains a fundamental challenge for vision-language models. We introduce the Describe Anything Model (DAM), a model designed for detailed localized captioning (DLC). DAM preserves both local details and global context through two key innovations: a focal prompt, which ensures high-resolution encoding of targeted regions, and a localized vision backbone, which integrates precise localization with its broader context. To tackle the scarcity of high-quality DLC data, we propose a Semi-supervised learning (SSL)-based Data Pipeline (DLC-SDP). DLC-SDP starts with existing segmentation datasets and expands to unlabeled web images using SSL. We introduce DLC-Bench, a benchmark designed to evaluate DLC without relying on reference captions. DAM sets new state-of-the-art on 7 benchmarks spanning keyword-level, phrase-level, and detailed multi-sentence localized image and video captioning.

caption, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2504.16072

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Invariant Learning with Annotation-free Environments

Le, Phuong Quynh, Seifert, Christin, Schlötterer, Jörg

arXiv.org Artificial Intelligence Apr-23-2025

Invariant learning is a promising approach to improve domain generalization compared to Empirical Risk Minimization (ERM). However, most invariant learning methods rely on the assumption that training examples are pre-partitioned into different known environments. We instead infer environments without the need for additional annotations, motivated by observations of the properties within the representation space of a trained ERM model. We show the preliminary effectiveness of our approach on the ColoredMNIST benchmark, achieving performance comparable to methods requiring explicit environment labels and on par with an annotation-free method that imposes strong restrictions on the ERM reference model.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.15686

Genre: Research Report (0.70)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D

Arnaud, Sergio, McVay, Paul, Martin, Ada, Majumdar, Arjun, Jatavallabhula, Krishna Murthy, Thomas, Phillip, Partsey, Ruslan, Dugas, Daniel, Gejji, Abha, Sax, Alexander, Berges, Vincent-Pierre, Henaff, Mikael, Jain, Ayush, Cao, Ang, Prasad, Ishita, Kalakrishnan, Mrinal, Rabbat, Michael, Ballas, Nicolas, Assran, Mido, Maksymets, Oleksandr, Rajeswaran, Aravind, Meier, Franziska

arXiv.org Artificial Intelligence Apr-22-2025

We present LOCATE 3D, a model for localizing objects in 3D scenes from referring expressions like "the small coffee table between the sofa and the lamp." LOCATE 3D sets a new state-of-the-art on standard referential grounding benchmarks and showcases robust generalization capabilities. Notably, LOCATE 3D operates directly on sensor observation streams (posed RGB-D frames), enabling real-world deployment on robots and AR devices. Key to our approach is 3D-JEPA, a novel self-supervised learning (SSL) algorithm applicable to sensor point clouds. It takes as input a 3D pointcloud featurized using 2D foundation models (CLIP, DINO). Subsequently, masked prediction in latent space is employed as a pretext task to aid the self-supervised learning of contextualized pointcloud features. Once trained, the 3D-JEPA encoder is finetuned alongside a language-conditioned decoder to jointly predict 3D masks and bounding boxes. Additionally, we introduce LOCATE 3D DATASET, a new dataset for 3D referential grounding, spanning multiple capture setups with over 130K annotations. This enables a systematic study of generalization capabilities as well as a stronger model.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2504.14151

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Appliances & Durable Goods (0.34)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)

Spectral Algorithms under Covariate Shift

Fan, Jun, Guo, Zheng-Chu, Shi, Lei

arXiv.org Machine Learning Apr-17-2025

Spectral algorithms leverage spectral regularization techniques to analyze and process data, providing a flexible framework for addressing supervised learning problems. To deepen our understanding of their performance in real-world scenarios where the distributions of training and test data may differ, we conduct a rigorous investigation into the convergence behavior of spectral algorithms under distribution shifts, specifically within the framework of reproducing kernel Hilbert spaces. Our study focuses on the case of covariate shift. In this scenario, the marginal distributions of the input data differ between the training and test datasets, while the conditional distribution of the output given the input remains unchanged. Under this setting, we analyze the generalization error of spectral algorithms and show that they achieve minimax optimality when the density ratios between the training and test distributions are uniformly bounded. However, we also identify a critical limitation: when the density ratios are unbounded, the spectral algorithms may become suboptimal. To address this limitation, we propose a weighted spectral algorithm that incorporates density ratio information into the learning process. Our theoretical analysis shows that this weighted approach achieves optimal capacity-independent convergence rates. Furthermore, by introducing a weight clipping technique, we demonstrate that the convergence rates of the weighted spectral algorithm can approach the optimal capacity-dependent convergence rates arbitrarily closely. This improvement resolves the suboptimality issue in unbounded density ratio scenarios and advances the state-of-the-art by refining existing theoretical results.
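The weighting and clipping idea in this abstract can be illustrated with a minimal sketch. It uses kernel ridge regression (Tikhonov regularization, one member of the spectral algorithm family) as a stand-in and is not the authors' estimator. The Gaussian kernel, the toy shift from a training density N(0, 1) to a test density N(1, 1), the clipping threshold of 3.0, and all function names are assumptions made for illustration. Under that toy shift the density ratio is exp(x - 0.5), which is unbounded, so clipping has a visible effect.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def density_ratio(x):
    """Assumed toy shift: test N(1, 1) over training N(0, 1) gives exp(x - 0.5)."""
    return math.exp(x - 0.5)


def clip_weights(weights, threshold):
    """Cap each importance weight at the threshold."""
    return [min(w, threshold) for w in weights]


def fit_weighted_krr(xs, ys, weights, lam=1e-2, bandwidth=1.0):
    """Minimize (1/n) sum_i w_i (f(x_i) - y_i)^2 + lam * ||f||^2.

    The coefficients solve (W K + n * lam * I) alpha = W y.
    """
    n = len(xs)
    A = [
        [
            weights[i] * gaussian_kernel(xs[i], xs[j], bandwidth)
            + (n * lam if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    b = [weights[i] * ys[i] for i in range(n)]
    return solve(A, b)


def predict(xs_train, alpha, x, bandwidth=1.0):
    """Evaluate the fitted function at a new point x."""
    return sum(a * gaussian_kernel(xt, x, bandwidth) for a, xt in zip(alpha, xs_train))


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(40)]                # training inputs
ys = [math.sin(x) + 0.1 * rng.gauss(0.0, 1.0) for x in xs]   # noisy targets
raw = [density_ratio(x) for x in xs]
clipped = clip_weights(raw, threshold=3.0)
alpha = fit_weighted_krr(xs, ys, clipped)
```

Calling `predict(xs, alpha, 1.0)` evaluates the fit near the centre of the assumed test distribution. Passing `raw` instead of `clipped` gives the unclipped weighted fit for comparison.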

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Machine Learning

2504.12625

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)

An Empirically Grounded Identifiability Theory Will Accelerate Self-Supervised Learning Research

Reizinger, Patrik, Balestriero, Randall, Klindt, David, Brendel, Wieland

arXiv.org Machine Learning Apr-17-2025

Self-Supervised Learning (SSL) powers many current AI systems. As research interest and investment grow, the SSL design space continues to expand. The Platonic view of SSL, following the Platonic Representation Hypothesis (PRH), suggests that despite different methods and engineering approaches, all representations converge to the same Platonic ideal. However, this phenomenon lacks precise theoretical explanation. By synthesizing evidence from Identifiability Theory (IT), we show that the PRH can emerge in SSL. However, current IT cannot explain SSL's empirical success. To bridge the gap between theory and practice, we propose expanding IT into what we term Singular Identifiability Theory (SITh), a broader theoretical framework encompassing the entire SSL pipeline. SITh would allow deeper insights into the implicit data assumptions in SSL and advance the field towards learning more interpretable and generalizable representations. We highlight three critical directions for future research: 1) training dynamics and convergence properties of SSL; 2) the impact of finite samples, batch size, and data diversity; and 3) the role of inductive biases in architecture, augmentations, initialization schemes, and optimizers.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2504.13101

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Rhode Island (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Balancing Graph Embedding Smoothness in Self-Supervised Learning via Information-Theoretic Decomposition

Jung, Heesoo, Park, Hogun

arXiv.org Artificial Intelligence Apr-17-2025

Self-supervised learning (SSL) in graphs has garnered significant attention, particularly in employing Graph Neural Networks (GNNs) with pretext tasks initially designed for other domains, such as contrastive learning and feature reconstruction. However, it remains uncertain whether these methods effectively reflect essential graph properties, specifically the similarity of a node's representation to those of its neighbors. We observe that existing methods sit at opposite ends of a spectrum driven by the graph embedding smoothness, with each end corresponding to stronger performance on specific downstream tasks. Decomposing the SSL objective into three terms via an information-theoretic framework with a neighbor representation variable reveals that this polarization stems from an imbalance among the terms, which existing methods may not effectively maintain. Further insights suggest that balancing between the extremes can lead to improved performance across a wider range of downstream tasks. A framework, BSG (Balancing Smoothness in Graph SSL), introduces novel loss functions designed to supplement the representation quality in graph-based SSL by balancing the three derived terms: neighbor loss, minimal loss, and divergence loss. We present a theoretical analysis of the effects of these loss functions, highlighting their significance from both the SSL and graph smoothness perspectives. Extensive experiments on multiple real-world datasets across node classification and link prediction consistently demonstrate that BSG achieves state-of-the-art performance, outperforming existing methods. Our implementation code is available at https://github.com/steve30572/BSG.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3696410.3714611

2504.12011

Country: Oceania > Australia (0.16)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)

Independence Is Not an Issue in Neurosymbolic AI

Faronius, Håkan Karlsson, Martires, Pedro Zuidberg Dos

arXiv.org Artificial Intelligence Apr-17-2025

A popular approach to neurosymbolic AI is to take the output of the last layer of a neural network, e.g. a softmax activation, and pass it through a sparse computation graph encoding certain logical constraints one wishes to enforce. This induces a probability distribution over a set of random variables, which happen to be conditionally independent of each other in many commonly used neurosymbolic AI models. Such conditionally independent random variables have been deemed harmful as their presence has been observed to co-occur with a phenomenon dubbed deterministic bias, where systems learn to deterministically prefer one of the valid solutions from the solution space over the others. We provide evidence contesting this conclusion and show that the phenomenon of deterministic bias is an artifact of improperly applying neurosymbolic AI. Keywords: neurosymbolic AI, partial label learning. Introduction: Neurosymbolic (NeSy) AI is an approach to AI which seeks to combine logic and neural networks [13].
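The setup described in this abstract, independent output probabilities passed through a logical constraint, can be sketched by brute-force enumeration. This is a generic illustration under the independence assumption and is not the paper's model. The function names and the "exactly one variable is true" constraint are hypothetical, and the negative log probability used here is the common semantic-loss form named in the entry's keywords.

```python
import itertools
import math


def constraint_probability(probs, constraint):
    """Probability that independent Bernoulli variables satisfy a constraint.

    probs[i] is P(variable i is True); constraint maps a tuple of booleans
    to True or False. All 2^n assignments are enumerated, so this is only
    practical for a handful of variables.
    """
    total = 0.0
    for assignment in itertools.product([False, True], repeat=len(probs)):
        if constraint(assignment):
            weight = 1.0
            for p, value in zip(probs, assignment):
                weight *= p if value else 1.0 - p
            total += weight
    return total


def semantic_loss(probs, constraint):
    """Negative log probability of satisfying the constraint."""
    return -math.log(constraint_probability(probs, constraint))


def exactly_one(assignment):
    """Hypothetical constraint: exactly one variable is True."""
    return sum(assignment) == 1


uniform = constraint_probability([0.5, 0.5], exactly_one)
loss_first = semantic_loss([1.0, 0.0], exactly_one)
loss_second = semantic_loss([0.0, 1.0], exactly_one)
```

In this toy case both deterministic assignments reach zero loss while the uniform prediction does not, which shows why a learner under independence has several equally good deterministic solutions to choose from.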

artificial intelligence, machine learning, semantic loss, (15 more...)

arXiv.org Artificial Intelligence

2504.07851

Country: Europe (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Watch: New speed climbing record set in the Swiss Alps

BBC News Apr-16-2025, 01:45:20 GMT

A Swiss and Austrian climbing pair have shattered the speed record for completing the daunting north faces of a famed trio of Swiss mountains - the Eiger, Mönch and Jungfrau. Switzerland's Nicolas Hojac and Austria's Philipp Brugger shaved nearly ten hours off the previous record set more than two decades ago.

new speed, swiss alp

BBC News

Country:

Europe > Switzerland (0.43)
Europe > Austria (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.79)

Respiratory Inhaler Sound Event Classification Using Self-Supervised Learning

Panah, Davoud Shariat, Franciosi, Alessandro N, McCarthy, Cormac, Hines, Andrew

arXiv.org Artificial Intelligence Apr-16-2025

Asthma is a chronic respiratory condition that affects millions of people worldwide. While this condition can be managed by administering controller medications through handheld inhalers, clinical studies have shown low adherence to the correct inhaler usage technique. Consequently, many patients may not receive the full benefit of their medication. Automated classification of inhaler sounds has recently been studied to assess medication adherence. However, the existing classification models were typically trained using data from specific inhaler types, and their ability to generalize to sounds from different inhalers remains unexplored. In this study, we adapted the wav2vec 2.0 self-supervised learning model for inhaler sound classification by pre-training and fine-tuning this model on inhaler sounds. The proposed model shows a balanced accuracy of 98% on a dataset collected using a dry powder inhaler and smartwatch device. The results also demonstrate that re-finetuning this model on minimal data from a target inhaler is a promising approach to adapting a generic inhaler sound classification model to a different inhaler device and audio capture hardware. This is the first study in the field to demonstrate the potential of smartwatches as assistive technologies for the personalized monitoring of inhaler adherence using machine learning models.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.11246

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases > Asthma (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation

Lukasik, Michal, Chen, Lin, Narasimhan, Harikrishna, Menon, Aditya Krishna, Jitkrittum, Wittawat, Yu, Felix X., Reddi, Sashank J., Fu, Gang, Bateni, Mohammadhossein, Kumar, Sanjiv

arXiv.org Machine Learning Apr-15-2025

Bipartite ranking is a fundamental supervised learning problem, with the goal of learning a ranking over instances with maximal area under the ROC curve (AUC) against a single binary target label. However, one may often observe multiple binary target labels, e.g., from distinct human annotators. How can one synthesize such labels into a single coherent ranking? In this work, we formally analyze two approaches to this problem -- loss aggregation and label aggregation -- by characterizing their Bayes-optimal solutions. Based on this, we show that while both methods can yield Pareto-optimal solutions, loss aggregation can exhibit label dictatorship: one can inadvertently (and undesirably) favor one label over others. This suggests that label aggregation can be preferable to loss aggregation, which we empirically verify.
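The two views in this abstract can be illustrated by evaluating one fixed ranking against several annotators. The sketch below computes pairwise AUC per label, averages it across labels (a simple stand-in for an aggregated loss), and compares it with AUC against a single aggregated label. Majority voting as the aggregation rule, the toy scores and labels, and all function names are assumptions for illustration; the paper's formal definitions may differ.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc(scores, label_sets):
    """Average the per-label AUCs (evaluate each label, then aggregate)."""
    return sum(auc(scores, labels) for labels in label_sets) / len(label_sets)


def majority_vote(label_sets):
    """Aggregate labels first: 1 if more than half of the annotators say 1."""
    k = len(label_sets)
    return [1 if 2 * sum(votes) > k else 0 for votes in zip(*label_sets)]


# Toy data: four items scored by one model and labelled by three annotators.
scores = [0.9, 0.8, 0.4, 0.2]
annotator_a = [1, 1, 0, 0]
annotator_b = [1, 0, 1, 0]
annotator_c = [1, 1, 0, 0]
label_sets = [annotator_a, annotator_b, annotator_c]

per_label = [auc(scores, labels) for labels in label_sets]   # [1.0, 0.75, 1.0]
averaged = mean_auc(scores, label_sets)                      # 11/12
aggregated = auc(scores, majority_vote(label_sets))          # 1.0
```

Here the ranking is perfect against the majority label but not against annotator b alone, so the two summaries disagree even on four items.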

aggregation, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2504.11284

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)
