AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Feature-Aware Noise Contrastive Learning For Unsupervised Red Panda Re-Identification

arXiv.org Artificial Intelligence, May-1-2024

To facilitate the re-identification (Re-ID) of individual animals, existing methods primarily focus on maximizing feature similarity within the same individual and enhancing distinctiveness between different individuals. However, most of them still rely on supervised learning and require substantial labeled data, which is challenging to obtain. To avoid this issue, we propose a Feature-Aware Noise Contrastive Learning (FANCL) method to explore an unsupervised learning solution, which is then validated on the task of red panda Re-ID. FANCL employs a Feature-Aware Noise Addition module to produce noised images that conceal critical features and designs two contrastive learning modules to calculate the losses. First, a feature consistency module is designed to bridge the gap between the original and noised features. Second, the neural networks are trained through a cluster contrastive learning module. Through these more challenging learning tasks, FANCL can adaptively extract deeper representations of red pandas. The experimental results on a set of red panda images collected in both indoor and outdoor environments show that FANCL outperforms several related state-of-the-art unsupervised methods, achieving performance comparable to supervised learning methods.

proceedings, recognition, red panda, (13 more...)

2405.00468

Country:

Asia > China > Sichuan Province > Chengdu (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
Asia > China > Tibet Autonomous Region (0.04)
(2 more...)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.75)

Robust Semi-supervised Learning via f-Divergence and α-Rényi Divergence

Aminian, Gholamali, Bagheri, Amirhossien, JafariNodeh, Mahyar, Karimian, Radmehr, Yassaee, Mohammad-Hossein

arXiv.org Machine Learning, May-1-2024

This paper investigates a range of empirical risk functions and regularization methods suitable for self-training methods in semi-supervised learning. These approaches draw inspiration from various divergence measures, such as f-divergences and α-Rényi divergences. Building on the theoretical foundations of these divergences, we also provide insights to enhance the understanding of our empirical risk functions and regularization techniques. In pseudo-labeling and entropy minimization, used as self-training methods for semi-supervised learning, there is an inherent mismatch between the true labels and the pseudo-labels (noisy pseudo-labels), and some of our empirical risk functions are robust to noisy pseudo-labels. Under some conditions, our empirical risk functions demonstrate better performance than traditional self-training methods.
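For readers unfamiliar with the two divergence families named in this abstract, the following is a minimal sketch of their standard textbook definitions for discrete distributions. It is not the paper's empirical risk functions; the function names and the example distributions `p` and `q` are illustrative assumptions, and every entry of `q` is assumed to be strictly positive.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i), for convex f with f(1) = 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def renyi_divergence(p, q, alpha):
    """D_alpha(P || Q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1)."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return math.log(total) / (alpha - 1)

# Hypothetical example distributions (not taken from the paper).
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

# Choosing f(t) = t * log(t) recovers the KL divergence.
kl_as_f = f_divergence(p, q, lambda t: t * math.log(t) if t > 0 else 0.0)

# As alpha approaches 1, the Renyi divergence approaches the KL divergence.
renyi_near_one = renyi_divergence(p, q, 1.0001)
```

The Rényi divergence is non-decreasing in alpha, so varying alpha gives a family of measures that bracket the KL divergence from below and above.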

data sample, divergence, scenario, (15 more...)

2405.00454

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

On Improving the Algorithm-, Model-, and Data-Efficiency of Self-Supervised Learning

Cao, Yun-Hao, Wu, Jianxin

arXiv.org Artificial Intelligence, Apr-30-2024

Self-supervised learning (SSL) has developed rapidly in recent years. However, most of the mainstream methods are computationally expensive and rely on two (or more) augmentations for each image to construct positive pairs. Moreover, they mainly focus on large models and large-scale datasets, which lack flexibility and feasibility in many practical applications. In this paper, we propose an efficient single-branch SSL method based on non-parametric instance discrimination, aiming to improve the algorithm, model, and data efficiency of SSL. By analyzing the gradient formula, we correct the update rule of the memory bank with improved performance. We further propose a novel self-distillation loss that minimizes the KL divergence between the probability distribution and its square root version. We show that this alleviates the infrequent updating problem in instance discrimination and greatly accelerates convergence. We systematically compare the training overhead and performance of different methods in different scales of data, and under different backbones. Experimental results show that our method outperforms various baselines with significantly less overhead, and is especially effective for limited amounts of data and small models.
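The self-distillation idea mentioned in this abstract, a KL divergence between a probability distribution and its square root version, can be illustrated with a small sketch. The abstract does not give the exact formulation, so the renormalization of the square roots, the direction of the KL term, and the example `logits` are all assumptions made here for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sqrt_version(p):
    """Square root of each probability, renormalized to sum to 1 (assumed form)."""
    roots = [math.sqrt(pi) for pi in p]
    total = sum(roots)
    return [r / total for r in roots]

def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical similarity scores of one image against four memory-bank entries.
logits = [4.0, 1.0, 0.5, 0.0]
p = softmax(logits)
s = sqrt_version(p)

# Assumed direction: KL from the flattened target s to the prediction p.
loss = kl_divergence(s, p)
```

Under this assumed form, the square root version equals a softmax of the same scores at temperature 2, so it is a flatter distribution that places more mass on the less likely instances.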

computer vision, efficiency, representation, (14 more...)

2404.19289

Country:

North America > United States > Indiana > Marion County > Lawrence (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.61)

Semi-Supervised Hierarchical Multi-Label Classifier Based on Local Information

Serrano-Pérez, Jonathan, Sucar, L. Enrique

arXiv.org Artificial Intelligence, Apr-30-2024

Scarcity of labeled data is a common problem in supervised classification, since hand-labeling can be time-consuming, expensive, or difficult; on the other hand, large amounts of unlabeled data can be found. The problem is even more pronounced in hierarchical classification, because the data of a node is split among its children, which leaves few instances associated with the deepest nodes of the hierarchy. This work proposes the semi-supervised hierarchical multi-label classifier based on local information (SSHMC-BLI), which can be trained with labeled and unlabeled data to perform hierarchical classification tasks. The method can be applied to any type of hierarchical problem; here we focus on the most difficult case: hierarchies of DAG type, where instances can be associated with multiple paths of labels, which can end in an internal node. SSHMC-BLI builds pseudo-labels for each unlabeled instance from the paths of labels of its labeled neighbors, while it considers whether the unlabeled instance is similar to its neighbors. Experiments on 12 challenging datasets from functional genomics show that using unlabeled data along with labeled data can improve, with statistical significance, the performance of a supervised hierarchical classifier trained only on labeled data.
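The neighbor-based pseudo-labeling step described in this abstract can be pictured with a toy sketch. This is not the SSHMC-BLI algorithm itself: the Euclidean distance, the vote threshold `min_votes`, the similarity cutoff `max_dist`, and the example data are all hypothetical choices made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pseudo_label(x, labeled, k=3, max_dist=1.0, min_votes=2):
    """Return a set of labels for x, or None if x is not similar to its neighbors.

    labeled: list of (features, label_set) pairs; each label_set holds every
    node on the instance's label paths, ancestors included.
    """
    neighbors = sorted(labeled, key=lambda item: euclidean(x, item[0]))[:k]
    # Skip instances that are far from any of their nearest labeled neighbors.
    if any(euclidean(x, feats) > max_dist for feats, _ in neighbors):
        return None
    votes = {}
    for _, labels in neighbors:
        for label in labels:
            votes[label] = votes.get(label, 0) + 1
    return {label for label, count in votes.items() if count >= min_votes}

# Hypothetical labeled set over a tiny hierarchy: root -> A -> {A1, A2}, root -> B.
labeled = [
    ((0.0, 0.0), {"root", "A", "A1"}),
    ((0.1, 0.0), {"root", "A", "A2"}),
    ((0.0, 0.2), {"root", "A", "A1"}),
    ((5.0, 5.0), {"root", "B"}),
]

near = pseudo_label((0.05, 0.05), labeled)  # close to the three "A" instances
far = pseudo_label((2.5, 2.5), labeled)     # not similar to any neighbor
```

Because each label set includes its ancestors, an ancestor always collects at least as many votes as its descendants, so the voted set stays consistent with the hierarchy.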

classification, classifier, hierarchy, (16 more...)

2405.00184

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

Foundations of Multisensory Artificial Intelligence

Liang, Paul Pu

arXiv.org Artificial Intelligence, Apr-29-2024

Building multisensory AI systems that learn from multiple sensory inputs such as text, speech, video, real-world sensors, wearable devices, and medical data holds great promise for impact in many scientific areas with practical benefits, such as in supporting human health and well-being, enabling multimedia content processing, and enhancing real-world autonomous agents. By synthesizing a range of theoretical frameworks and application domains, this thesis aims to advance the machine learning foundations of multisensory AI. In the first part, we present a theoretical framework formalizing how modalities interact with each other to give rise to new information for a task. These interactions are the basic building blocks in all multimodal problems, and their quantification enables users to understand their multimodal datasets, design principled approaches to learn these interactions, and analyze whether their model has succeeded in learning. In the second part, we study the design of practical multimodal foundation models that generalize over many modalities and tasks, which presents a step toward grounding large language models to real-world sensory modalities. We introduce MultiBench, a unified large-scale benchmark across a wide range of modalities, tasks, and research areas, followed by the cross-modal attention and multimodal transformer architectures that now underpin many of today's multimodal foundation models. Scaling these architectures on MultiBench enables the creation of general-purpose multisensory AI systems, and we discuss our collaborative efforts in applying these models for real-world impact in affective computing, mental health, cancer prognosis, and robotics. Finally, we conclude this thesis by discussing how future work can leverage these ideas toward more general, interactive, and safe multisensory AI.

introduce optimization and generalization error, multimodal interaction and information theory, task-relevant and remove task-irrelevant information, (17 more...)

2404.18976

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
North America > United States > California > San Francisco County > San Francisco (0.13)
North America > United States > New York > New York County > New York City (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Health Care Technology (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Data Science > Data Mining (1.00)
(19 more...)

Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods

Yu, Haeun, Atanasova, Pepa, Augenstein, Isabelle

arXiv.org Artificial Intelligence, Apr-29-2024

Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. The increasing scalability of LMs, however, poses significant challenges for understanding a model's inner workings and further for updating or correcting this embedded knowledge without the significant cost of retraining. This underscores the importance of unveiling exactly what knowledge is stored and its association with specific model components. Instance Attribution (IA) and Neuron Attribution (NA) offer insights into this training-acquired knowledge, though they have not been compared systematically. Our study introduces a novel evaluation framework to quantify and compare the knowledge revealed by IA and NA. To align the results of the methods we introduce the attribution method NA-Instances to apply NA for retrieving influential training instances, and IA-Neurons to discover important neurons of influential instances discovered by IA. We further propose a comprehensive list of faithfulness tests to evaluate the comprehensiveness and sufficiency of the explanations provided by both methods. Through extensive experiments and analysis, we demonstrate that NA generally reveals more diverse and comprehensive information regarding the LM's parametric knowledge compared to IA. Nevertheless, IA provides unique and valuable insights into the LM's parametric knowledge, which are not revealed by NA. Our findings further suggest the potential of a synergistic approach of combining the diverse findings of IA and NA for a more holistic understanding of an LM's parametric knowledge.

dataset, knowledge, neuron, (13 more...)

2404.18655

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(5 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Computational Job Market Analysis with Natural Language Processing

Zhang, Mike

arXiv.org Artificial Intelligence, Apr-29-2024

[Abridged Abstract] Recent technological advances underscore labor market dynamics, yielding significant consequences for employment prospects and increasing job vacancy data across platforms and languages. Aggregating such data holds potential for valuable insights into labor market demands, new skills emergence, and facilitating job matching for various stakeholders. However, despite prevalent insights in the private sector, transparent language technology systems and data for this domain are lacking. This thesis investigates Natural Language Processing (NLP) technology for extracting relevant information from job descriptions, identifying challenges including scarcity of training data, lack of standardized annotation guidelines, and shortage of effective extraction methods from job ads. We frame the problem, obtain annotated data, and introduce extraction methodologies. Our contributions include job description datasets, a de-identification dataset, and a novel active learning algorithm for efficient model training. We propose skill extraction using weak supervision, a taxonomy-aware pre-training methodology adapting multilingual language models to the job market domain, and a retrieval-augmented model leveraging multiple skill extraction datasets to enhance overall performance. Finally, we ground extracted information within a designated taxonomy.

computational job market analysis, nearest neighbor occupational skill extraction, qualification and occupation taxonomy, (17 more...)

2404.18977

Country:

North America > United States > California > San Francisco County > San Francisco (0.27)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.27)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(47 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (0.92)
Research Report > Experimental Study (0.92)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance > Economy (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(7 more...)

One-Shot Image Restoration

Pereg, Deborah

arXiv.org Artificial Intelligence, Apr-26-2024

Image restoration, or inverse problems in image processing, has long been an extensively studied topic. In recent years, supervised learning approaches have become a popular strategy for tackling this task. Unfortunately, most supervised learning-based methods are highly demanding in terms of computational resources and training data (sample complexity). In addition, trained models are sensitive to domain changes, such as varying acquisition systems, signal sampling rates, resolution and contrast. In this work, we try to answer a fundamental question: Can supervised learning models generalize well solely by learning from one image or even part of an image? If so, then what is the minimal number of patches required to achieve acceptable generalization? To this end, we focus on an efficient patch-based learning framework that requires a single image input-output pair for training. Experimental results demonstrate the applicability, robustness and computational efficiency of the proposed approach for supervised image deblurring and super-resolution. Our results showcase significant improvements in learning models' sample efficiency, generalization and time complexity, which can hopefully be leveraged for future real-time applications and applied to other signals and modalities.

artificial intelligence, inductive learning, machine learning, (17 more...)

2404.17426

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.54)

Industry: Energy > Oil & Gas > Upstream (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Mining patterns in syntax trees to automate code reviews of student solutions for programming exercises

Van Petegem, Charlotte, Demeyere, Kasper, Maertens, Rien, Strijbol, Niko, De Wever, Bram, Mesuere, Bart, Dawyndt, Peter

arXiv.org Artificial Intelligence, Apr-26-2024

In programming education, providing manual feedback is essential but labour-intensive, posing challenges in consistency and timeliness. We introduce ECHO, a machine learning method to automate the reuse of feedback in educational code reviews by analysing patterns in abstract syntax trees. This study investigates two primary questions: whether ECHO can predict feedback annotations to specific lines of student code based on previously added annotations by human reviewers (RQ1), and whether its training and prediction speeds are suitable for using ECHO for real-time feedback during live code reviews by human reviewers (RQ2). Our results, based on annotations from both automated linting tools and human reviewers, show that ECHO can accurately and quickly predict appropriate feedback annotations. Its efficiency in processing and its flexibility in adapting to feedback patterns can significantly reduce the time and effort required for manual feedback provisioning in educational settings.

annotation, submission, subtree, (15 more...)

2405.01579

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Belgium > Flanders (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry: Education > Educational Setting (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.69)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)

Rethinking The Uniformity Metric in Self-Supervised Learning

Fang, Xianghong, Li, Jian, Sun, Qiang, Wang, Benyou

arXiv.org Artificial Intelligence, Apr-26-2024

Uniformity plays an important role in evaluating learned representations, providing insights into self-supervised learning. In our quest for effective uniformity metrics, we pinpoint four principled properties that such metrics should possess. Namely, an effective uniformity metric should remain invariant to instance permutations and sample replications while accurately capturing feature redundancy and dimensional collapse. Surprisingly, we find that the uniformity metric proposed by Wang and Isola (2020) fails to satisfy the majority of these properties. Specifically, their metric is sensitive to sample replications and cannot account for feature redundancy and dimensional collapse correctly. To overcome these limitations, we introduce a new uniformity metric based on the Wasserstein distance, which satisfies all the aforementioned properties. Integrating this new metric in existing self-supervised learning methods effectively mitigates dimensional collapse and consistently improves their performance on downstream tasks involving CIFAR-10 and CIFAR-100 datasets. Code is available at https://github.com/statsle/WassersteinSSL.
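To make the replication-sensitivity claim concrete, here is a minimal sketch of the earlier uniformity metric that this abstract critiques, taken here to be the logarithm of the mean Gaussian potential over pairs of features. Averaging over distinct index pairs, the value t = 2, and the toy points are assumptions of this sketch; the Wasserstein-based metric proposed in the paper is not implemented here.

```python
import math
from itertools import combinations

def uniformity(points, t=2.0):
    """Log of the mean Gaussian potential exp(-t * ||x - y||^2) over distinct index pairs.

    Lower values indicate features spread more uniformly.
    """
    potentials = [
        math.exp(-t * sum((a - b) ** 2 for a, b in zip(x, y)))
        for x, y in combinations(points, 2)
    ]
    return math.log(sum(potentials) / len(potentials))

# Four unit vectors spread evenly on the circle (hypothetical toy features).
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]

base = uniformity(points)
permuted = uniformity(list(reversed(points)))  # same instances, different order
replicated = uniformity(points + points)       # every instance duplicated once
```

Reordering the instances leaves the value unchanged, but duplicating every instance adds zero-distance pairs and raises it, even though the underlying set of feature directions is the same.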

dimensional collapse, representation, uniformity metric, (14 more...)

2403.00642

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hong Kong (0.04)
Asia > India > West Bengal > Kolkata (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
