AITopics | Inductive Learning

Collaborating Authors

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

News Overviews Instructional Materials AI-Alerts Classics

On Pretraining Data Diversity for Self-Supervised Learning

Hammoud, Hasan Abed Al Kader, Das, Tuhin, Pizzati, Fabio, Torr, Philip, Bibi, Adel, Ghanem, Bernard

arXiv.org Artificial IntelligenceApr-5-2024

We explore the impact of training with more diverse datasets, characterized by the number of unique samples, on the performance of self-supervised learning (SSL) under a fixed computational budget. Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal. Notably, even with an exceptionally large pretraining data diversity achieved through methods like web crawling or diffusion-generated data, among other ways, the distribution shift remains a challenge. Our experiments are comprehensive with seven SSL methods using large-scale datasets such as ImageNet and YFCC100M amounting to over 200 GPU days.

dataset, distribution shift, diversity, (15 more...)

arXiv.org Artificial Intelligence

2403.13808

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Poland (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Add feedback

Exploring Probabilistic Models for Semi-supervised Learning

Wang, Jianfeng

arXiv.org Artificial IntelligenceApr-5-2024

This thesis studies advanced probabilistic models, including both their theoretical foundations and practical applications, for different semi-supervised learning (SSL) tasks. The proposed probabilistic methods are able to improve the safety of AI systems in real applications by providing reliable uncertainty estimates quickly, and at the same time, achieve competitive performance compared to their deterministic counterparts. The experimental results indicate that the methods proposed in the thesis have great value in safety-critical areas, such as the autonomous driving or medical imaging analysis domain, and pave the way for the future discovery of highly effective and efficient probabilistic approaches in the SSL sector.

international conference, mc dropout, segmentation, (13 more...)

arXiv.org Artificial Intelligence

2404.04199

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning

Golowich, Noah, Moitra, Ankur, Rohatgi, Dhruv

arXiv.org Artificial IntelligenceApr-4-2024

Supervised learning is often computationally easy in practice. But to what extent does this mean that other modes of learning, such as reinforcement learning (RL), ought to be computationally easy by extension? In this work we show the first cryptographic separation between RL and supervised learning, by exhibiting a class of block MDPs and associated decoding functions where reward-free exploration is provably computationally harder than the associated regression problem. We also show that there is no computationally efficient algorithm for reward-directed RL in block MDPs, even when given access to an oracle for this regression problem. It is known that being able to perform regression in block MDPs is necessary for finding a good policy; our results suggest that it is not sufficient. Our separation lower bound uses a new robustness property of the Learning Parities with Noise (LPN) hardness assumption, which is crucial in handling the dependent nature of RL data. We argue that separations and oracle lower bounds, such as ours, are a more meaningful way to prove hardness of learning because the constructions better reflect the practical reality that supervised learning by itself is often not the computational bottleneck.

algorithm, definition 4, regression, (15 more...)

arXiv.org Artificial Intelligence

2404.03774

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Workflow (0.93)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment (0.45)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Add feedback

A Systems Theoretic Approach to Online Machine Learning

Preez, Anli du, Beling, Peter A., Cody, Tyler

arXiv.org Artificial IntelligenceApr-4-2024

The machine learning formulation of online learning is incomplete from a systems theoretic perspective. Typically, machine learning research emphasizes domains and tasks, and a problem solving worldview. It focuses on algorithm parameters, features, and samples, and neglects the perspective offered by considering system structure and system behavior or dynamics. Online learning is an active field of research and has been widely explored in terms of statistical theory and computational algorithms, however, in general, the literature still lacks formal system theoretical frameworks for modeling online learning systems and resolving systems-related concept drift issues. Furthermore, while the machine learning formulation serves to classify methods and literature, the systems theoretic formulation presented herein serves to provide a framework for the top-down design of online learning systems, including a novel definition of online learning and the identification of key design parameters. The framework is formulated in terms of input-output systems and is further divided into system structure and system behavior. Concept drift is a critical challenge faced in online learning, and this work formally approaches it as part of the system behavior characteristics. Healthcare provider fraud detection using machine learning is used as a case study throughout the paper to ground the discussion in a real-world online learning challenge.

ol system, system behavior, system structure, (16 more...)

arXiv.org Artificial Intelligence

2404.03775

Country:

Asia > South Korea > Daegu > Daegu (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
(4 more...)

Genre:

Research Report (0.50)
Instructional Material > Online (0.41)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Health & Medicine (1.00)
Education (1.00)
Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

Learning Alternative Ways of Performing a Task

Nieves, David, Ramírez-Quintana, María José, Monserrat, Carlos, Ferri, César, Hernández-Orallo, José

arXiv.org Artificial IntelligenceApr-3-2024

A common way of learning to perform a task is to observe how it is carried out by experts. However, it is well known that for most tasks there is no unique way to perform them. This is especially noticeable the more complex the task is because factors such as the skill or the know-how of the expert may well affect the way she solves the task. In addition, learning from experts also suffers of having a small set of training examples generally coming from several experts (since experts are usually a limited and expensive resource), being all of them positive examples (i.e. examples that represent successful executions of the task). Traditional machine learning techniques are not useful in such scenarios, as they require extensive training data. Starting from very few executions of the task presented as activity sequences, we introduce a novel inductive approach for learning multiple models, with each one representing an alternative strategy of performing a task. By an iterative process based on generalisation and specialisation, we learn the underlying patterns that capture the different styles of performing a task exhibited by the examples. We illustrate our approach on two common activity recognition tasks: a surgical skills training task and a cooking domain. We evaluate the inferred models with respect to two metrics that measure how well the models represent the examples and capture the different forms of executing a task showed by the examples. We compare our results with the traditional process mining approach and show that a small set of meaningful examples is enough to obtain patterns that capture the different strategies that are followed to solve the tasks.

dependency graph, graph, sequence, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.eswa.2020.113263

2404.02579

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > British Columbia > East Kootenay Region > Fernie (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
(4 more...)

Genre:

Workflow (0.89)
Research Report > New Finding (0.34)

Industry:

Information Technology (1.00)
Health & Medicine > Surgery (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Add feedback

MCL-GAN: Generative Adversarial Networks with Multiple Specialized Discriminators

Choi, Jinyoung, Han, Bohyung

arXiv.org Artificial IntelligenceApr-3-2024

We propose a framework of generative adversarial networks with multiple discriminators, which collaborate to represent a real dataset more effectively. Our approach facilitates learning a generator consistent with the underlying data distribution based on real images and thus mitigates the chronic mode collapse problem. From the inspiration of multiple choice learning, we guide each discriminator to have expertise in a subset of the entire data and allow the generator to find reasonable correspondences between the latent and real data spaces automatically without extra supervision for training examples. Despite the use of multiple discriminators, the backbone networks are shared across the discriminators and the increase in training cost is marginal. We demonstrate the effectiveness of our algorithm using multiple evaluation metrics in the standard datasets for diverse tasks.

generative adversarial network, mcl-gan, multiple specialized discriminator, (9 more...)

arXiv.org Artificial Intelligence

2107.0726

Country: Europe > United Kingdom (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.53)

Add feedback

Weakly-supervised Audio Separation via Bi-modal Semantic Similarity

Mahmud, Tanvir, Amizadeh, Saeed, Koishida, Kazuhito, Marculescu, Diana

arXiv.org Artificial IntelligenceApr-2-2024

Conditional sound separation in multi-source audio mixtures without having access to single source sound data during training is a long standing challenge. Existing mix-and-separate based methods suffer from significant performance drop with multi-source training mixtures due to the lack of supervision signal for single source separation cases during training. However, in the case of language-conditional audio separation, we do have access to corresponding text descriptions for each audio mixture in our training data, which can be seen as (rough) representations of the audio samples in the language modality. To this end, in this paper, we propose a generic bi-modal separation framework which can enhance the existing unsupervised frameworks to separate single-source signals in a target modality (i.e., audio) using the easily separable corresponding signals in the conditioning modality (i.e., language), without having access to single-source samples in the target modality during training. We empirically show that this is well within reach if we have access to a pretrained joint embedding model between the two modalities (i.e., CLAP). Furthermore, we propose to incorporate our framework into two fundamental scenarios to enhance separation performance. First, we show that our proposed methodology significantly improves the performance of purely unsupervised baselines by reducing the distribution shift between training and test samples. In particular, we show that our framework can achieve 71% boost in terms of Signal-to-Distortion Ratio (SDR) over the baseline, reaching 97.5% of the supervised learning performance. Second, we show that we can further improve the performance of the supervised learning itself by 17% if we augment it by our proposed weakly-supervised framework, that enables a powerful semi-supervised framework for audio separation.

dataset, separation, supervised training, (15 more...)

arXiv.org Artificial Intelligence

2404.0174

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)

Add feedback

Using Interpretation Methods for Model Enhancement

Chen, Zhuo, Jiang, Chengyue, Tu, Kewei

arXiv.org Artificial IntelligenceApr-2-2024

In the age of neural natural language processing, there are plenty of works trying to derive interpretations of neural models. Intuitively, when gold rationales exist during training, one can additionally train the model to match its interpretation with the rationales. However, this intuitive idea has not been fully explored. In this paper, we propose a framework of utilizing interpretation methods and gold rationales to enhance models. Our framework is very general in the sense that it can incorporate various interpretation methods. Previously proposed gradient-based methods can be shown as an instance of our framework. We also propose two novel instances utilizing two other types of interpretation methods, erasure/replace-based and extractor-based methods, for model enhancement. We conduct comprehensive experiments on a variety of tasks. Experimental results show that our framework is effective especially in low-resource settings in enhancing models with various interpretation methods, and our two newly-proposed methods outperform gradient-based methods in most settings. Code is available at https://github.com/Chord-Chen-30/UIMER.

attribution score, interpretation method, rationale, (14 more...)

arXiv.org Artificial Intelligence

2404.02068

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Zheng, Hongwei, Zhou, Linyuan, Li, Han, Su, Jinming, Wei, Xiaoming, Xu, Xiaoming

arXiv.org Artificial IntelligenceApr-1-2024

Data mixing methods play a crucial role in semi-supervised learning (SSL), but their application is unexplored in long-tailed semi-supervised learning (LTSSL). The primary reason is that the in-batch mixing manner fails to address class imbalance. Furthermore, existing LTSSL methods mainly focus on re-balancing data quantity but ignore class-wise uncertainty, which is also vital for class balance. For instance, some classes with sufficient samples might still exhibit high uncertainty due to indistinguishable features. To this end, this paper introduces the Balanced and Entropy-based Mix (BEM), a pioneering mixing approach to re-balance the class distribution of both data quantity and uncertainty. Specifically, we first propose a class balanced mix bank to store data of each class for mixing. This bank samples data based on the estimated quantity distribution, thus re-balancing data quantity. Then, we present an entropy-based learning approach to re-balance class-wise uncertainty, including entropy-based sampling strategy, entropy-based selection module, and entropy-based class balanced loss. Our BEM first leverages data mixing for improving LTSSL, and it can also serve as a complement to the existing re-balancing methods. Experimental results show that BEM significantly enhances various LTSSL frameworks and achieves state-of-the-art performances across multiple benchmarks.

bem, class distribution, data quantity, (16 more...)

arXiv.org Artificial Intelligence

2404.01179

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Multi-Review Fusion-in-Context

Slobodkin, Aviv, Shapira, Ori, Levy, Ran, Dagan, Ido

arXiv.org Artificial IntelligenceMar-31-2024

Grounded text generation, encompassing tasks such as long-form question-answering and summarization, necessitates both content selection and content consolidation. Current end-to-end methods are difficult to control and interpret due to their opaqueness. Accordingly, recent works have proposed a modular approach, with separate components for each step. Specifically, we focus on the second subtask, of generating coherent text given pre-selected content in a multi-document setting. Concretely, we formalize Fusion-in-Context (FiC) as a standalone task, whose input consists of source texts with highlighted spans of targeted content. A model then needs to generate a coherent passage that includes all and only the target information. Our work includes the development of a curated dataset of 1000 instances in the reviews domain, alongside a novel evaluation framework for assessing the faithfulness and coverage of highlights, which strongly correlate to human judgment. Several baseline models exhibit promising outcomes and provide insightful analyses. This study lays the groundwork for further exploration of modular text generation in the multi-document setting, offering potential improvements in the quality and reliability of generated content. Our benchmark, FuseReviews, including the dataset, evaluation framework, and designated leaderboard, can be found at https://fusereviews.github.io/.

computational linguistic, dataset, summarization, (14 more...)

arXiv.org Artificial Intelligence

2403.15351

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry: Retail (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)

Add feedback