AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness

Ng, Nathan, Cho, Kyunghyun, Ghassemi, Marzyeh

arXiv.org Machine Learning, Oct-4-2020

Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is difficult to generate new examples that stay on the underlying data manifold. We introduce SSMBA, a data augmentation method for generating synthetic training examples by using a pair of corruption and reconstruction functions to move randomly on a data manifold. We investigate the use of SSMBA in the natural language domain, leveraging the manifold assumption to reconstruct corrupted text with masked language models. In experiments on robustness benchmarks across 3 tasks and 9 datasets, SSMBA consistently outperforms existing data augmentation methods and baseline models on both in-domain and OOD data, achieving gains of 0.8% accuracy on OOD Amazon reviews, 1.8% accuracy on OOD MNLI, and 1.4 BLEU on in-domain IWSLT14 German-English.

artificial intelligence, machine learning, natural language, (19 more...)

2009.10195

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)

Self-supervised Learning from a Multi-view Perspective

Tsai, Yao-Hung Hubert, Wu, Yue, Salakhutdinov, Ruslan, Morency, Louis-Philippe

arXiv.org Machine Learning, Oct-4-2020

As a subset of unsupervised representation learning, self-supervised representation learning adopts self-defined signals as supervision and uses the learned representation for downstream tasks, such as object detection and image captioning. Many proposed approaches for self-supervised learning naturally follow a multi-view perspective, where the input (e.g., original images) and the self-supervised signals (e.g., augmented images) can be seen as two redundant views of the data. Building on this multi-view perspective, this paper provides an information-theoretical framework to better understand the properties that encourage successful self-supervised learning. Specifically, we demonstrate that self-supervised learned representations can extract task-relevant information and discard task-irrelevant information. Our theoretical framework paves the way to a larger space of self-supervised learning objective design. In particular, we propose a composite objective that bridges the gap between prior contrastive and predictive learning objectives, and introduce an additional objective term to discard task-irrelevant information. To verify our analysis, we conduct controlled experiments to evaluate the impact of the composite objectives. We also explore our framework's empirical generalization beyond the multi-view perspective, where the cross-view redundancy may not be clearly observed.

artificial intelligence, information, machine learning, (15 more...)

2006.05576

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

AmbigQA: Answering Ambiguous Open-domain Questions

Min, Sewon, Michael, Julian, Hajishirzi, Hannaneh, Zettlemoyer, Luke

arXiv.org Artificial Intelligence, Oct-4-2020

Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer. In this paper, we introduce AmbigQA, a new open-domain question answering task which involves finding every plausible answer, and then rewriting the question for each one to resolve the ambiguity. To study this task, we construct AmbigNQ, a dataset covering 14,042 questions from NQ-open, an existing open-domain QA benchmark. We find that over half of the questions in NQ-open are ambiguous, with diverse sources of ambiguity such as event and entity references. We also present strong baseline models for AmbigQA which we show benefit from weakly supervised learning that incorporates NQ-open, strongly suggesting our new task and data will support significant future research effort. Our data and baselines are available at https://nlp.cs.washington.edu/ambigqa.

machine learning, natural language, question answering, (20 more...)

2004.10645

Country:

Europe > United Kingdom (0.14)
North America > United States > New York (0.05)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media > Television (1.00)
Government (0.93)
Media > Film (0.68)
Leisure & Entertainment > Sports > Soccer (0.46)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)

A generalized linear joint trained framework for semi-supervised learning of sparse features

Laria, Juan C., Clemmensen, Line H., Ersbøll, Bjarne K.

arXiv.org Machine Learning, Oct-2-2020

The elastic-net is among the most widely used types of regularization algorithms, commonly associated with the problem of supervised generalized linear model estimation via penalized maximum likelihood. Its nice properties originate from a combination of $\ell_1$ and $\ell_2$ norms, which endow this method with the ability to select variables taking into account the correlations between them. In the last few years, semi-supervised approaches, which use both labeled and unlabeled data, have become an important component of statistical research. Despite this interest, however, few studies have investigated semi-supervised elastic-net extensions. This paper introduces a novel solution for semi-supervised learning of sparse features in the context of generalized linear model estimation: the generalized semi-supervised elastic-net (s2net), which extends the supervised elastic-net method with a general mathematical formulation that covers, but is not limited to, both regression and classification problems. We develop a flexible and fast implementation for s2net in R, and its advantages are illustrated using both real and synthetic data sets.
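
To make the combination of $\ell_1$ and $\ell_2$ norms concrete, here is a minimal sketch of the standard supervised elastic-net penalty. It assumes the common parameterization with an overall strength `lam` and a mixing weight `alpha`; the function name and both parameter names are illustrative, and the semi-supervised s2net formulation is not reproduced here because the abstract does not state it.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Standard elastic-net penalty (assumed parameterization, not s2net's):

        lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)

    alpha = 1 gives the lasso penalty, alpha = 0 gives the ridge penalty.
    """
    l1 = sum(abs(b) for b in beta)        # l1 norm: encourages sparsity
    l2_squared = sum(b * b for b in beta) # squared l2 norm: shrinks correlated coefficients together
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


# Example: the same coefficient vector under lasso, ridge, and a 50/50 mix.
coefficients = [3.0, -4.0]
lasso_only = elastic_net_penalty(coefficients, lam=1.0, alpha=1.0)
ridge_only = elastic_net_penalty(coefficients, lam=1.0, alpha=0.0)
mixed = elastic_net_penalty(coefficients, lam=1.0, alpha=0.5)
```

The penalty is added to the negative log-likelihood of the generalized linear model; intermediate values of `alpha` trade off variable selection against grouped shrinkage.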

artificial intelligence, machine learning, semi-supervised learning, (18 more...)

2006.01671

Country:

Europe > Austria > Vienna (0.14)
Europe > Denmark (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)

Covariate Shift Adaptation in High-Dimensional and Divergent Distributions

Polo, Felipe Maia, Vicente, Renato

arXiv.org Artificial Intelligence, Oct-2-2020

In real-world applications of supervised learning methods, training and test sets are often sampled from distinct distributions and we must resort to domain adaptation techniques. One special class of techniques is Covariate Shift Adaptation, which allows practitioners to obtain good generalization performance in the distribution of interest when domains differ only by the marginal distribution of features. Traditionally, Covariate Shift Adaptation is implemented using Importance Weighting, which may fail in high-dimensional settings due to small Effective Sample Sizes (ESS). In this paper, we propose (i) a connection between ESS, high-dimensional settings and generalization bounds and (ii) a simple, general and theoretically sound approach to combine feature selection and Covariate Shift Adaptation. The new approach yields good performance with improved ESS.
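
As a rough illustration of why importance weighting can leave a small effective sample size, the sketch below computes Kish's ESS estimate, (sum of weights)^2 / (sum of squared weights). This is a widely used definition and is assumed here; the abstract does not say which ESS formula the paper adopts, and the function name is illustrative.

```python
def effective_sample_size(weights):
    """Kish's effective sample size for non-negative importance weights.

    Equals len(weights) when all weights are equal and approaches 1 when
    a single weight dominates. Assumed definition, not taken from the paper.
    """
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    if sum_of_squares == 0:
        raise ValueError("at least one weight must be non-zero")
    return (total * total) / sum_of_squares


# Equal weights: every training example counts fully.
uniform = effective_sample_size([1.0, 1.0, 1.0, 1.0])

# One dominant weight: the weighted sample behaves like a single example.
degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

When the training and target feature distributions diverge, density-ratio weights tend to concentrate on a few examples, which drives this quantity down.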

artificial intelligence, inductive learning, machine learning, (14 more...)

2010.01184

Country:

Asia > Middle East > Jordan (0.04)
South America > Brazil > São Paulo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)

First-order Optimization for Superquantile-based Supervised Learning

Laguel, Yassine, Malick, Jérôme, Harchaoui, Zaid

arXiv.org Machine Learning, Oct-1-2020

Classical supervised learning via empirical risk (or negative log-likelihood) minimization hinges upon the assumption that the testing distribution coincides with the training distribution. This assumption can be challenged in modern applications of machine learning in which learning machines may operate at prediction time with testing data whose distribution departs from that of the training data. We revisit the superquantile regression method by proposing a first-order optimization algorithm to minimize a superquantile-based learning objective. The proposed algorithm is based on smoothing the superquantile function by infimal convolution. Promising numerical results illustrate the interest of the approach towards safer supervised learning.
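
For readers unfamiliar with the superquantile (also known as conditional value-at-risk), the following sketch evaluates the empirical, unsmoothed superquantile of a list of per-example losses using the Rockafellar-Uryasev variational form. It is only an illustration under that standard definition; it does not implement the smoothing by infimal convolution or the first-order algorithm described in the abstract, and the function name is illustrative.

```python
import math


def empirical_superquantile(losses, p):
    """Empirical superquantile (CVaR) of `losses` at level p, with 0 <= p < 1.

    Uses eta + mean(max(loss - eta, 0)) / (1 - p), evaluated at a minimizing
    eta, which is an empirical p-quantile of the losses.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    ordered = sorted(losses)
    n = len(ordered)
    k = max(math.ceil(p * n), 1)   # 1-indexed order statistic that minimizes the objective
    eta = ordered[k - 1]
    excess = sum(max(loss - eta, 0.0) for loss in ordered)
    return eta + excess / ((1.0 - p) * n)


losses = [1.0, 2.0, 3.0, 4.0]
average_loss = empirical_superquantile(losses, 0.0)   # p = 0 recovers the ordinary mean
worst_half = empirical_superquantile(losses, 0.5)     # average of the two largest losses
```

Minimizing this quantity instead of the mean loss focuses training on the worst-performing fraction of examples, which is the motivation the abstract gives for robustness to distribution shift.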

artificial intelligence, inductive learning, machine learning, (15 more...)

2009.14575

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Arizona (0.04)
Europe > Finland (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Self-Guided Multiple Instance Learning for Weakly Supervised Disease Classification and Localization in Chest Radiographs

Seibold, Constantin, Kleesiek, Jens, Schlemmer, Heinz-Peter, Stiefelhagen, Rainer

arXiv.org Artificial Intelligence, Sep-30-2020

The lack of fine-grained annotations hinders the deployment of automated diagnosis systems, which require human-interpretable justification for their decision process. In this paper, we address the problem of weakly supervised identification and localization of abnormalities in chest radiographs. To that end, we introduce a novel loss function for training convolutional neural networks that increases the localization confidence and assists the overall disease identification. The loss leverages both image- and patch-level predictions to generate auxiliary supervision. Rather than forming strictly binary targets from the predictions, as done in previous loss formulations, we create targets in a more customized manner, which allows the loss to account for possible misclassification. We show that the supervision provided within the proposed learning scheme leads to better performance and more precise predictions on prevalent datasets for multiple-instance learning, as well as on the NIH ChestX-Ray14 benchmark for disease recognition, than previously used losses.

localization, prediction, supervision, (10 more...)

doi: 10.1007/978-3-030-69541-5_37

2010.00127

Country:

Europe > United Kingdom > England (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
North America > United States > Indiana (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)

Domain Generalization via Semi-supervised Meta Learning

Sharifi-Noghabi, Hossein, Asghari, Hossein, Mehrasa, Nazanin, Ester, Martin

arXiv.org Machine Learning, Sep-30-2020

The goal of domain generalization is to learn from multiple source domains to generalize to unseen target domains under distribution discrepancy. Current state-of-the-art methods in this area are fully supervised, but for many real-world problems it is hardly possible to obtain enough labeled samples. In this paper, we propose the first method of domain generalization to leverage unlabeled samples, called DGSML, which combines meta learning's episodic training with semi-supervised learning. DGSML employs an entropy-based pseudo-labeling approach to assign labels to unlabeled samples and then utilizes a novel discrepancy loss to ensure that class centroids before and after labeling unlabeled samples are close to each other. To learn a domain-invariant representation, it also utilizes a novel alignment loss to ensure that the distance between pairs of class centroids, computed after adding the unlabeled samples, is preserved across different domains. DGSML is trained by a meta learning approach to mimic the distribution shift between the input source domains and unseen target domains. Experimental results on benchmark datasets indicate that DGSML outperforms state-of-the-art domain generalization and semi-supervised learning methods.
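
The entropy-based pseudo-labeling step can be pictured with the generic sketch below: an unlabeled sample receives the most probable class only when the entropy of its predicted class distribution is low. The threshold rule, the `max_entropy` parameter, and the function names are assumptions made for illustration; the abstract does not specify DGSML's exact criterion, and the discrepancy and alignment losses are not shown.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def pseudo_label(probs, max_entropy):
    """Return the index of the most probable class if the prediction is
    confident enough (entropy <= max_entropy); otherwise return None.

    Hypothetical selection rule, not DGSML's published criterion.
    """
    if entropy(probs) <= max_entropy:
        return max(range(len(probs)), key=probs.__getitem__)
    return None


# A peaked prediction is accepted; a nearly flat one is left unlabeled.
confident = pseudo_label([0.9, 0.05, 0.05], max_entropy=0.5)
uncertain = pseudo_label([0.4, 0.3, 0.3], max_entropy=0.5)
```

Filtering on entropy keeps low-confidence predictions from contaminating the class centroids that the later losses compare.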

artificial intelligence, machine learning, unlabeled sample, (17 more...)

2009.12658

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Beyond Perturbations: Learning Guarantees with Arbitrary Adversarial Test Examples

Goldwasser, Shafi, Kalai, Adam Tauman, Kalai, Yael Tauman, Montasser, Omar

arXiv.org Machine Learning, Sep-30-2020

We present a transductive learning algorithm that takes as input training examples from a distribution $P$ and arbitrary (unlabeled) test examples, possibly chosen by an adversary. This is unlike prior work that assumes that test examples are small perturbations of $P$. Our algorithm outputs a selective classifier, which abstains from predicting on some examples. By considering selective transductive learning, we give the first nontrivial guarantees for learning classes of bounded VC dimension with arbitrary train and test distributions; no prior guarantees were known even for simple classes of functions such as intervals on the line. In particular, for any function in a class $C$ of bounded VC dimension, we guarantee a low test error rate and a low rejection rate with respect to $P$. Our algorithm is efficient given an Empirical Risk Minimizer (ERM) for $C$. Our guarantees hold even for test examples chosen by an unbounded white-box adversary. We also give guarantees for generalization, agnostic, and unsupervised settings.

artificial intelligence, machine learning, test example, (19 more...)

2007.05145

Country: