AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Self Supervised Representation Learning in NLP

#artificialintelligence, Jun-23-2020, 06:36:23 GMT

While computer vision has made striking progress on self-supervised learning only in the last few years, self-supervised learning has been a first-class citizen in NLP research for quite a while. Language models have existed since the 1990s, even before the phrase "self-supervised learning" was coined. The Word2Vec paper from 2013 popularized this paradigm, and the field has progressed rapidly by applying these self-supervised methods across many problems. At the core of these methods lies a framing called a "pretext task", which lets us use the data itself to generate labels and then apply supervised methods to unsupervised problems. Pretext tasks are also referred to as "auxiliary tasks" or "pre-training tasks". The representations learned by performing such a task can be used as a starting point for downstream supervised tasks.
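The idea of using the data itself to generate labels can be sketched with a masked-word pretext task. This is only an illustration, not code from the article: the function name `make_masked_examples` and the `[MASK]` placeholder are assumptions chosen for the example.

```python
def make_masked_examples(tokens, mask_token="[MASK]"):
    """Turn one unlabeled sentence into (input, label) pairs.

    Each example hides exactly one token; the hidden token is the label.
    `mask_token` is a hypothetical placeholder string.
    """
    examples = []
    for i, target in enumerate(tokens):
        # Copy the sentence with position i replaced by the mask token.
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


sentence = "the cat sat on the mat".split()
pairs = make_masked_examples(sentence)
for masked, label in pairs[:2]:
    print(" ".join(masked), "->", label)
```

No human annotation is involved: every label comes from the sentence itself, which is what allows an ordinary supervised learner to be trained on unlabeled text.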

artificial intelligence, machine learning, natural language, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Siamese Meta-Learning and Algorithm Selection with 'Algorithm-Performance Personas' [Proposal]

Beel, Joeran, Tyrell, Bryan, Bergman, Edward, Collins, Andrew, Nagoor, Shahad

arXiv.org Artificial Intelligence, Jun-23-2020

Automated per-instance algorithm selection often outperforms single learners. Key to algorithm selection via meta-learning are often the (meta) features, which, however, sometimes do not provide enough information to train a meta-learner effectively. We propose a Siamese Neural Network architecture for automated algorithm selection that focuses more on 'alike performing' instances than on meta-features. Our work includes a novel performance metric and a method for selecting training samples. We further introduce the concept of 'Algorithm Performance Personas', which describe instances for which the single algorithms perform alike. The concept of 'alike performing algorithms' as ground truth for selecting training samples is novel and, we believe, has great potential. In this proposal, we outline our ideas in detail and provide the first evidence that our proposed metric is better suited to training sample selection than standard performance metrics such as absolute errors.

artificial intelligence, inductive learning, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2006.12328

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
Europe > Germany (0.04)
Asia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.95)
(2 more...)

A General Class of Transfer Learning Regression without Implementation Cost

Minami, Shunya, Liu, Song, Wu, Stephen, Fukumizu, Kenji, Yoshida, Ryo

arXiv.org Machine Learning, Jun-23-2020

We propose a novel framework that unifies and extends existing methods of transfer learning (TL) for regression. To bridge a pretrained source model to the model on a target task, we introduce a density-ratio reweighting function, which is estimated through the Bayesian framework with a specific prior distribution. By changing two intrinsic hyperparameters and the choice of the density-ratio model, the proposed method can integrate three popular methods of TL: TL based on cross-domain similarity regularization, a probabilistic TL using the density-ratio estimation, and fine-tuning of pretrained neural networks. Moreover, the proposed method can benefit from its simple implementation without any additional cost; the model can be fully trained using off-the-shelf libraries for supervised learning in which the original output variable is simply transformed to a new output. We demonstrate its simplicity, generality, and applicability using various real data applications.

artificial intelligence, inorganic, machine learning, (19 more...)

arXiv.org Machine Learning

2006.13228

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Connecticut > New Haven County > Wallingford (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.63)

Industry: Energy > Renewable (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

Relation Adversarial Network for Low Resource Knowledge Graph Completion

Zhang, Ningyu, Deng, Shumin, Sun, Zhanlin, Chen, Jiaoyan, Zhang, Wei, Chen, Huajun

arXiv.org Artificial Intelligence, Jun-22-2020

Knowledge Graph Completion (KGC) has been proposed to improve Knowledge Graphs by filling in missing connections via link prediction or relation extraction. One of the main difficulties for KGC is a low resource problem. Previous approaches assume sufficient training triples to learn versatile vectors for entities and relations, or a satisfactory number of labeled sentences to train a competent relation extraction model. However, low resource relations are very common in KGs, and those newly added relations often do not have many known samples for training. In this work, we aim at predicting new facts under a challenging setting where only limited training instances are available. We propose a general framework called Weighted Relation Adversarial Network, which utilizes an adversarial procedure to help adapt knowledge/features learned from high resource relations to different but related low resource relations. Specifically, the framework takes advantage of a relation discriminator to distinguish between samples from different relations, and help learn relation-invariant features more transferable from source relations to target relations. Experimental results show that the proposed approach outperforms previous methods regarding low resource settings for both link prediction and relation extraction.

machine learning, natural language, relation, (18 more...)

arXiv.org Artificial Intelligence

1911.03091

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Taiwan > Taiwan Province > Taipei (0.05)
Asia > China (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (0.67)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Don't Wait, Just Weight: Improving Unsupervised Representations by Learning Goal-Driven Instance Weights

Ericsson, Linus, Gouk, Henry, Hospedales, Timothy M.

arXiv.org Machine Learning, Jun-22-2020

In the absence of large labelled datasets, self-supervised learning techniques can boost performance by learning useful representations from unlabelled data, which is often more readily available. However, there is often a domain shift between the unlabelled collection and the downstream target problem data. We show that by learning Bayesian instance weights for the unlabelled data, we can improve the downstream classification accuracy by prioritising the most useful instances. Additionally, we show that the training time can be reduced by discarding unnecessary datapoints. Our method, BetaDataWeighter, is evaluated using the popular self-supervised rotation prediction task on STL-10 and Visual Decathlon. We compare it to related instance weighting schemes, both hand-designed heuristics and meta-learning, as well as to conventional self-supervised learning. BetaDataWeighter achieves both the highest average accuracy and the highest average rank across datasets, and on STL-10 it prunes up to 78% of unlabelled images without significant loss in accuracy, corresponding to an over 50% reduction in training time.
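The rotation prediction task mentioned in the abstract can be sketched as follows. This illustrates only the generic pretext task, not BetaDataWeighter or its instance weights; the helper names `rotate90` and `rotation_examples` and the list-of-lists image representation are assumptions made for the example.

```python
def rotate90(image):
    """Rotate a 2D list-of-lists image 90 degrees clockwise."""
    # Reverse the row order, then transpose.
    return [list(row) for row in zip(*image[::-1])]


def rotation_examples(image):
    """Return (rotated_image, label) pairs for labels 0..3.

    Label k means the image was rotated clockwise by k * 90 degrees.
    """
    examples = []
    current = [list(row) for row in image]  # copy so the input is untouched
    for label in range(4):
        examples.append((current, label))
        current = rotate90(current)
    return examples


image = [[1, 2],
         [3, 4]]
for rotated, label in rotation_examples(image):
    print(label, rotated)
```

A classifier trained to predict the label from the rotated image needs no manual annotation, since the rotation applied is known by construction.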

artificial intelligence, betadataweighter, machine learning, (19 more...)

arXiv.org Machine Learning

2006.1236

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training

Chen, Xuxi, Chen, Wuyang, Chen, Tianlong, Yuan, Ye, Gong, Chen, Chen, Kewei, Wang, Zhangyang

arXiv.org Machine Learning, Jun-22-2020

Many real-world applications have to tackle the Positive-Unlabeled (PU) learning problem, i.e., learning binary classifiers from a large amount of unlabeled data and a few labeled positive examples. While current state-of-the-art methods employ importance reweighting to design various risk estimators, they ignore the learning capability of the model itself, which could provide reliable supervision. This motivates us to propose a novel Self-PU learning framework, which seamlessly integrates PU learning and self-training. Self-PU highlights three "self"-oriented building blocks: a self-paced training algorithm that adaptively discovers and augments confident positive/negative examples as the training proceeds; a self-calibrated instance-aware loss; and a self-distillation scheme that introduces teacher-students learning as an effective regularization for PU learning. We demonstrate the state-of-the-art performance of Self-PU on common PU learning benchmarks (MNIST and CIFAR-10), where it compares favorably against the latest competitors. Moreover, we study a real-world application of PU learning, i.e., classifying brain images of Alzheimer's Disease. Self-PU obtains significantly improved results on the renowned Alzheimer's Disease Neuroimaging Initiative (ADNI) database over existing methods. The code is publicly available at: https://github.com/TAMU-VITA/Self-PU.

artificial intelligence, machine learning, self-pu, (17 more...)

arXiv.org Machine Learning

2006.1128

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Texas > Brazos County > College Station (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)

Mixup Training as the Complexity Reduction

Kimura, Masanari

arXiv.org Machine Learning, Jun-21-2020

Machine learning has achieved remarkable results in recent years due to the increase in the amount of data and the development of computational resources. However, despite such excellent performance, machine learning models often suffer from the problem of over-fitting. Many data augmentation methods have been proposed to tackle this problem, and one of them is called Mixup. Mixup is a recently proposed regularization procedure that linearly interpolates a random pair of training examples. This regularization method works very well experimentally, but its theoretical guarantees have not been fully discussed. In this study, we aim to find out why Mixup works well from the perspective of computational learning theory. In addition, we reveal how the effect of Mixup changes in each situation. Furthermore, we investigate the effects of changes in Mixup's parameter. This contributes to the search for optimal parameters and to estimating the effects of the parameters currently used. The results of this study provide a theoretical clarification of when and how regularization by Mixup is effective.
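The linear interpolation of a random pair of training examples can be sketched as below. The abstract does not spell out the sampling scheme, so this follows the commonly used Mixup formulation in which the mixing weight is drawn from Beta(alpha, alpha); the function name `mixup_pair`, the default `alpha=1.0`, and the plain-list representation of inputs and one-hot labels are assumptions for illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Blend two (input, one-hot label) examples into one.

    The mixing weight lam is drawn from Beta(alpha, alpha); `alpha` is
    the Mixup parameter (assumed here, not specified in the abstract).
    """
    lam = rng.betavariate(alpha, alpha)
    # Interpolate inputs and labels with the same weight.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x, y, lam = mixup_pair([0.0, 0.0], [1.0, 0.0],
                       [1.0, 1.0], [0.0, 1.0], rng=rng)
print(lam, x, y)
```

With one-hot labels the mixed label still sums to one, so it can be used directly as a soft target in a cross-entropy loss.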

artificial intelligence, machine learning, rademacher complexity, (17 more...)

arXiv.org Machine Learning

2006.06231

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Embodied Self-supervised Learning by Coordinated Sampling and Training

Sun, Yifan, Wu, Xihong

arXiv.org Machine Learning, Jun-20-2020

Self-supervised learning can significantly improve the performance of downstream tasks; however, the dimensions of learned representations normally lack explicit physical meanings. In this work, we propose a novel self-supervised approach to solve inverse problems by employing the corresponding physical forward process so that the learned representations can have explicit physical meanings. The proposed approach works in an analysis-by-synthesis manner to learn an inference network by iteratively sampling and training. At the sampling step, given observed data, the inference network is used to approximate the intractable posterior, from which we sample input parameters and feed them to a physical process to generate data in the observational space; at the training step, the same network is optimized with the sampled paired data. We prove the feasibility of the proposed method by tackling the acoustic-to-articulatory inversion problem to infer articulatory information from speech. Given an articulatory synthesizer, an inference model can be trained completely from scratch with random initialization. Our experiments demonstrate that the proposed method can converge steadily and the network learns to control the articulatory synthesizer to speak like a human. We also demonstrate that trained models can generalize well to unseen speakers or even new languages, and performance can be further improved through self-adaptation.

artificial intelligence, inverse problem, machine learning, (18 more...)

arXiv.org Machine Learning

2006.1335

Country: Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Get More Out of Your Annotated Medical Images with Self-Supervised Learning

#artificialintelligence, Jun-19-2020, 03:02:13 GMT

Data scarcity is a perennial problem when applying deep learning (DL) to medical imaging. In vision tasks related to natural images, DL practitioners often have access to astoundingly large annotated data sets on which they can train. However, due to privacy concerns and the expense of creating them, access to large annotated data sets is rare in medical imaging. The natural follow-up question is: How can practitioners in the field of medical imaging best use DL given limited data? In this article, I'll discuss one approach to stretch the use of available data, called self-supervised learning.

artificial intelligence, machine learning, representation, (18 more...)

#artificialintelligence

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Transfer Learning or Self-supervised Learning? A Tale of Two Pretraining Paradigms

Yang, Xingyi, He, Xuehai, Liang, Yuxiao, Yang, Yue, Zhang, Shanghang, Xie, Pengtao

arXiv.org Machine Learning, Jun-19-2020

Pretraining has become a standard technique in computer vision and natural language processing, and it usually helps to improve performance substantially. Previously, the dominant pretraining method was transfer learning (TL), which uses labeled data to learn a good representation network. Recently, a new pretraining approach -- self-supervised learning (SSL) -- has demonstrated promising results on a wide range of applications. SSL does not require annotated labels. It is conducted purely on input data by solving auxiliary tasks defined on the input data examples. The currently reported results show that SSL outperforms TL in certain applications and that TL outperforms SSL in others. There has not been a clear understanding of what properties of data and tasks make one approach outperform the other. Without an informed guideline, ML researchers have to try both methods to find out empirically which one is better, which is usually time-consuming. In this work, we aim to address this problem. We perform a comprehensive comparative study of SSL and TL regarding which one works better under different properties of data and tasks, including the domain difference between source and target tasks, the amount of pretraining data, class imbalance in source data, and the usage of target data for additional pretraining. The insights distilled from our comparative studies can help ML researchers decide which method to use based on the properties of their applications.

artificial intelligence, machine learning, target task, (16 more...)

arXiv.org Machine Learning

2007.04234

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
