AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Meta-learning for mixed linear regression

Kong, Weihao, Somani, Raghav, Song, Zhao, Kakade, Sham, Oh, Sewoong

arXiv.org Machine Learning, Feb-20-2020

Recent advances in machine learning highlight successes on a small set of tasks where a large number of labeled examples have been collected and exploited. These include image classification with 1.2 million labeled examples Deng et al. (2009) and French-English machine translation with 40 million paired sentences Bojar et al. (2014). For common tasks, however, collecting clean labels is costly, as they require human expertise (as in medical imaging) or physical interactions (as in robotics), for example. Thus collected real-world datasets follow a long-tailed distribution, in which a dominant set of tasks only have a small number of training examples Wang et al. (2017). Inspired by human ingenuity in quickly solving novel problems by leveraging prior experience, meta-learning approaches aim to jointly learn from past experience to quickly adapt to new tasks with little available data Schmidhuber (1987); Thrun & Pratt (2012). This has had a significant impact in few-shot supervised learning, where each task is associated with only a few training examples. By leveraging structural similarities among those tasks, one can achieve accuracy far greater than what can be achieved for each task in isolation Finn et al. (2017); Ravi & Larochelle (2016); Koch et al. (2015); Oreshkin et al. (2018); Triantafillou et al. (2019); Rusu et al. (2018). The success of such approaches hinges on the following fundamental question: When can we jointly train small data tasks to achieve the accuracy of large data tasks? We investigate this tradeoff under a canonical scenario where the tasks are linear regressions in d-dimensions and the regression parameters are drawn i.i.d.

estimation, estimator, probability, (15 more...)

arXiv: 2002.08936

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

A Structured Prediction Approach for Conditional Meta-Learning

Wang, Ruohan, Demiris, Yiannis, Ciliberto, Carlo

arXiv.org Machine Learning, Feb-20-2020

Optimization-based meta-learning algorithms are a powerful class of methods for learning-to-learn applications such as few-shot learning. They tackle the limited availability of training data by leveraging the experience gained from previously observed tasks. However, when the complexity of the task distribution cannot be captured by a single set of shared meta-parameters, existing methods may fail to fully adapt to a target task. We address this issue with a novel perspective on conditional meta-learning based on structured prediction. We propose task-adaptive structured meta-learning (TASML), a principled estimator that weighs meta-training data conditioned on the target task to design tailored meta-learning objectives. In addition, we introduce algorithmic improvements to tackle key computational limitations of existing methods. Experimentally, we show that TASML outperforms state-of-the-art methods on benchmark datasets in terms of both accuracy and efficiency. An ablation study quantifies the individual contribution of model components and suggests useful practices for meta-learning.

algorithm, meta-learning, tasml, (12 more...)

arXiv: 2002.08799

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.64)

Estimating Training Data Influence by Tracking Gradient Descent

Pruthi, Garima, Liu, Frederick, Sundararajan, Mukund, Kale, Satyen

arXiv.org Machine Learning, Feb-19-2020

We introduce a method called TrackIn that computes the influence of a training example on a prediction made by the model, by tracking how the loss on the test point changes during the training process whenever the training example of interest was utilized. We provide a scalable implementation of TrackIn via a combination of a few key ideas: (a) a first-order approximation to the exact computation, (b) using random projections to speed up the computation of the first-order approximation for large models, (c) using saved checkpoints of standard training procedures, and (d) cherry-picking layers of a deep neural network. An experimental evaluation shows that TrackIn is more effective in identifying mislabelled training examples than other related methods such as influence functions and representer points. We also discuss insights from applying the method on vision, regression and natural language tasks.
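The first-order approximation in (a), evaluated at saved checkpoints as in (c), can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the formula, so the sketch assumes one common reading, in which the influence score sums, over checkpoints, the learning rate times the dot product of the loss gradients at the training example and at the test example. The linear model, the squared loss, and the names `grad_sq_loss` and `trackin_score` are hypothetical choices made for the example.

```python
import numpy as np

def grad_sq_loss(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w (assumed toy loss)."""
    return (np.dot(w, x) - y) * x

def trackin_score(checkpoints, lr, train_xy, test_xy):
    """First-order influence of one training example on one test example.

    Sums lr * <grad at train example, grad at test example> over the
    saved checkpoints. A constant learning rate is an assumption.
    """
    x_tr, y_tr = train_xy
    x_te, y_te = test_xy
    score = 0.0
    for w in checkpoints:
        g_tr = grad_sq_loss(w, x_tr, y_tr)
        g_te = grad_sq_loss(w, x_te, y_te)
        score += lr * float(np.dot(g_tr, g_te))
    return score

# Toy usage: two checkpoints of a 2-parameter linear model.
checkpoints = [np.array([0.0, 0.0]), np.array([0.5, 0.5])]
test_example = (np.array([1.0, 0.0]), 1.0)
same_direction = (np.array([1.0, 0.0]), 1.0)   # shares features with the test point
orthogonal = (np.array([0.0, 1.0]), 1.0)       # shares no features with the test point

score_same = trackin_score(checkpoints, 0.1, same_direction, test_example)
score_orth = trackin_score(checkpoints, 0.1, orthogonal, test_example)
```

In this toy setting a training example whose gradient is aligned with the test gradient receives a positive score, while one with an orthogonal gradient receives a score of zero.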

artificial intelligence, checkpoint, machine learning, (21 more...)

arXiv: 2002.08484

Country:

South America > Colombia > Bogotá D.C. > Bogotá (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry:

Government (0.68)
Transportation (0.46)
Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)

Progressive Identification of True Labels for Partial-Label Learning

Lv, Jiaqi, Xu, Miao, Feng, Lei, Niu, Gang, Geng, Xin, Sugiyama, Masashi

arXiv.org Machine Learning, Feb-19-2020

Partial-label learning is an important weakly supervised learning problem in which each training example is equipped with a set of candidate labels that contains the true label. Most existing methods rely on elaborately designed learning objectives, formulated as constrained optimizations that must be solved in specific ways, which makes their computational complexity a bottleneck for scaling up to big data. The goal of this paper is to propose a novel framework for partial-label learning without implicit assumptions on the model or optimization algorithm. More specifically, we propose a general estimator of the classification risk, theoretically analyze its classifier-consistency, and establish an estimation error bound. We then explore a progressive identification method for approximately minimizing the proposed risk estimator, where the update of the model and the identification of true labels are conducted in a seamless manner. The resulting algorithm is model-independent and loss-independent, and compatible with stochastic optimization. Thorough experiments demonstrate that it sets a new state of the art.
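The interplay between model updates and label identification can be sketched as follows. This is a hedged illustration of one plausible reading, not the authors' algorithm: it assumes that each example carries a weight vector over its candidate labels, initialized uniformly, and that after each model update the weights are reset to the model's predicted probabilities restricted to the candidate set and renormalized. The function names and the weighted cross-entropy loss are hypothetical.

```python
import numpy as np

def init_weights(candidate_mask):
    """Uniform weights over each example's candidate labels."""
    mask = candidate_mask.astype(float)
    return mask / mask.sum(axis=1, keepdims=True)

def update_weights(probs, candidate_mask):
    """Keep predicted probabilities on candidates only, then renormalize."""
    masked = probs * candidate_mask
    return masked / masked.sum(axis=1, keepdims=True)

def weighted_cross_entropy(probs, weights, eps=1e-12):
    """Mean cross-entropy where the weights act as soft targets."""
    return float(-(weights * np.log(probs + eps)).sum(axis=1).mean())

# Toy usage: 2 examples, 3 classes; 1 marks a candidate label.
candidate_mask = np.array([[1, 1, 0],
                           [0, 1, 1]])
weights = init_weights(candidate_mask)

# Stand-in for the model's softmax outputs after one training step.
probs = np.array([[0.6, 0.2, 0.2],
                  [0.1, 0.3, 0.6]])
loss = weighted_cross_entropy(probs, weights)
weights = update_weights(probs, candidate_mask)
```

In a training loop, the loss would be minimized with any stochastic optimizer and `update_weights` called after each step, so that weight gradually concentrates on the candidate the model finds most likely.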

accuracy, dataset, pll, (14 more...)

arXiv: 2002.08053

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(15 more...)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Handling Missing Annotations in Supervised Learning Data

Abdel-Hakim, Alaa E., Deabes, Wael

arXiv.org Artificial Intelligence, Feb-17-2020

Data annotation is an essential stage in supervised learning. However, the annotation process is exhausting and time-consuming, especially for large datasets. Activities of Daily Living (ADL) recognition is an example of a system that exploits very large volumes of raw sensor readings. In such systems, sensor readings are collected from activity-monitoring sensors around the clock. The generated dataset is so large that it is almost impossible for a human annotator to give a definite label to every single instance. This results in annotation gaps in the input data of the supervised learning system, and these gaps degrade the performance of the recognition system. In this work, we propose and investigate three different paradigms to handle these gaps. In the first paradigm, the gaps are removed by dropping all unlabeled readings. In the second paradigm, a single "Unknown" or "Do-Nothing" label is given to the unlabeled readings. In the third paradigm, each gap is given a unique label identifying the deterministic labels that encapsulate it. We also propose a semantic preprocessing method for annotation gaps that constructs a hybrid combination of some of these paradigms for further performance improvement. The performance of the three proposed paradigms and their hybrid combination is evaluated using an ADL benchmark dataset containing more than $2.5\times 10^6$ sensor readings collected over more than nine months. The evaluation results highlight the performance contrast between the paradigms and support a specific gap-handling approach for better performance.
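The three gap-handling paradigms can be sketched on a label sequence as follows. This is a minimal sketch under stated assumptions, not the authors' code: unlabeled readings are represented as `None`, the functions operate on labels only (in practice the matching sensor readings would be dropped or relabeled alongside them), and the `gap:before->after` label format and the `START`/`END` sentinels are hypothetical conventions chosen for the example.

```python
def drop_gaps(labels):
    """Paradigm 1: remove every unlabeled reading."""
    return [lab for lab in labels if lab is not None]

def fill_unknown(labels, unknown="Unknown"):
    """Paradigm 2: assign one shared label to all unlabeled readings."""
    return [unknown if lab is None else lab for lab in labels]

def fill_by_neighbors(labels):
    """Paradigm 3: label each gap by the known labels that enclose it."""
    out = list(labels)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] is None:
            # Find the end of this run of unlabeled readings.
            j = i
            while j < n and labels[j] is None:
                j += 1
            before = labels[i - 1] if i > 0 else "START"
            after = labels[j] if j < n else "END"
            for k in range(i, j):
                out[k] = f"gap:{before}->{after}"
            i = j
        else:
            i += 1
    return out

# Toy usage on a short activity sequence with two gaps.
labels = ["Sleep", None, None, "Cook", None]
dropped = drop_gaps(labels)
unknown = fill_unknown(labels)
enclosed = fill_by_neighbors(labels)
```

The third variant yields a distinct class for each pair of surrounding activities, which is what lets a classifier treat a gap between sleeping and cooking differently from a gap at the end of the recording.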

annotation gap, paradigm, unlabeled data, (13 more...)

doi: 10.3390/a12100217

arXiv: 2002.07113

Country:

Asia > Middle East > Saudi Arabia (0.04)
Africa > Middle East > Egypt (0.04)
South America > Brazil (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Class-Imbalanced Semi-Supervised Learning

Hyun, Minsung, Jeong, Jisoo, Kwak, Nojun

arXiv.org Machine Learning, Feb-17-2020

Semi-Supervised Learning (SSL) has achieved great success in overcoming the difficulties of labeling and making full use of unlabeled data. However, SSL rests on the limiting assumption that the numbers of samples in different classes are balanced, and many SSL algorithms perform worse on datasets with an imbalanced class distribution. In this paper, we introduce the task of class-imbalanced semi-supervised learning (CISSL), which refers to semi-supervised learning with class-imbalanced data. In doing so, we consider class imbalance in both labeled and unlabeled sets. First, we analyze existing SSL methods in imbalanced environments and examine how class imbalance affects them. Then we propose Suppressed Consistency Loss (SCL), a regularization method robust to class imbalance. Our method shows better performance than conventional methods in the CISSL environment. In particular, the more severe the class imbalance and the smaller the size of the labeled data, the better our method performs.

dataset, experiment, unlabeled data, (14 more...)

arXiv: 2002.06815

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Supervised Learning vs Unsupervised & Semi Supervised in One Picture

#artificialintelligence, Feb-16-2020, 21:09:17 GMT

Machine learning algorithms learn in three ways: unsupervised, supervised, and semi supervised. This picture illustrates the differences between the three types.

regression, supervised learning vs unsupervised, unsupervised & semi supervised

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Semi-Supervised Learning with Adversarially Missing Label Information

Syed, Umar, Taskar, Ben

Neural Information Processing Systems, Feb-15-2020, 19:57:10 GMT

We address the problem of semi-supervised learning in an adversarial setting. Instead of assuming that labels are missing at random, we analyze a less favorable scenario where the label information can be missing partially and arbitrarily, which is motivated by several practical examples. Motivated by the analysis, we formulate a convex optimization problem for parameter estimation, derive an efficient algorithm, and analyze its convergence. We provide experimental results on several standard data sets showing the robustness of our algorithm to the pattern of missing label information, outperforming several strong baselines. Papers published at the Neural Information Processing Systems Conference.

adversarially missing label information, algorithm, semi-supervised learning

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Good Semi-supervised Learning That Requires a Bad GAN

Dai, Zihang, Yang, Zhilin, Yang, Fan, Cohen, William W., Salakhutdinov, Russ R.

Neural Information Processing Systems, Feb-15-2020, 19:42:32 GMT

Semi-supervised learning methods based on generative adversarial networks (GANs) obtained strong empirical results, but it is not clear 1) how the discriminator benefits from joint training with a generator, and 2) why good semi-supervised classification performance and a good generator cannot be obtained at the same time. Theoretically we show that given the discriminator objective, good semi-supervised learning indeed requires a bad generator, and propose the definition of a preferred generator. Empirically, we derive a novel formulation based on our analysis that substantially improves over feature matching GANs, obtaining state-of-the-art results on multiple benchmark datasets. Papers published at the Neural Information Processing Systems Conference.

generator, good semi-supervised, semi-supervised, (1 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Structure Regularization for Structured Prediction

Sun, Xu

Neural Information Processing Systems, Feb-15-2020, 19:27:26 GMT

While there are many studies on weight regularization, studies on structure regularization are rare. Many existing systems for structured prediction focus on increasing the level of structural dependencies within the model. However, this trend may be misdirected, because our study suggests that complex structures are actually harmful to generalization ability in structured prediction. To control structure-based overfitting, we propose a structure regularization framework via structure decomposition, which decomposes training samples into mini-samples with simpler structures, deriving a model with better generalization power. We show both theoretically and empirically that structure regularization can effectively control overfitting risk and lead to better accuracy.
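The idea of decomposing a training sample into mini-samples can be sketched for a sequence-labeling example as follows. This is an illustrative sketch only: the abstract does not specify how samples are split, so contiguous chunks of near-equal length are an assumption, and the function name `decompose` and its `num_parts` parameter are hypothetical.

```python
def decompose(tokens, tags, num_parts):
    """Split one tagged sequence into contiguous mini-samples.

    Each mini-sample keeps its own tokens and tags, so dependencies
    that cross a chunk boundary are no longer modeled.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    n = len(tokens)
    # Never create more parts than there are tokens, and at least one part.
    num_parts = max(1, min(num_parts, n))
    base, extra = divmod(n, num_parts)
    parts, start = [], 0
    for p in range(num_parts):
        end = start + base + (1 if p < extra else 0)
        parts.append((tokens[start:end], tags[start:end]))
        start = end
    return parts

# Toy usage: one 5-token sample split into 2 mini-samples.
tokens = ["the", "cat", "sat", "down", "."]
tags = ["DET", "NOUN", "VERB", "ADV", "PUNCT"]
mini_samples = decompose(tokens, tags, 2)
```

Larger values of `num_parts` give shorter, structurally simpler mini-samples, which corresponds to stronger structure regularization in this reading.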

structure regularization, structured prediction, training speed

Genre: Research Report (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.91)
