Inductive Learning
Semi-supervised Learning on Directed Graphs
Zhou, Dengyong, Hofmann, Thomas, Schölkopf, Bernhard
Given a directed graph in which some of the nodes are labeled, we investigate the question of how to exploit the link structure of the graph to infer the labels of the remaining unlabeled nodes. To that end, we propose a regularization framework for functions defined over nodes of a directed graph that forces the classification function to change slowly on densely linked subgraphs. A powerful yet computationally simple classification algorithm is derived within the proposed framework. The experimental evaluation on real-world Web classification problems demonstrates encouraging results that validate our approach.
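The abstract does not spell out the regularizer, so the following is only a minimal sketch of the general idea (labels spreading smoothly along links), assuming a smoother built from a teleporting random walk and its stationary distribution. The function name `propagate_labels`, the parameters `alpha`, `eta`, and `iters`, and the toy graph are all hypothetical and should not be read as the authors' algorithm.

```python
import math

def propagate_labels(edges, n, labels, alpha=0.9, eta=0.99, iters=200):
    """Spread known labels (+1/-1) over a directed graph; returns one score per node."""
    # Row-stochastic transition matrix of a random walk that follows out-links.
    out = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        out[u][v] = 1.0
    P = []
    for row in out:
        deg = sum(row)
        walk = [w / deg for w in row] if deg > 0 else [1.0 / n] * n
        # Assumption: teleport with probability 1 - eta so a stationary
        # distribution exists even if the graph is not strongly connected.
        P.append([eta * w + (1.0 - eta) / n for w in walk])

    # Stationary distribution pi via power iteration: pi <- pi P.
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

    # Symmetrized smoothing operator built from P and pi.
    s = [math.sqrt(p) for p in pi]
    theta = [[0.5 * (s[i] * P[i][j] / s[j] + s[j] * P[j][i] / s[i])
              for j in range(n)] for i in range(n)]

    # Fixed-point iteration f <- alpha * theta f + (1 - alpha) * y,
    # where y holds the known labels and 0 for unlabeled nodes.
    y = [float(labels.get(i, 0)) for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(theta[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

# Hypothetical toy graph: two densely linked groups joined by one reciprocal link.
cluster_a, cluster_b = [0, 1, 2], [3, 4, 5]
edges = [(u, v) for c in (cluster_a, cluster_b) for u in c for v in c if u != v]
edges += [(2, 3), (3, 2)]

# Only node 0 (+1) and node 5 (-1) are labeled; the rest are inferred.
scores = propagate_labels(edges, n=6, labels={0: +1, 5: -1})
predicted = [1 if score > 0 else -1 for score in scores]
```

In this toy setting the scores vary little inside each densely linked group, so the sign of each score assigns the unlabeled nodes to the label of their own group.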
A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning
Rosset, Saharon, Zhu, Ji, Zou, Hui, Hastie, Trevor J.
We consider the situation in semi-supervised learning, where the "label sampling" mechanism stochastically depends on the true response (as well as potentially on the features). We suggest a method of moments for estimating this stochastic dependence using the unlabeled data. This is potentially useful for two distinct purposes: a. As an input to a supervised learning procedure which can be used to "de-bias" its results using labeled data only and b.
Object Classification from a Single Example Utilizing Class Relevance Metrics
We describe a framework for learning an object classifier from a single example. This goal is achieved by emphasizing the relevant dimensions for classification using available examples of related classes. Learning to accurately classify objects from a single training example is often infeasible due to overfitting effects. However, if the instance representation ensures that the distance between any two instances of the same class is smaller than the distance between any two instances from different classes, then a nearest neighbor classifier could achieve perfect performance with a single training example. We therefore suggest a two-stage strategy.
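The nearest-neighbor argument above can be illustrated with a small sketch. It assumes a weighted Euclidean distance whose per-dimension weights are set by hand here; in the framework described, the emphasis on relevant dimensions would instead come from examples of related classes. The data, class names, and weights are hypothetical.

```python
import math

def weighted_distance(x, y, weights):
    """Euclidean distance with a non-negative weight per dimension."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

def nearest_neighbor_label(query, prototypes, weights):
    """Return the label of the single training example closest to `query`."""
    return min(prototypes,
               key=lambda label: weighted_distance(query, prototypes[label], weights))

def accuracy(test_set, prototypes, weights):
    hits = sum(nearest_neighbor_label(x, prototypes, weights) == y for x, y in test_set)
    return hits / len(test_set)

# One training example per class. Dimension 0 separates the classes;
# dimension 1 is large-amplitude noise unrelated to the class (assumption).
prototypes = {"a": (0.0, 5.0), "b": (1.0, -5.0)}
test_set = [((0.1, -4.0), "a"), ((0.9, 4.0), "b"),
            ((0.2, 6.0), "a"), ((0.8, -6.0), "b")]

plain = accuracy(test_set, prototypes, weights=(1.0, 1.0))     # all dimensions equal
relevant = accuracy(test_set, prototypes, weights=(1.0, 0.0))  # noise dimension ignored
```

With equal weights the noisy dimension dominates the distance and half of the toy test points are misclassified; once only the relevant dimension counts, same-class instances are always closer than different-class ones and the single example per class suffices.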
Learning Hyper-Features for Visual Identification
Ferencz, Andras D., Learned-Miller, Erik G., Malik, Jitendra
We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one "training" example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch-based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few well-chosen patches.
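A mutual information criterion for ranking patches can be sketched as follows. This is a simplified illustration, assuming each patch's similarity measurement has already been thresholded to a binary value per image pair; the abstract describes modeling the distributions of the similarity measurements, which this sketch does not attempt. The patch names and data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum over observed pairs of p(x, y) * log2(p(x, y) / (p(x) * p(y))).
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

# 1 = the image pair shows the same car, 0 = different cars.
is_match = [1, 1, 1, 1, 0, 0, 0, 0]

# Thresholded patch similarity per image pair (1 = patch looks similar).
patch_similar = {
    "wheel": [1, 1, 1, 1, 0, 0, 0, 0],  # agrees perfectly with the labels
    "door":  [1, 1, 1, 0, 0, 0, 0, 1],  # agrees most of the time
    "sky":   [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the labels
}

scores = {name: mutual_information(sim, is_match) for name, sim in patch_similar.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
selected = ranked[:2]  # keep only the most informative patches
```

Patches whose similarity carries no information about whether a pair matches score zero and fall to the bottom of the ranking, which is why matching only a few selected patches can be enough.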
Breaking SVM Complexity with Cross-Training
Bottou, Léon, Weston, Jason, Bakir, Gökhan H.
We propose to selectively remove examples from the training set using probabilistic estimates related to editing algorithms (Devijver and Kittler, 1982). This heuristic procedure aims at creating a separable distribution of training examples with minimal impact on the position of the decision boundary. It breaks the linear dependency between the number of SVs and the number of training examples, and sharply reduces the complexity of SVMs during both the training and prediction stages.