Supervised Learning
HC-Search: Learning Heuristics and Cost Functions for Structured Prediction
Doppa, Janardhan Rao (Oregon State University) | Fern, Alan (Oregon State University) | Tadepalli, Prasad (Oregon State University)
Structured prediction is the problem of learning a function from structured inputs to structured outputs, with prototypical examples being part-of-speech tagging and image labeling. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, the framework uses a search procedure guided by a learned heuristic H to uncover high quality candidate outputs and then uses a separate learned cost function C to select a final prediction among those outputs. We can decompose the regret of the overall approach into the loss due to H not leading to high quality outputs, and the loss due to C not selecting the best among the generated outputs. Guided by this decomposition, we minimize the overall regret in a greedy stage-wise manner by first training H to quickly uncover high quality outputs via imitation learning, and then training C to correctly rank the outputs generated via H according to their true losses. Experiments on several benchmark domains show that our approach significantly outperforms the state-of-the-art methods.
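As an illustration of the two-stage prediction described above, the following is a minimal Python sketch of H-guided search followed by C-based selection. The successors, heuristic_H, cost_C, and initial_output names are hypothetical stand-ins; the actual search space, learned heuristic, and cost function are specified in the paper.

import heapq

def hc_search_predict(x, successors, heuristic_H, cost_C, initial_output, budget=100):
    # Stage 1: best-first search guided by the learned heuristic H (lower = better).
    frontier = [(heuristic_H(x, initial_output), 0, initial_output)]
    candidates = [initial_output]
    tie, expanded = 1, 0
    while frontier and expanded < budget:
        _, _, y = heapq.heappop(frontier)
        expanded += 1
        for y_next in successors(y):
            candidates.append(y_next)
            heapq.heappush(frontier, (heuristic_H(x, y_next), tie, y_next))
            tie += 1
    # Stage 2: the learned cost function C selects the final prediction.
    return min(candidates, key=lambda y: cost_C(x, y))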
Teaching Classification Boundaries to Humans
Basu, Sumit (Microsoft Research) | Christensen, Janara (University of Washington)
Given a classification task, what is the best way to teach the resulting boundary to a human? While machine learning techniques can provide excellent methods for finding the boundary, including the selection of examples in an online setting, they tell us little about how we would teach a human the same task. We propose to investigate the problem of example selection and presentation in the context of teaching humans, and explore a variety of mechanisms in the interests of finding what may work best. In particular, we begin with the baseline of random presentation and then examine combinations of several mechanisms: the indication of an example's relative difficulty, the use of the shaping heuristic from the cognitive science literature (moving from easier examples to harder ones), and a novel kernel-based "coverage model" of the subject's mastery of the task. From our experiments on 54 human subjects learning and performing a pair of synthetic classification tasks via our teaching system, we found that we can achieve the greatest gains with a combination of shaping and the coverage model.
Perceptron Learning of SAT
Flint, Alex | Blaschko, Matthew
Boolean satisfiability (SAT) as a canonical NP-complete decision problem is one of the most important problems in computer science. In practice, real-world SAT sentences are drawn from a distribution that may result in efficient algorithms for their solution. Such SAT instances are likely to have shared characteristics and substructures. This work approaches the exploration of a family of SAT solvers as a learning problem. In particular, we relate polynomial time solvability of a SAT subset to a notion of margin between sentences mapped by a feature function into a Hilbert space. Provided this mapping is based on polynomial time computable statistics of a sentence, we show that the existence of a margin between these data points implies the existence of a polynomial time solver for that SAT subset based on the Davis-Putnam-Logemann-Loveland algorithm. Furthermore, we show that a simple perceptron-style learning rule will find an optimal SAT solver with a bounded number of training updates. We derive a linear time computable set of features and show analytically that margins exist for important polynomial special cases of SAT. Empirical results show an order of magnitude improvement over a state-of-the-art SAT solver on a hardware verification task.
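For concreteness, the following is a hedged Python sketch of a standard mistake-driven perceptron update over polynomial-time computable sentence features; the features map and the +1/-1 labels are hypothetical placeholders, and the paper's coupling to the Davis-Putnam-Logemann-Loveland solver is not shown.

import numpy as np

def perceptron_train(sentences, labels, features, epochs=10):
    # features(s) returns a real-valued vector of polynomial-time statistics of sentence s.
    w = np.zeros_like(features(sentences[0]), dtype=float)
    for _ in range(epochs):
        mistakes = 0
        for s, y in zip(sentences, labels):     # y in {+1, -1}
            phi = features(s)
            if y * np.dot(w, phi) <= 0:         # mistake-driven update
                w = w + y * phi
                mistakes += 1
        if mistakes == 0:                       # with a margin, the number of updates is bounded
            break
    return w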
Kernel Latent SVM for Visual Recognition
Yang, Weilong | Wang, Yang | Vahdat, Arash | Mori, Greg
Latent SVMs (LSVMs) are a class of powerful tools that have been successfully applied to many applications in computer vision. However, a limitation of LSVMs is that they rely on linear models. For many computer vision tasks, linear models are suboptimal and nonlinear models learned with kernels typically perform much better. Therefore it is desirable to develop the kernel version of LSVM. In this paper, we propose kernel latent SVM (KLSVM) -- a new learning framework that combines latent SVMs and kernel methods. We develop an iterative training algorithm to learn the model parameters. We demonstrate the effectiveness of KLSVM using three different applications in visual recognition. Our KLSVM formulation is very general and can be applied to solve a wide range of applications in computer vision and machine learning.
3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model
Fidler, Sanja | Dickinson, Sven | Urtasun, Raquel
This paper addresses the problem of category-level 3D object detection. Given a monocular image, our aim is to localize the objects in 3D by enclosing them with tight oriented 3D bounding boxes. We propose a novel approach that extends the well-acclaimed deformable part-based model [Felz.] to reason in 3D. Our model represents an object class as a deformable 3D cuboid composed of faces and parts, which are both allowed to deform with respect to their anchors on the 3D box. We model the appearance of each face in fronto-parallel coordinates, thus effectively factoring out the appearance variation induced by viewpoint. Our model reasons about face visibility patterns called aspects. We train the cuboid model jointly and discriminatively and share weights across all aspects to attain efficiency. Inference then entails sliding and rotating the box in 3D and scoring object hypotheses. While for inference we discretize the search space, the variables are continuous in our model. We demonstrate the effectiveness of our approach in indoor and outdoor scenarios, and show that our approach outperforms the state-of-the-art in both 2D [Felz09] and 3D object detection [Hedau12].
Graphical Gaussian Vector for Image Categorization
Harada, Tatsuya | Kuniyoshi, Yasuo
This paper proposes a novel image representation called a Graphical Gaussian Vector, which is a counterpart of the codebook and local feature matching approaches. In our method, we model the distribution of local features as a Gaussian Markov Random Field (GMRF), which can efficiently represent the spatial relationship among local features. We consider the parameters of the GMRF as a feature vector of the image. Using concepts of information geometry, proper parameters and a metric can be obtained from the GMRF. Finally, we define a new image feature by embedding the metric into the parameters, which can be directly applied to scalable linear classifiers. Our method obtains superior performance over the state-of-the-art methods on standard object recognition datasets and comparable performance on the scene dataset. As the proposed method simply calculates the local auto-correlations of local features, it achieves both high classification accuracy and high efficiency.
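As a rough illustration of the "local auto-correlations of local features" mentioned above, the sketch below computes the mean descriptor and auto-correlations with right and lower grid neighbors; the grid layout, the single-offset neighborhood, and the function name are assumptions, and the paper's GMRF parameterization and information-geometric embedding go beyond this.

import numpy as np

def local_autocorrelation_features(desc_grid):
    # desc_grid: array of shape (H, W, D), one D-dimensional local descriptor per grid cell
    H, W, D = desc_grid.shape
    mean = desc_grid.reshape(-1, D).mean(axis=0)
    # auto-correlations with the right and lower neighbors (assumed neighborhood)
    right = np.einsum('ijd,ije->de', desc_grid[:, :-1], desc_grid[:, 1:]) / (H * (W - 1))
    down = np.einsum('ijd,ije->de', desc_grid[:-1, :], desc_grid[1:, :]) / ((H - 1) * W)
    return np.concatenate([mean, right.ravel(), down.ravel()])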
Hyperplane Arrangements and Locality-Sensitive Hashing with Lift
Locality-sensitive hashing converts high-dimensional feature vectors, such as image and speech features, into bit arrays and allows high-speed similarity calculation with the Hamming distance. One such hashing scheme maps feature vectors to bit arrays depending on the signs of the inner products between the feature vectors and the normal vectors of hyperplanes placed in the feature space. This hashing can be seen as a discretization of the feature space by hyperplanes. If labels for data are given, one can determine the hyperplanes by using learning algorithms. However, many proposed learning methods do not consider the hyperplanes' offsets. Ignoring the offsets decreases the number of partitioned regions and weakens the correlation between Hamming distances and Euclidean distances. In this paper, we propose a lift map that converts learning algorithms that ignore the offsets into ones that take them into account. With this lift, learning methods without offsets yield discretizations of the space as if the offsets had been taken into account. We evaluated the proposed method on several high-dimensional feature data sets and studied the relationship between the statistical characteristics of the data, the number of hyperplanes, and the effect of the proposed method.
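The sign-based hashing described above, and one common way to picture a lift (appending a constant coordinate so that offset hyperplanes become hyperplanes through the origin in the lifted space), can be sketched as follows in Python; the paper's lift map acts on learning algorithms and is more specific than this toy illustration.

import numpy as np

def hash_bits(X, normals, offsets):
    # bit codes from the signs of affine hyperplane evaluations: sign(X w^T + b)
    return (X @ normals.T + offsets > 0).astype(np.uint8)

def lift(X):
    # append a constant coordinate to each feature vector
    return np.hstack([X, np.ones((X.shape[0], 1))])

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))            # 5 feature vectors in 16 dimensions
W = rng.normal(size=(8, 16))            # 8 hyperplane normals
b = rng.normal(size=8)                  # 8 offsets
W_lifted = np.hstack([W, b[:, None]])   # offsets absorbed into the lifted normals
assert np.array_equal(hash_bits(X, W, b),
                      hash_bits(lift(X), W_lifted, np.zeros(8)))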
Learning Mixtures of Submodular Shells with Application to Document Summarization
We introduce a method to learn a mixture of submodular "shells" in a large-margin setting. A submodular shell is an abstract submodular function that can be instantiated with a ground set and a set of parameters to produce a submodular function. A mixture of such shells can then also be so instantiated to produce a more complex submodular function. What our algorithm learns are the mixture weights over such shells. We provide a risk bound guarantee when learning in a large-margin structured-prediction setting using a projected subgradient method when only approximate submodular optimization is possible (such as with submodular function maximization). We apply this method to the problem of multi-document summarization and produce the best results reported so far on the widely used NIST DUC-05 through DUC-07 document summarization corpora.
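A generic sketch of large-margin learning of mixture weights with a projected subgradient method is given below; loss_aug_infer and counts are hypothetical stand-ins for the paper's approximate loss-augmented submodular maximization and shell feature counts, and projection onto the nonnegative orthant is just one simple choice of feasible set.

import numpy as np

def learn_mixture_weights(examples, loss_aug_infer, counts, dim, epochs=20, eta=0.1):
    # examples: list of (ground_set, gold_output) pairs
    w = np.zeros(dim)
    for t in range(epochs):
        g = np.zeros(dim)
        for V, y_gold in examples:
            y_hat = loss_aug_infer(w, V, y_gold)        # approximate loss-augmented maximization
            g += counts(V, y_hat) - counts(V, y_gold)   # subgradient of the hinge term
        w = w - eta / np.sqrt(t + 1) * g                # decaying step size
        w = np.maximum(w, 0.0)                          # project onto nonnegative mixture weights
    return w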
Structured Prediction Cascades
Weiss, David | Sapp, Benjamin | Taskar, Ben
Structured prediction tasks pose a fundamental trade-off between the need for model complexity to increase predictive power and the limited computational resources for inference in the exponentially-sized output spaces such models require. We formulate and develop the Structured Prediction Cascade architecture: a sequence of increasingly complex models that progressively filter the space of possible outputs. The key principle of our approach is that each model in the cascade is optimized to accurately filter and refine the structured output state space of the next model, speeding up both learning and inference in the next layer of the cascade. We learn cascades by optimizing a novel convex loss function that controls the trade-off between the filtering efficiency and the accuracy of the cascade, and provide generalization bounds for both accuracy and efficiency. We also extend our approach to intractable models using tree-decomposition ensembles, and provide algorithms and theory for this setting. We evaluate our approach on several large-scale problems, achieving state-of-the-art performance in handwriting recognition and human pose recognition. We find that structured prediction cascades allow tremendous speedups and the use of previously intractable features and models in both settings.
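The filtering step of a cascade can be illustrated with the short Python sketch below for a sequence-labeling task; the max_marginal and threshold_fn callables are hypothetical placeholders, and the convex training objective and generalization analysis are as described in the paper.

def run_cascade(x, models, initial_states):
    # models: list of (max_marginal, threshold_fn) pairs, ordered simplest to most complex
    allowed = [set(initial_states) for _ in x]
    for max_marginal, threshold_fn in models:
        scores = [{s: max_marginal(x, i, s, allowed) for s in allowed[i]}
                  for i in range(len(x))]
        for i in range(len(x)):
            tau = threshold_fn(scores[i])               # e.g., a mean-of-max-marginals threshold
            allowed[i] = {s for s, v in scores[i].items() if v >= tau}
    return allowed   # the final, most complex model decodes over the surviving states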
Ontological Smoothing for Relation Extraction with Minimal Supervision
Zhang, Congle (University of Washington) | Hoffmann, Raphael (University of Washington) | Weld, Daniel Sabey (University of Washington)
Relation extraction, the process of converting natural language text into structured knowledge, is increasingly important. Most successful techniques use supervised machine learning to generate extractors from sentences that have been manually labeled with the relations' arguments. Unfortunately, these methods require numerous training examples, which are expensive and time-consuming to produce. This paper presents ontological smoothing, a semi-supervised technique that learns extractors for a set of minimally-labeled relations. Ontological smoothing has three phases. First, it generates a mapping between the target relations and a background knowledge-base. Second, it uses distant supervision to heuristically generate new training examples for the target relations. Finally, it learns an extractor from a combination of the original and newly-generated examples. Experiments on 65 relations across three target domains show that ontological smoothing can dramatically improve precision and recall, even rivaling fully supervised performance in many cases.