AITopics | Supervised Learning

Collaborating Authors

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Learning Kernels for Structured Prediction using Polynomial Kernel Transformations

Tonde, Chetan, Elgammal, Ahmed

arXiv.org Machine LearningJan-7-2016

Learning the kernel functions used in kernel methods has been a vastly explored area in machine learning. It is now widely accepted that to obtain 'good' performance, learning a kernel function is the key challenge. In this work we focus on learning kernel representations for structured regression. We propose use of polynomials expansion of kernels, referred to as Schoenberg transforms and Gegenbaur transforms, which arise from the seminal result of Schoenberg (1938). These kernels can be thought of as polynomial combination of input features in a high dimensional reproducing kernel Hilbert space (RKHS). We learn kernels over input and output for structured data, such that, dependency between kernel features is maximized. We use Hilbert-Schmidt Independence Criterion (HSIC) to measure this. We also give an efficient, matrix decomposition-based algorithm to learn these kernel transformations, and demonstrate state-of-the-art results on several real-world datasets.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Machine Learning

1601.01411

Country: North America > United States > New Jersey > Middlesex County > Piscataway (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.74)

Add feedback

Calibrated Structured Prediction

Kuleshov, Volodymyr, Liang, Percy S.

Neural Information Processing SystemsDec-31-2015

In user-facing applications, displaying calibrated confidence measures---probabilities that correspond to true frequency---can be as important as obtaining high accuracy. We are interested in calibration for structured prediction problems such as speech recognition, optical character recognition, and medical diagnosis. Structured prediction presents new challenges for calibration: the output space is large, and users may issue many types of probability queries (e.g., marginals) on the structured output. We extend the notion of calibration so as to handle various subtleties pertaining to the structured setting, and then provide a simple recalibration method that trains a binary classifier to predict probabilities of interest. We explore a range of features appropriate for structured recalibration, and demonstrate their efficacy on three real-world datasets.

artificial intelligence, machine learning, optical character recognition, (18 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

Predtron: A Family of Online Algorithms for General Prediction Problems

Jain, Prateek, Natarajan, Nagarajan, Tewari, Ambuj

Neural Information Processing SystemsDec-31-2015

Modern prediction problems arising in multilabel learning and learning to rank pose unique challenges to the classical theory of supervised learning. These problems have large prediction and label spaces of a combinatorial nature and involve sophisticated loss functions. We offer a general framework to derive mistake driven online algorithms and associated loss bounds. The key ingredients in our framework are a general loss function, a general vector space representation of predictions, and a notion of margin with respect to a general norm. Our general algorithm, Predtron, yields the perceptron algorithm and its variants when instantiated on classic problems such as binary classification, multiclass classification, ordinal regression, and multilabel classification. For multilabel ranking and subset ranking, we derive novel algorithms, notions of margins, and loss bounds. A simulation study confirms the behavior predicted by our bounds and demonstrates the flexibility of the design choices in our framework.

artificial intelligence, inductive learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Add feedback

Neural Network Matrix Factorization

Dziugaite, Gintare Karolina, Roy, Daniel M.

arXiv.org Machine LearningDec-14-2015

Data often comes in the form of an array or matrix. Matrix factorization techniques attempt to recover missing or corrupted entries by assuming that the matrix can be written as the product of two low-rank matrices. In other words, matrix factorization approximates the entries of the matrix by a simple, fixed function---namely, the inner product---acting on the latent feature vectors for the corresponding row and column. Here we consider replacing the inner product by an arbitrary function that we learn from the data at the same time as we learn the latent feature vectors. In particular, we replace the inner product by a multi-layer feed-forward neural network, and learn by alternating between optimizing the network for fixed latent features, and optimizing the latent features for a fixed network. The resulting approach---which we call neural network matrix factorization or NNMF, for short---dominates standard low-rank techniques on a suite of benchmark but is dominated by some recent proposals that take advantage of the graph features. Given the vast range of architectures, activation functions, regularizers, and optimization techniques that could be used within the NNMF framework, it seems likely the true potential of the approach has yet to be reached.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

1511.06443

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.56)

Add feedback

A Generative Model of Words and Relationships from Multiple Sources

Hyland, Stephanie L., Karaletsos, Theofanis, Rätsch, Gunnar

arXiv.org Machine LearningDec-3-2015

Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this requirement may not be met due to difficulties in obtaining a large corpus, or the limited range of expression in average use. Such domains may encode prior knowledge about entities in a knowledge base or ontology. We propose a generative model which integrates evidence from diverse data sources, enabling the sharing of semantic information. We achieve this by generalising the concept of co-occurrence from distributional semantics to include other relationships between entities or words, which we model as affine transformations on the embedding space. We demonstrate the effectiveness of this approach by outperforming recent models on a link prediction task and demonstrating its ability to profit from partially or fully unobserved data training labels. We further demonstrate the usefulness of learning from different data sources with overlapping vocabularies.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1510.00259

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(2 more...)

Add feedback

Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization

Xu, Baohan, Fu, Yanwei, Jiang, Yu-Gang, Li, Boyang, Sigal, Leonid

arXiv.org Artificial IntelligenceNov-15-2015

Rapid development of mobile devices has led to an explosive growth of user-generated images and videos, which creates a demand for computational understanding of visual media content. In addition to recognition of objective content, such as objects and scenes, an important dimension of video content analysis is the understanding of emotional or affective content, i.e. estimating the emotional impact of the video on a viewer. Emotional content can strongly resonate with viewers and plays a crucial role in the videowatching experience. Some successes have been achieved with the use of deep-learning architectures trained for text at both sentence-and document-level [40] or image sentiment analysis [8]. However, the ability to understand emotions from video, to a large extent, remains an unsolved problem. Analysis of emotional content in video has many realworld applications. Video recommendation services can benefit from matching user interests with the emotions of video content and prediction of interestingness [20], [21], [36], leading to improved user satisfaction. Better understanding of video emotions may enable advertising that is consistent with the main video's mood and help avoid social inappropriateness such as placing a funny advertisement alongside a funeral video. Video summarization [68] and coding [60] can also benefit from understanding emotions, since an accurate summary should keep the emotional content conveyed by the original video.

dataset, emotion, video, (15 more...)

arXiv.org Artificial Intelligence

1511.04798

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.93)
Overview (0.93)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology (0.67)
Education > Educational Setting (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(7 more...)

Add feedback

Tropel: Crowdsourcing Detectors with Minimal Training

Patterson, Genevieve (Brown University) | Horn, Grant Van (California Institute of Technology) | Belongie, Serge (Cornell University and Cornell Tech) | Perona, Pietro (California Institue of Technology) | Hays, James (Brown University)

AAAI ConferencesNov-1-2015

This paper introduces the Tropel system which enables non-technical users to create arbitrary visual detectors without first annotating a training set. Our primary contribution is a crowd active learning pipeline that is seeded with only a single positive example and an unlabeled set of training images. We examine the crowd's ability to train visual detectors given severely limited training themselves. This paper presents a series of experiments that reveal the relationship between worker training, worker consensus and the average precision of detectors trained by crowd-in-the-loop active learning. In order to verify the efficacy of our system, we train detectors for bird species that work nearly as well as those trained on the exhaustively labeled CUB 200 dataset at significantly lower cost and with little effort from the end user. To further illustrate the usefulness of our pipeline, we demonstrate qualitative results on unlabeled datasets containing fashion images and street-level photographs of Paris.

artificial intelligence, detector, machine learning, (19 more...)

AAAI Conferences

Third AAAI Conference on Human Computation and Crowdsourcing

Country:

North America > United States > California (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Kentucky (0.04)
(3 more...)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.71)
Information Technology > Communications > Social Media > Crowdsourcing (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

Modeling Temporal Crowd Work Quality with Limited Supervision

Jung, Hyun Joon (University of Texas at Austin) | Lease, Matthew (University of Texas at Austin)

AAAI ConferencesNov-1-2015

While recent work has shown that a worker’s performance can be more accurately modeled by temporal correlation in task performance, a fundamental challenge remains in the need for expert gold labels to evaluate a worker’s performance. To solve this problem, we explore two methods of utilizing limited gold labels, initial training and periodic updating. Furthermore, we present a novel way of learning a prediction model in the absence of gold labels with uncertaintyaware learning and soft-label updating. Our experiment with a real crowdsourcing dataset demonstrates that periodic updating tends to show better performance than initial training when the number of gold labels are very limited (< 25).

data mining, gold label, machine learning, (16 more...)

AAAI Conferences

Third AAAI Conference on Human Computation and Crowdsourcing

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.72)
Information Technology > Communications (0.68)
Information Technology > Data Science > Data Mining (0.68)
(2 more...)

Add feedback

A Review of Relational Machine Learning for Knowledge Graphs

Nickel, Maximilian, Murphy, Kevin, Tresp, Volker, Gabrilovich, Evgeniy

arXiv.org Machine LearningSep-28-2015

In this paper, we provide a review of how such statistical models can be "trained" on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's Knowledge Vault project as an example of such combination.

data mining, knowledge management, machine learning, (24 more...)

arXiv.org Machine Learning

doi: 10.1109/JPROC.2015.2483592

1503.00759

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > New York > New York County > New York City (0.04)
(23 more...)

Genre:

Research Report (1.00)
Overview (0.87)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Education (1.00)
Health & Medicine (0.92)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
(11 more...)

Add feedback

IllinoisSL: A JAVA Library for Structured Prediction

Chang, Kai-Wei, Upadhyay, Shyam, Chang, Ming-Wei, Srikumar, Vivek, Roth, Dan

arXiv.org Machine LearningSep-23-2015

IllinoisSL is a Java library for learning structured prediction models. It supports structured Support Vector Machines and structured Perceptron. The library consists of a core learning module and several applications, which can be executed from command-lines. Documentation is provided to guide users. In Comparison to other structured learning libraries, IllinoisSL is efficient, general, and easy to use.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Machine Learning

1509.07179

Country:

North America > United States > California (0.15)
North America > United States > Utah (0.05)
North America > United States > Illinois > Champaign County > Champaign (0.05)
North America > United States > Washington > King County > Redmond (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Industry: Government (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Add feedback