AITopics | Inductive Learning

Collaborating Authors

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

News Overviews Instructional Materials AI-Alerts Classics

Fast classification using sparse decision DAGs

Benbouzid, Djalel, Busa-Fekete, Robert, Kegl, Balazs

arXiv.org Machine LearningJun-27-2012

In this paper we propose an algorithm that builds sparse decision DAGs (directed acyclic graphs) from a list of base classifiers provided by an external learning method such as AdaBoost. The basic idea is to cast the DAG design task as a Markov decision process. Each instance can decide to use or to skip each base classifier, based on the current state of the classifier being built. The result is a sparse decision DAG where the base classifiers are selected in a data-dependent way. The method has a single hyperparameter with a clear semantics of controlling the accuracy/speed trade-off. The algorithm is competitive with state-of-the-art cascade detectors on three object-detection benchmarks, and it clearly outperforms them when there is a small number of base classifiers. Unlike cascades, it is also readily applicable for multi-class classification. Using the multi-class setup, we show on a benchmark web page ranking data set that we can significantly improve the decision speed without harming the performance of the ranker.

classifier, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

1206.6387

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

Transductive Classification Methods for Mixed Graphs

Sellamanickam, Sundararajan, Selvaraj, Sathiya Keerthi

arXiv.org Machine LearningJun-26-2012

In this paper we provide a principled approach to solve a transductive classification problem involving a similar graph (edges tend to connect nodes with same labels) and a dissimilar graph (edges tend to connect nodes with opposing labels). Most of the existing methods, e.g., Information Regularization (IR), Weighted vote Relational Neighbor classifier (WvRN) etc, assume that the given graph is only a similar graph. We extend the IR and WvRN methods to deal with mixed graphs. We evaluate the proposed extensions on several benchmark datasets as well as two real world datasets and demonstrate the usefulness of our ideas. Categories and Subject Descriptors: I.5[Pattern Recognition] Design Methodology - Classifier design and evaluation General Terms: Algorithms, Experimentation Keywords: Classification, Graph based semi-supervised learning, Transductive learning, Mixed graphs

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Machine Learning

1206.6015

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)

Add feedback

Bayesian Active Distance Metric Learning

Yang, Liu, Jin, Rong, Sukthankar, Rahul

arXiv.org Machine LearningJun-20-2012

Distance metric learning is an important component for many tasks, such as statistical classification and content-based image retrieval. Existing approaches for learning distance metrics from pairwise constraints typically suffer from two major problems. First, most algorithms only offer point estimation of the distance metric and can therefore be unreliable when the number of training examples is small. Second, since these algorithms generally select their training examples at random, they can be inefficient if labeling effort is limited. This paper presents a Bayesian framework for distance metric learning that estimates a posterior distribution for the distance metric from labeled pairwise constraints. We describe an efficient algorithm based on the variational method for the proposed Bayesian approach. Furthermore, we apply the proposed Bayesian framework to active distance metric learning by selecting those unlabeled example pairs with the greatest uncertainty in relative distance. Experiments in classification demonstrate that the proposed framework achieves higher classification accuracy and identifies more informative training examples than the non-Bayesian approach and state-of-the-art distance metric learning algorithms.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Machine Learning

1206.5283

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Shift-Invariance Sparse Coding for Audio Classification

Grosse, Roger, Raina, Rajat, Kwong, Helen, Ng, Andrew Y.

arXiv.org Machine LearningJun-20-2012

Sparse coding is an unsupervised learning algorithm that learns a succinct high-level representation of the inputs given only unlabeled data; it represents each input as a sparse linear combination of a set of basis functions. Originally applied to modeling the human visual cortex, sparse coding has also been shown to be useful for self-taught learning, in which the goal is to solve a supervised classification task given access to additional unlabeled data drawn from different classes than that in the supervised learning problem. Shift-invariant sparse coding (SISC) is an extension of sparse coding which reconstructs a (usually time-series) input using all of the basis functions in all possible shifts. In this paper, we present an efficient algorithm for learning SISC bases. Our method is based on iteratively solving two large convex optimization problems: The first, which computes the linear coefficients, is an L1-regularized linear least squares problem with potentially hundreds of thousands of variables. Existing methods typically use a heuristic to select a small subset of the variables to optimize, but we present a way to efficiently compute the exact solution. The second, which solves for bases, is a constrained linear least squares problem. By optimizing over complex-valued variables in the Fourier domain, we reduce the coupling between the different variables, allowing the problem to be solved efficiently. We show that SISC's learned high-level representations of speech and music provide useful features for classification tasks within those domains. When applied to classification, under certain conditions the learned features outperform state of the art spectral and cepstral features.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Machine Learning

1206.5241

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry:

Education (0.48)
Media (0.46)
Leisure & Entertainment (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.90)

Add feedback

Analysis of Semi-Supervised Learning with the Yarowsky Algorithm

Haffari, Gholam Reza, Sarkar, Anoop

arXiv.org Machine LearningJun-20-2012

The Yarowsky algorithm is a rule-based semi-supervised learning algorithm that has been successfully applied to some problems in computational linguistics. The algorithm was not mathematically well understood until (Abney 2004) which analyzed some specific variants of the algorithm, and also proposed some new algorithms for bootstrapping. In this paper, we extend Abney's work and show that some of his proposed algorithms actually optimize (an upper-bound on) an objective function based on a new definition of cross-entropy which is based on a particular instantiation of the Bregman distance between probability distributions. Moreover, we suggest some new algorithms for rule-based semi-supervised learning and show connections with harmonic functions and minimum multi-way cuts in graph-based semi-supervised learning.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Machine Learning

1206.524

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Learning the Experts for Online Sequence Prediction

Eban, Elad, Birnbaum, Aharon, Shalev-Shwartz, Shai, Globerson, Amir

arXiv.org Artificial IntelligenceJun-18-2012

Online sequence prediction is the problem of predicting the next element of a sequence given previous elements. This problem has been extensively studied in the context of individual sequence prediction, where no prior assumptions are made on the origin of the sequence. Individual sequence prediction algorithms work quite well for long sequences, where the algorithm has enough time to learn the temporal structure of the sequence. However, they might give poor predictions for short sequences. A possible remedy is to rely on the general model of prediction with expert advice, where the learner has access to a set of $r$ experts, each of which makes its own predictions on the sequence. It is well known that it is possible to predict almost as well as the best expert if the sequence length is order of $\log(r)$. But, without firm prior knowledge on the problem, it is not clear how to choose a small set of {\em good} experts. In this paper we describe and analyze a new algorithm that learns a good set of experts using a training set of previously observed sequences. We demonstrate the merits of our approach by applying it on the task of click prediction on the web.

artificial intelligence, machine learning, sequence, (18 more...)

arXiv.org Artificial Intelligence

1206.4604

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

Total Variation and Euler's Elastica for Supervised Learning

Lin, Tong, Xue, Hanlin, Wang, Ling, Zha, Hongbin

arXiv.org Machine LearningJun-18-2012

In recent years, total variation (TV) and Euler's elastica (EE) have been successfully applied to image processing tasks such as denoising and inpainting. This paper investigates how to extend TV and EE to the supervised learning settings on high dimensional data. The supervised learning problem can be formulated as an energy functional minimization under Tikhonov regularization scheme, where the energy is composed of a squared loss and a total variation smoothing (or Euler's elastica smoothing). Its solution via variational principles leads to an Euler-Lagrange PDE. However, the PDE is always high-dimensional and cannot be directly solved by common methods. Instead, radial basis functions are utilized to approximate the target function, reducing the problem to finding the linear coefficients of basis functions. We apply the proposed methods to supervised learning tasks (including binary classification, multi-class classification, and regression) on benchmark data sets. Extensive experiments have demonstrated promising results of the proposed methods.

artificial intelligence, machine learning, variation, (13 more...)

arXiv.org Machine Learning

1206.4641

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Bayesian Out-Trees

Jebara, Tony S.

arXiv.org Machine LearningJun-13-2012

A Bayesian treatment of latent directed graph structure for non-iid data is provided where each child datum is sampled with a directed conditional dependence on a single unknown parent datum. The latent graph structure is assumed to lie in the family of directed out-tree graphs which leads to efficient Bayesian inference. The latent likelihood of the data and its gradients are computable in closed form via Tutte's directed matrix tree theorem using determinants and inverses of the out-Laplacian. This novel likelihood subsumes iid likelihood, is exchangeable and yields efficient unsupervised and semi-supervised learning algorithms. In addition to handling taxonomy and phylogenetic datasets the out-tree assumption performs surprisingly well as a semi-parametric density estimator on standard iid datasets. Experiments with unsupervised and semisupervised learning are shown on various UCI and taxonomy datasets.

artificial intelligence, likelihood, machine learning, (17 more...)

arXiv.org Machine Learning

1206.3269

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)

Add feedback

Challenges and Opportunities in Applied Machine Learning

Brodley, Carla E. (Tufts University) | Rebbapragada, Umaa (Jet Propulsion Laboratory) | Small, Kevin (Tufts Medical Center) | Wallace, Byron (Tufts University)

AI MagazineMay-14-2012

Machine learning research is often conducted in vitro, divorced from motivating practical applications. A researcher might develop a new method for the general task of classification, then assess its utility by comparing its performance (such as accuracy or AUC) to that of existing classification models on publicly available datasets. In terms of advancing machine learning as an academic discipline, this approach has thus far proven quite fruitful. However, it is our view that the most interesting open problems in machine learning are those that arise during its application to real-world problems. We illustrate this point by reviewing two of our interdisciplinary collaborations, both of which have posed unique machine learning problems, providing fertile ground for novel research.

artificial intelligence, brodley, machine learning, (17 more...)

AI Magazine

Country: North America > United States > California (0.93)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Education (0.87)
Information Technology > Security & Privacy (0.68)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)

Add feedback

Alternating Projections for Learning with Expectation Constraints

Bellare, Kedar, Druck, Gregory, McCallum, Andrew

arXiv.org Machine LearningMay-9-2012

We present an objective function for learning with unlabeled data that utilizes auxiliary expectation constraints. We optimize this objective function using a procedure that alternates between information and moment projections. Our method provides an alternate interpretation of the posterior regularization framework (Graca et al., 2008), maintains uncertainty during optimization unlike constraint-driven learning (Chang et al., 2007), and is more efficient than generalized expectation criteria (Mann & McCallum, 2008). Applications of this framework include minimally supervised learning, semisupervised learning, and learning with constraints that are more expressive than the underlying model. In experiments, we demonstrate comparable accuracy to generalized expectation criteria for minimally supervised learning, and use expressive structural constraints to guide semi-supervised learning, providing a 3%-6% improvement over stateof-the-art constraint-driven learning.

artificial intelligence, constraint, machine learning, (16 more...)

arXiv.org Machine Learning

1205.266

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback