Computational Learning Theory
Active and passive learning of linear separators under log-concave distributions
Balcan, Maria Florina, Long, Philip M.
We provide new results concerning label-efficient, polynomial-time, passive and active learning of linear separators. We prove that active learning provides an exponential improvement over PAC (passive) learning of homogeneous linear separators under nearly log-concave distributions. Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sample complexity for such problems. This resolves an open question concerning the sample complexity of efficient PAC algorithms under the uniform distribution in the unit ball. Moreover, it provides the first bound for a polynomial-time PAC algorithm that is tight for an interesting infinite class of hypothesis functions under a general and natural class of data distributions, providing significant progress towards a longstanding open question. We also provide new bounds for active and passive learning in the case that the data might not be linearly separable, both in the agnostic case and under the Tsybakov low-noise condition. To derive our results, we provide new structural results for (nearly) log-concave distributions, which might be of independent interest as well.
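As a hedged illustration of the style of algorithm this line of work studies (not the authors' exact procedure), the sketch below runs a generic margin-based active learner for a homogeneous linear separator under a log-concave (Gaussian) marginal: it requests labels only for points that fall inside a shrinking band around the current hypothesis's decision boundary. The target vector `w_star`, the band schedule, and the averaging update are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)           # unknown target separator (illustrative)

def label(X):
    return np.sign(X @ w_star)             # noiseless labels from w_star

# Initial estimate from a small labeled sample.
X0 = rng.normal(size=(50, d))              # Gaussian marginal (log-concave)
w = (label(X0)[:, None] * X0).mean(axis=0)
w /= np.linalg.norm(w)

labels_used = 50
for k in range(1, 11):
    band = 1.0 / 2 ** k                    # shrinking margin band (illustrative schedule)
    X = rng.normal(size=(5000, d))
    near = np.abs(X @ w) <= band           # query labels only near the current boundary
    Xq = X[near][:200]                     # cap the number of label queries per round
    if len(Xq) == 0:
        continue
    yq = label(Xq)
    labels_used += len(Xq)
    w_new = (yq[:, None] * Xq).mean(axis=0)  # simple averaging update (stand-in for ERM in the band)
    w = w + w_new
    w /= np.linalg.norm(w)

err_angle = np.arccos(np.clip(w @ w_star, -1, 1))
print(f"labels used: {labels_used}, angle to target: {err_angle:.3f} rad")
```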
Parsimonious module inference in large networks
We investigate the detectability of modules in large networks when the number of modules is not known in advance. We employ the minimum description length (MDL) principle, which seeks to minimize the total amount of information required to describe the network and to avoid overfitting. According to this criterion, we obtain general bounds on the detectability of any prescribed block structure, given the number of nodes and edges in the sampled network. We also show that the maximum number of detectable blocks scales as $\sqrt{N}$, where $N$ is the number of nodes in the network, for a fixed average degree $\langle k \rangle$.
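A hedged toy illustration of MDL-style model selection for block structure (a simplified description length, not the paper's exact expression): given an adjacency matrix and a candidate partition, the function below adds the negative log-likelihood of a Bernoulli stochastic blockmodel to a crude count of model bits, and one keeps the partition with the smallest total.

```python
import numpy as np

def description_length(A, partition):
    """Simplified two-part code length: data bits under a Bernoulli SBM
    plus a crude estimate of the bits needed to describe the model itself."""
    n = len(A)
    blocks = np.unique(partition)
    B = len(blocks)
    data_bits = 0.0
    for r in blocks:
        for s in blocks:
            if r > s:
                continue
            idx_r = np.where(partition == r)[0]
            idx_s = np.where(partition == s)[0]
            sub = A[np.ix_(idx_r, idx_s)]
            if r == s:
                pairs = len(idx_r) * (len(idx_r) - 1) / 2
                edges = np.triu(sub, 1).sum()
            else:
                pairs = len(idx_r) * len(idx_s)
                edges = sub.sum()
            if pairs == 0:
                continue
            p = min(max(edges / pairs, 1e-12), 1 - 1e-12)
            data_bits += -(edges * np.log2(p) + (pairs - edges) * np.log2(1 - p))
    model_bits = B * (B + 1) / 2 * np.log2(n) + n * np.log2(B)  # block matrix + partition (rough)
    return data_bits + model_bits

# Usage: the planted 2-block partition should typically beat the 1-block (null) partition.
rng = np.random.default_rng(1)
n = 60
z = np.repeat([0, 1], n // 2)
P = np.where(z[:, None] == z[None, :], 0.3, 0.05)
A = (rng.random((n, n)) < P).astype(int)
A = np.triu(A, 1); A = A + A.T
print(description_length(A, z), description_length(A, np.zeros(n, dtype=int)))
```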
Learning Theory Approach to Minimum Error Entropy Criterion
Hu, Ting, Fan, Jun, Wu, Qiang, Zhou, Ding-Xuan
We consider the minimum error entropy (MEE) criterion and an empirical risk minimization learning algorithm in a regression setting. A learning theory approach is presented for this MEE algorithm and explicit error bounds are provided in terms of the approximation ability and capacity of the involved hypothesis space when the MEE scaling parameter is large. A novel asymptotic analysis is conducted for the generalization error associated with Renyi's entropy and a Parzen window function, to overcome technical difficulties arising from the essential differences between the classical least squares problems and the MEE setting. A semi-norm and the associated symmetrized least squares error are introduced; these are related to some ranking algorithms.
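For concreteness, a common empirical form of the MEE criterion (hedged: this is the standard Renyi's-quadratic-entropy estimator with a Parzen window used in the MEE literature, not necessarily the exact objective analyzed in the paper) minimizes the estimated entropy of the regression errors $e_i = y_i - f(x_i)$, with the error density smoothed by a Gaussian window of bandwidth $h$ (the scaling parameter).

```python
import numpy as np

def mee_risk(errors, h):
    """Empirical Renyi quadratic entropy of the errors with a Gaussian Parzen window:
    -log( (1/n^2) * sum_{i,j} G_h(e_i - e_j) ).  Smaller means more concentrated errors."""
    e = np.asarray(errors, dtype=float)
    diff = e[:, None] - e[None, :]
    kernel = np.exp(-diff ** 2 / (2 * h ** 2)) / (np.sqrt(2 * np.pi) * h)
    return -np.log(kernel.mean())

# Usage: compare the MEE risk of a near-correct and a misspecified linear fit.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 2.0 * x + 0.1 * rng.normal(size=200)
print(mee_risk(y - 2.0 * x, h=1.0))   # near-correct model: low entropy of errors
print(mee_risk(y - 0.5 * x, h=1.0))   # misspecified model: higher entropy of errors
```

Note that this criterion is invariant to a constant shift of the errors, one of the essential differences from least squares alluded to above.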
Efficient Partial Order CDCL Using Assertion Level Choice Heuristics
Monnet, Anthony, Villemaire, Roger
We previously designed Partial Order Conflict Driven Clause Learning (PO-CDCL), a variation of the satisfiability-solving CDCL algorithm with a partial order on decision levels, and showed that it can speed up solving on problems with high independence between decision levels. In this paper, we analyze more thoroughly the reasons for the efficiency of PO-CDCL. Of particular importance is that the partial order introduces several candidates for the assertion level. By evaluating different heuristics for this choice, we show that the assertion level selection has an important impact on solving and that a carefully designed heuristic can significantly improve performance on relevant benchmarks.
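As an illustration of the choice the paper studies (hedged: the data structures and heuristics below are simplified stand-ins, not the authors' implementation), after conflict analysis a learned clause contains literals assigned at various decision levels; with a partial order on levels there can be several levels to which the solver may backjump while keeping the clause asserting, and a heuristic picks one of them.

```python
# Simplified sketch: choosing an assertion level among several candidates.
# Assumptions (hypothetical, for illustration): `clause_levels` is the multiset of
# decision levels of the learned clause's literals, `conflict_level` is the level of
# the unique implication point literal, and `depends_on(a, b)` encodes the partial
# order (True if level a depends on level b).

def candidate_assertion_levels(clause_levels, conflict_level, depends_on):
    others = [l for l in clause_levels if l != conflict_level]
    # Treat as a candidate any level of the clause on which no other literal's
    # level depends: backjumping there keeps the other literals falsified.
    return [l for l in set(others)
            if not any(depends_on(m, l) for m in others if m != l)]

def choose_assertion_level(candidates, activity, heuristic="lowest"):
    if heuristic == "lowest":          # backjump as far as possible
        return min(candidates)
    if heuristic == "highest":         # stay as close to the conflict as possible
        return max(candidates)
    if heuristic == "most_active":     # prefer levels whose variables were recently active
        return max(candidates, key=lambda l: activity.get(l, 0.0))
    raise ValueError(heuristic)

# Usage with a toy partial order in which only level 3 depends on level 1.
deps = {(3, 1)}
depends_on = lambda a, b: (a, b) in deps
cands = candidate_assertion_levels([1, 3, 4, 5], conflict_level=5, depends_on=depends_on)
print(cands, choose_assertion_level(cands, activity={3: 2.0, 4: 0.5}, heuristic="most_active"))
```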
Minimum Encoding Approaches for Predictive Modeling
Grunwald, Peter D, Kontkanen, Petri, Myllymaki, Petri, Silander, Tomi, Tirri, Henry
We analyze differences between two information-theoretically motivated approaches to statistical inference and model selection: the Minimum Description Length (MDL) principle, and the Minimum Message Length (MML) principle. Based on this analysis, we present two revised versions of MML: a pointwise estimator which gives the MML-optimal single parameter model, and a volumewise estimator which gives the MML-optimal region in the parameter space. Our empirical results suggest that with small data sets, the MDL approach yields more accurate predictions than the MML estimators. The empirical results also demonstrate that the revised MML estimators introduced here perform better than the original MML estimator suggested by Wallace and Freeman.
Multiclass Learning Approaches: A Theoretical Comparison with Implications
Daniely, Amit, Sabato, Sivan, Shalev-Shwartz, Shai
We theoretically analyze and compare the following five popular multiclass classification methods: One vs. All, All Pairs, Tree-based classifiers, Error Correcting Output Codes (ECOC) with randomly generated code matrices, and Multiclass SVM. In the first four methods, the classification is based on a reduction to binary classification. We consider the case where the binary classifier comes from a class of VC dimension $d$, and in particular from the class of halfspaces over $\mathbb{R}^d$. We analyze both the estimation error and the approximation error of these methods. Our analysis reveals interesting conclusions of practical relevance regarding the success of the different approaches under various conditions. Our proof technique employs tools from VC theory to analyze the \emph{approximation error} of hypothesis classes. This is in sharp contrast to most, if not all, previous uses of VC theory, which only deal with estimation error.
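To make the reductions concrete, here is a minimal sketch (assuming a scikit-learn-style binary learner as a stand-in for halfspaces; the code length and data are illustrative) of two of the reductions compared above: One vs. All and ECOC with a randomly generated code matrix.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression   # stand-in binary learner

def train_one_vs_all(X, y, n_classes):
    # One binary problem per class: class k vs. the rest.
    return [LogisticRegression(max_iter=1000).fit(X, (y == k).astype(int))
            for k in range(n_classes)]

def predict_one_vs_all(clfs, X):
    scores = np.column_stack([c.decision_function(X) for c in clfs])
    return scores.argmax(axis=1)

def random_code_matrix(n_classes, code_len, rng):
    cols = []
    while len(cols) < code_len:
        col = rng.choice([-1, 1], size=n_classes)
        if len(np.unique(col)) == 2:                   # keep only columns that split the classes
            cols.append(col)
    return np.stack(cols, axis=1)                      # rows = classes, columns = binary problems

def train_ecoc(X, y, M):
    return [LogisticRegression(max_iter=1000).fit(X, (M[y, j] == 1).astype(int))
            for j in range(M.shape[1])]

def predict_ecoc(M, clfs, X):
    signs = np.column_stack([np.sign(c.decision_function(X)) for c in clfs])
    return (signs @ M.T).argmax(axis=1)                # nearest code word (maximum agreement)

# Usage on toy Gaussian class clusters in R^5.
rng = np.random.default_rng(0)
n_classes, d = 4, 5
centers = rng.normal(scale=3.0, size=(n_classes, d))
y = rng.integers(0, n_classes, size=400)
X = centers[y] + rng.normal(size=(400, d))
ova = train_one_vs_all(X, y, n_classes)
M = random_code_matrix(n_classes, code_len=8, rng=rng)
ecoc = train_ecoc(X, y, M)
print("One vs. All accuracy:", (predict_one_vs_all(ova, X) == y).mean())
print("ECOC accuracy:", (predict_ecoc(M, ecoc, X) == y).mean())
```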
Learning Multiple Tasks using Shared Hypotheses
Crammer, Koby, Mansour, Yishay
In this work we consider a setting where we have a very large number of related tasks with few examples from each individual task. Rather than either learning each task individually (and having a large generalization error) or learning all the tasks together using a single hypothesis (and suffering a potentially large inherent error), we consider learning a small pool of {\em shared hypotheses}. Each task is then mapped to a single hypothesis in the pool (hard association). We derive VC dimension generalization bounds for our model, based on the number of tasks, the number of shared hypotheses, and the VC dimension of the hypothesis class. We conducted experiments on both synthetic problems and sentiment analysis of reviews; the results strongly support our approach.
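A natural way to fit such a model (hedged: an illustrative alternating scheme in the spirit of the hard-association idea, not necessarily the authors' algorithm) is k-means-like: repeatedly assign each task to the pooled hypothesis with the lowest training error on that task, then refit each pooled hypothesis on the union of its assigned tasks' examples. The ridge-regression base learner and pool size below are arbitrary choices.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def task_error(w, X, y):
    return np.mean((X @ w - y) ** 2)

def shared_hypotheses(tasks, pool_size, n_iters=20, seed=0):
    """tasks: list of (X_t, y_t). Returns the pool of weight vectors and the task->hypothesis map."""
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, pool_size, size=len(tasks))     # random initial hard association
    d = tasks[0][0].shape[1]
    pool = [rng.normal(size=d) for _ in range(pool_size)]
    for _ in range(n_iters):
        # Refit each shared hypothesis on all examples of its assigned tasks.
        for k in range(pool_size):
            idx = [t for t in range(len(tasks)) if assign[t] == k]
            if idx:
                Xk = np.vstack([tasks[t][0] for t in idx])
                yk = np.concatenate([tasks[t][1] for t in idx])
                pool[k] = fit_ridge(Xk, yk)
        # Re-assign each task to its best hypothesis (hard association).
        assign = np.array([np.argmin([task_error(w, X, y) for w in pool]) for X, y in tasks])
    return pool, assign

# Usage: many small tasks generated from 3 underlying regression vectors.
rng = np.random.default_rng(1)
true = [rng.normal(size=5) for _ in range(3)]
tasks = []
for t in range(60):                                          # 60 tasks, only 8 examples each
    w = true[t % 3]
    X = rng.normal(size=(8, 5))
    tasks.append((X, X @ w + 0.1 * rng.normal(size=8)))
pool, assign = shared_hypotheses(tasks, pool_size=3)
print(np.bincount(assign))
```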
SMML estimators for 1-dimensional continuous data
A method is given for calculating the strict minimum message length (SMML) estimator for 1-dimensional exponential families with continuous sufficient statistics. A set of $n$ equations is derived that the $n$ cut-points of the SMML estimator must satisfy. These equations can be solved using Newton's method, and this approach is used to produce new results and to replicate results that C. S. Wallace obtained using his boundary rules for the SMML estimator. A rigorous proof is also given that, despite being composed of step functions, the posterior probability corresponding to the SMML estimator is a continuous function of the data.
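Since the abstract only says the cut-point equations can be solved by Newton's method, the sketch below is a generic multivariate Newton iteration (with a numerically estimated Jacobian) that one could apply to any such system $F(c) = 0$; the example system `F` is a hypothetical placeholder with a known solution, not the SMML cut-point equations.

```python
import numpy as np
from scipy.stats import norm

def newton_solve(F, c0, tol=1e-10, max_iter=50, eps=1e-7):
    """Solve F(c) = 0 by Newton's method with a forward-difference Jacobian."""
    c = np.asarray(c0, dtype=float)
    for _ in range(max_iter):
        f = F(c)
        if np.linalg.norm(f) < tol:
            break
        J = np.empty((len(c), len(c)))
        for j in range(len(c)):                     # numerical Jacobian, column by column
            step = np.zeros_like(c); step[j] = eps
            J[:, j] = (F(c + step) - f) / eps
        c = c - np.linalg.solve(J, f)
    return c

# Placeholder system (hypothetical, NOT the SMML equations): cut-points giving equal
# probability mass in each cell under a standard normal, which has a closed form to check.
n = 4
def F(c):
    masses = np.diff(norm.cdf(np.concatenate(([-np.inf], c, [np.inf]))))
    return masses[:-1] - masses[1:]                 # adjacent cells get equal mass
print(newton_solve(F, np.linspace(-1, 1, n)))       # should match norm.ppf([0.2, 0.4, 0.6, 0.8])
```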
Almost-everywhere algorithmic stability and generalization error
We explore in some detail the notion of algorithmic stability as a viable framework for analyzing the generalization error of learning algorithms. We introduce the new notion of training stability of a learning algorithm and show that, in a general setting, it is sufficient for good bounds on generalization error. In the PAC setting, training stability is both necessary and sufficient for learnability. The approach based on training stability makes no reference to VC dimension or VC entropy. There is no need to prove uniform convergence, and generalization error is bounded directly via an extended McDiarmid inequality. As a result it potentially allows us to deal with a broader class of learning algorithms than Empirical Risk Minimization. We also explore the relationships among VC dimension, generalization error, and various notions of stability. Several examples of learning algorithms are considered.
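To make the stability notion concrete (hedged: a minimal empirical check in the spirit of algorithmic stability, not the paper's formal definition of training stability), the sketch below trains a regularized least-squares learner, replaces one training point, retrains, and records how much the predictions change; a stable algorithm keeps this change small, which is what makes McDiarmid-style concentration arguments applicable.

```python
import numpy as np

def train(X, y, lam=1.0):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)   # ridge regression (a stable learner)

rng = np.random.default_rng(0)
n, d = 100, 5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)
X_test = rng.normal(size=(500, d))

w = train(X, y)
changes = []
for i in range(n):
    Xi, yi = X.copy(), y.copy()
    Xi[i] = rng.normal(size=d)                 # replace the i-th training example
    yi[i] = Xi[i] @ w_true + 0.1 * rng.normal()
    wi = train(Xi, yi)
    changes.append(np.max(np.abs(X_test @ w - X_test @ wi)))     # worst-case prediction change

print(f"max change over all single-point replacements: {max(changes):.4f}")
```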
On Information Regularization
Corduneanu, Adrian, Jaakkola, Tommi S.
We formulate a principle for classification with the knowledge of the marginal distribution over the data points (unlabeled data). The principle is cast in terms of Tikhonov-style regularization, where the regularization penalty articulates the way in which the marginal density should constrain otherwise unrestricted conditional distributions. Specifically, the regularization penalty penalizes any information introduced between the examples and labels beyond what is provided by the available labeled examples. The work extends Szummer and Jaakkola's information regularization (NIPS 2002) to multiple dimensions, providing a regularizer independent of the covering of the space used in the derivation. We show in addition how the information regularizer can be used as a measure of complexity of the classification task with unlabeled data and prove a relevant sample-complexity bound. We illustrate the regularization principle in practice by restricting the class of conditional distributions to be logistic regression models and constructing the regularization penalty from a finite set of unlabeled examples.
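A simplified sketch of the kind of penalty described above (hedged: regions are taken as k-nearest-neighbor neighborhoods of unlabeled points with uniform weights, which is an illustrative choice; the paper's region construction and weighting differ): a logistic regression model is fit to a few labeled points while penalizing, within each region of unlabeled points, the mutual information between a point's identity and its predicted label.

```python
import numpy as np
from scipy.optimize import minimize

def p_pos(w, X):
    return 1.0 / (1.0 + np.exp(-(X @ w)))                 # logistic model P(y=1|x)

def info_penalty(w, X_unlab, regions):
    """Average, over regions, of the mutual information between a point drawn
    uniformly from the region and its predicted label."""
    total = 0.0
    for idx in regions:
        p = np.clip(p_pos(w, X_unlab[idx]), 1e-9, 1 - 1e-9)
        P = np.column_stack([1 - p, p])                    # per-point conditionals P(y|x)
        Pbar = P.mean(axis=0)                              # region-averaged label distribution
        total += np.mean(np.sum(P * np.log(P / Pbar), axis=1))
    return total / len(regions)

def objective(w, X_lab, y_lab, X_unlab, regions, lam):
    p = np.clip(p_pos(w, X_lab), 1e-9, 1 - 1e-9)
    nll = -np.mean(y_lab * np.log(p) + (1 - y_lab) * np.log(1 - p))
    return nll + lam * info_penalty(w, X_unlab, regions)

# Usage: two Gaussian clusters, 4 labeled points, 200 unlabeled points.
rng = np.random.default_rng(0)
X_unlab = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
X_lab = np.array([[-2.0, 0.0], [-1.5, 0.5], [2.0, 0.0], [1.5, -0.5]])
y_lab = np.array([0, 0, 1, 1])
# Regions: 10-nearest-neighbor neighborhoods of each unlabeled point (illustrative choice).
D = np.linalg.norm(X_unlab[:, None] - X_unlab[None, :], axis=2)
regions = [np.argsort(D[i])[:10] for i in range(len(X_unlab))]
w = minimize(objective, np.zeros(2), args=(X_lab, y_lab, X_unlab, regions, 1.0)).x
print("learned weights:", w)
```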