AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Local Bandit Approximation for Optimal Learning Problems

Neural Information Processing SystemsApr-6-2023, 18:07:35 GMT

In general, procedures for determining Bayes-optimal adaptive controls for Markov decision processes (MDP's) require a pro(cid:173) hibitive amount of computation-the optimal learning problem is intractable. This paper proposes an approximate approach in which bandit processes are used to model, in a certain "local" sense, a given MDP. Bandit processes constitute an important subclass of MDP's, and have optimal learning strategies (defined in terms of Gittins indices) that can be computed relatively efficiently. Thus, one scheme for achieving approximately-optimal learning for gen(cid:173) eral MDP's proceeds by taking actions suggested by strategies that are optimal with respect to local bandit models.

local bandit approximation, mdp, optimal learning problem

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Tight Bounds for the VC-Dimension of Piecewise Polynomial Networks

Neural Information Processing SystemsApr-6-2023, 17:42:00 GMT

O(ws(s log d log(dqh/ s))) and O(ws((h/ s) log q) log(dqh/ s)) are upper bounds for the VC-dimension of a set of neural networks of units with piecewise polynomial activation functions, where s is the depth of the network, h is the number of hidden units, w is the number of adjustable parameters, q is the maximum of the number of polynomial segments of the activation function, and d is the maximum degree of the polynomials; also n(wslog(dqh/s)) is a lower bound for the VC-dimension of such a network set, which are tight for the cases s 8(h) and s is constant. For the special case q 1, the VC-dimension is 8(ws log d).

piecewise polynomial network, tight bound, vc-dimension, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Optimizing Classifers for Imbalanced Training Sets

Neural Information Processing SystemsApr-6-2023, 17:41:23 GMT

Following recent results [9, 8] showing the importance of the fat(cid:173) shattering dimension in explaining the beneficial effect of a large margin on generalization performance, the current paper investi(cid:173) gates the implications of these results for the case of imbalanced datasets and develops two approaches to setting the threshold. The approaches are incorporated into ThetaBoost, a boosting al(cid:173) gorithm for dealing with unequal loss functions. The performance of ThetaBoost and the two approaches are tested experimentally.

cid, imbalanced training set, optimizing classifer

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.31)

Add feedback

Almost Linear VC Dimension Bounds for Piecewise Polynomial Networks

Neural Information Processing SystemsApr-6-2023, 17:35:07 GMT

We compute upper and lower bounds on the VC dimension of feedforward networks of units with piecewise polynomial activa(cid:173) tion functions. We show that if the number of layers is fixed, then the VC dimension grows as W log W, where W is the number of parameters in the network.

linear vc dimension bound, piecewise polynomial network

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

The Use of MDL to Select among Computational Models of Cognition

Neural Information Processing SystemsApr-6-2023, 17:03:24 GMT

How should we decide among competing explanations of a cognitive process given limited observations? The problem of model selection is at the heart of progress in cognitive science. In this paper, Minimum Description Length (MDL) is introduced as a method for selecting among computational models of cognition. We also show that differential geometry provides an intuitive understanding of what drives model selection in MDL. Finally, adequacy of MDL is demonstrated in two areas of cognitive modeling.

complexity, mdl, model selection, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.38)

Add feedback

Transfer Learning using Kolmogorov Complexity: Basic Theory and Empirical Evaluations

Neural Information Processing SystemsApr-6-2023, 14:48:49 GMT

In transfer learning we aim to solve new problems using fewer examples using information gained from solving related problems. Transfer learning has been successful in practice, and extensive PAC analysis of these methods has been de- veloped. However it is not yet clear how to define relatedness between tasks. This is considered as a major problem as it is conceptually troubling and it makes it unclear how much information to transfer and when and how to transfer it. In this paper we propose to measure the amount of information one task contains about another using conditional Kolmogorov complexity between the tasks.

basic theory and empirical evaluation, information, kolmogorov complexity, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.70)

Add feedback

Human Rademacher Complexity

Neural Information Processing SystemsApr-6-2023, 14:07:55 GMT

We propose to use Rademacher complexity, originally developed in computational learning theory, as a measure of human learning capacity. Rademacher complexity measures a learners ability to fit random data, and can be used to bound the learners true error based on the observed training sample error. We first review the definition of Rademacher complexity and its generalization bound. We then describe a learning the noise" procedure to experimentally measure human Rademacher complexities. The results from empirical studies showed that: (i) human Rademacher complexity can be successfully measured, (ii) the complexity depends on the domain and training sample size in intuitive ways, (iii) human learning respects the generalization bounds, (iv) the bounds can be useful in predicting the danger of overfitting in human learning. Finally, we discuss the potential applications of human Rademacher complexity in cognitive science."

generalization, human rademacher complexity, rademacher complexity

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)

Add feedback

Discrete MDL Predicts in Total Variation

Neural Information Processing SystemsApr-6-2023, 14:02:53 GMT

The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result is completely general. No independence, ergodicity, stationarity, identifiability, or other assumption on the model class need to be made. More formally, we show that for any countable class of models, the distributions selected by MDL (or MAP) asymptotically predict (merge with) the true measure in the class in total variation distance.

countable class, discrete mdl predict, total variation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.70)

Add feedback

Regularized Distance Metric Learning:Theory and Algorithm

Neural Information Processing SystemsApr-6-2023, 13:58:50 GMT

In this paper, we examine the generalization error of regularized distance metric learning. We show that with appropriate constraints, the generalization error of regularized distance metric learning could be independent from the dimensionality, making it suitable for handling high dimensional data. In addition, we present an efficient online learning algorithm for regularized distance metric learning. Our empirical studies with data classification and face recognition show that the proposed algorithm is (i) effective for distance metric learning when compared to the state-of-the-art methods, and (ii) efficient and robust for high dimensional data.

distance metric learning, regularized distance metric learning, theory and algorithm, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.33)

Add feedback

Potential-Based Agnostic Boosting

Neural Information Processing SystemsApr-6-2023, 13:46:32 GMT

We prove strong noise-tolerance properties of a potential-based boosting algorithm, similar to MadaBoost (Domingo and Watanabe, 2000) and SmoothBoost (Servedio, 2003). Our analysis is in the agnostic framework of Kearns, Schapire and Sellie (1994), giving polynomial-time guarantees in presence of arbitrary noise. A remarkable feature of our algorithm is that it can be implemented without reweighting examples, by randomly relabeling them instead. Our boosting theorem gives, as easy corollaries, alternative derivations of two recent non-trivial results in computational learning theory: agnostically learning decision trees (Gopalan et al, 2008) and agnostically learning halfspaces (Kalai et al, 2005). Experiments suggest that the algorithm performs similarly to Madaboost.

agnostically, algorithm

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.69)

Add feedback