Goto

Collaborating Authors

 Computational Learning Theory




Estimating Average-Case Learning Curves Using Bayesian, Statistical Physics and VC Dimension Methods

Neural Information Processing Systems

In this paper we investigate an average-case model of concept learning, and give results that place the popular statistical physics and VC dimension theories of learning curve behavior in a common framework.




A training algorithm for optimum margin classifiers

Classics

Proceedings of the Fifth Annual Workshop on Computational Learning Theory 5: 144-152, 1992.


Learning Time-varying Concepts

Neural Information Processing Systems

This work extends computational learning theory to situations in which concepts vary over time, e.g., system identification of a time-varying plant. We have extended formal definitions of concepts and learning to provide a framework in which an algorithm can track a concept as it evolves over time. Given this framework and focusing on memory-based algorithms, we have derived some PACstyle sample complexity results that determine, for example, when tracking is feasible. We have also used a similar framework and focused on incremental tracking algorithms for which we have derived some bounds on the mistake or error rates for some specific concept classes. 1 INTRODUCTION The goal of our ongoing research is to extend computational learning theory to include concepts that can change or evolve over time. For example, face recognition is complicated by the fact that a persons face changes slowly with age and more quickly with changes in make up, hairstyle, or facial hair.


Learning Time-varying Concepts

Neural Information Processing Systems

This work extends computational learning theory to situations in which concepts vary over time, e.g., system identification of a time-varying plant. We have extended formal definitions of concepts and learning to provide a framework in which an algorithm can track a concept as it evolves over time. Given this framework and focusing on memory-based algorithms, we have derived some PACstyle sample complexity results that determine, for example, when tracking is feasible. We have also used a similar framework and focused on incremental tracking algorithms for which we have derived some bounds on the mistake or error rates for some specific concept classes. 1 INTRODUCTION The goal of our ongoing research is to extend computational learning theory to include concepts that can change or evolve over time. For example, face recognition is complicated by the fact that a persons face changes slowly with age and more quickly with changes in make up, hairstyle, or facial hair.


Learning Theory and Experiments with Competitive Networks

Neural Information Processing Systems

Van den Bout North Carolina State University Box 7914 Raleigh, NC 27695-7914 We apply the theory of Tishby, Levin, and Sol1a (TLS) to two problems. First we analyze an elementary problem for which we find the predictions consistent with conventional statistical results. Second we numerically examine the more realistic problem of training a competitive net to learn a probability density from samples. We find TLS useful for predicting average training behavior. . 1 TLS APPLIED TO LEARNING DENSITIES Recently a theory of learning has been constructed which describes the learning of a relation from examples (Tishby, Levin, and Sol1a, 1989), (Schwarb, Samalan, Sol1a, and Denker, 1990). The original derivation relies on a statistical mechanics treatment of the probability of independent events in a system with a specified average value of an additive error function. The resulting theory is not restricted to learning relations and it is not essentially statistical mechanical. The TLS theory can be derived from the principle of mazimum entropy,a general inference tool which produces probabilities characterized by certain values of the averages of specified functions(Jaynes, 1979). A TLS theory can be constructed whenever the specified function is additive and associated with independent examples. In this paper we treat the problem of learning a probability density from samples.