Computational Learning Theory
Neural Networks with Quadratic VC Dimension
Koiran, Pascal, Sontag, Eduardo D.
A set of labeled training samples is provided, and a network must be obtained which is then expected to correctly classify previously unseen inputs. In this context, a central problem is to estimate the amount of training data needed to guarantee satisfactory learning performance. To study this question, it is necessary to first formalize the notion of learning from examples. One such formalization is based on the paradigm of probably approximately correct (PAC) learning, due to Valiant (1984). In this framework, one starts by fitting some function /, chosen from a predetermined class F, to the given training data. The class F is often called the "hypothesis class", and for purposes of this discussion it will be assumed that the functions in F take binary values {O, I} and are defined on a common domain X.
Examples of learning curves from a modified VC-formalism
Kowalczyk, Adam, Szymanski, Jacek, Bartlett, Peter L., Williamson, Robert C.
We examine the issue of evaluation of model specific parameters in a modified VC-formalism. Two examples are analyzed: the 2-dimensional homogeneous perceptron and the I-dimensional higher order neuron. Both models are solved theoretically, and their learning curves are compared against true learning curves. It is shown that the formalism has the potential to generate a variety of learning curves, including ones displaying ''phase transitions."
Neural Networks with Quadratic VC Dimension
Koiran, Pascal, Sontag, Eduardo D.
A set of labeled training samples is provided, and a network must be obtained which is then expected to correctly classify previously unseen inputs. In this context, a central problem is to estimate the amount of training data needed to guarantee satisfactory learning performance. To study this question, it is necessary to first formalize the notion of learning from examples. One such formalization is based on the paradigm of probably approximately correct (PAC) learning, due to Valiant (1984). In this framework, one starts by fitting some function /, chosen from a predetermined class F, to the given training data. The class F is often called the "hypothesis class", and for purposes of this discussion it will be assumed that the functions in F take binary values {O, I} and are defined on a common domain X.
Examples of learning curves from a modified VC-formalism
Kowalczyk, Adam, Szymanski, Jacek, Bartlett, Peter L., Williamson, Robert C.
We examine the issue of evaluation of model specific parameters in a modified VC-formalism. Two examples are analyzed: the 2-dimensional homogeneous perceptron and the I-dimensional higher order neuron. Both models are solved theoretically, and their learning curves are compared againsttrue learning curves. It is shown that the formalism has the potential to generate a variety of learning curves, including ones displaying ''phase transitions."
Neural Networks with Quadratic VC Dimension
Koiran, Pascal, Sontag, Eduardo D.
A set of labeled training samples is provided, and a network must be obtained which is then expected to correctly classify previously unseen inputs. In this context, a central problem is to estimate the amount of training data needed to guarantee satisfactory learning performance. To study this question, it is necessary to first formalize the notion of learning from examples. One such formalization is based on the paradigm of probably approximately correct (PAC) learning, due to Valiant (1984). In this framework, one starts by fitting some function /, chosen from a predetermined class F, to the given training data. The class F is often called the "hypothesis class", and for purposes of this discussion it will be assumed that the functions in F take binary values {O, I} and are defined on a common domain X.
Generalisation in Feedforward Networks
Kowalczyk, Adam, Ferrá, Herman L.
We show that the model provides asignificant improvement on the upper bounds of sample complexity, i.e. the minimal number of random training samples allowing a selection of the hypothesis with a predefined accuracy and confidence. Further, we show that the model has the potential forproviding a finite sample complexity even in the case of infinite VC-dimension as well as for a sample complexity below VC-dimension. This is achieved by linking sample complexity to an "average" number of implementable dichotomies of a training sample rather than the maximal size of a shattered sample, i.e. VC-dimension. 1 Introduction A number offundamental results in computational learning theory [1, 2, 11] links the generalisation error achievable by a set of hypotheses with its Vapnik-Chervonenkis dimension (VC-dimension, for short) which is a sort of capacity measure. They provide in particular some theoretical bounds on the sample complexity, i.e. a minimal number of training samples assuring the desired accuracy with the desired confidence. However there are a few obvious deficiencies in these results: (i) the sample complexity bounds are unrealistically high (c.f. Section 4.), and (ii) for some networks they do not hold at all since VC-dimension is infinite, e.g.
Pac-learning Recursive Logic Programs: Negative Results
In a companion paper it was shown that the class of constant-depth determinate k-ary recursive clauses is efficiently learnable. In this paper we present negative results showing that any natural generalization of this class is hard to learn in Valiant's model of pac-learnability. In particular, we show that the following program classes are cryptographically hard to learn: programs with an unbounded number of constant-depth linear recursive clauses; programs with one constant-depth determinate clause containing an unbounded number of recursive calls; and programs with one linear recursive clause of constant locality. These results immediately imply the non-learnability of any more general class of programs. We also show that learning a constant-depth determinate program with either two linear recursive clauses or one linear recursive clause and one non-recursive clause is as hard as learning boolean DNF. Together with positive results from the companion paper, these negative results establish a boundary of efficient learnability for recursive function-free clauses.