Discontinuous Generalization in Large Committee Machines
Neural Information Processing Systems, 1994
H. Schwarze
Dept. of Theoretical Physics
Lund University
Sölvegatan 14A
223 62 Lund, Sweden

J. Hertz
Nordita
Blegdamsvej 17
2100 Copenhagen Ø, Denmark

Abstract

The problem of learning from examples in multilayer networks is studied within the framework of statistical mechanics. Using the replica formalism we calculate the average generalization error of a fully connected committee machine in the limit of a large number of hidden units. If the number of training examples is proportional to the number of inputs in the network, the generalization error as a function of the training-set size approaches a finite value. If the number of training examples is proportional to the number of weights in the network, we find first-order phase transitions with a discontinuous drop in the generalization error for both binary and continuous weights.

1 INTRODUCTION

Feedforward neural networks are widely used as nonlinear, parametric models for the solution of classification tasks and function approximation. Trained on examples of a given task, they are able to generalize, i.e., to compute the correct output for new, unseen inputs.
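For concreteness, the fully connected committee machine referred to above computes its output as a majority vote over K perceptron-like hidden units, σ(ξ) = sgn(Σ_l sgn(W_l · ξ)), and the generalization error is the probability of disagreeing with a teacher network on a random input. The following minimal NumPy sketch illustrates this setup; the function names and the Monte Carlo estimator are our own illustration, not code from the paper.

```python
import numpy as np

def committee_output(W, xi):
    """Output of a fully connected committee machine: the sign of the
    sum of the K hidden-unit signs (a majority vote)."""
    # W has shape (K, N); each row is one hidden unit's weight vector.
    hidden = np.sign(W @ xi)       # K hidden-unit outputs in {-1, +1}
    return np.sign(hidden.sum())   # majority vote; odd K avoids ties

def generalization_error(W_student, W_teacher, n_samples=10_000, seed=None):
    """Monte Carlo estimate of the generalization error: the fraction of
    random Gaussian inputs on which student and teacher disagree."""
    rng = np.random.default_rng(seed)
    N = W_student.shape[1]
    disagree = 0
    for _ in range(n_samples):
        xi = rng.standard_normal(N)
        if committee_output(W_student, xi) != committee_output(W_teacher, xi):
            disagree += 1
    return disagree / n_samples

# Example: an untrained (random) student against a random teacher.
K, N = 5, 100  # K hidden units, N inputs; K odd so the vote never ties
rng = np.random.default_rng(0)
teacher = rng.standard_normal((K, N))
student = rng.standard_normal((K, N))
print(generalization_error(student, teacher, seed=1))
```

With uncorrelated student and teacher weights, as in this example, the estimate comes out near 0.5 (chance level); the analysis in this paper concerns how the error falls below that as the student is trained on examples labeled by the teacher.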