Network Generality, Training Required, and Precision Required


We show how to estimate (1) the number of functions that can be implemented by a particular network architecture, (2) how much analog precision is needed in the connections in the network, and (3) the number of training examples the network must see before it can be expected to form reliable generalizations.

Consider the following objectives: first, the network should be very powerful and versatile, i.e., it should implement any function (truth table) you like, and second, it should learn easily, forming meaningful generalizations from a small number of training examples. Well, it is information-theoretically impossible to create such a network. We will present here a simplified argument; a more complete and sophisticated version can be found in Denker et al. (1987).

It is customary to regard learning as a dynamical process: adjusting the weights (etc.) in a single network.
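To make the counting behind this impossibility claim concrete, the following Python sketch (ours, not from the paper) tabulates, for small N, how many distinct truth tables a fully general network on N binary inputs must be able to represent, and how many training examples that generality implies. The function name `generality_cost` and the simplifying assumption that each training example supplies at most one bit of information are illustrative choices, not notation from the paper.

```python
# Illustrative sketch of the counting argument: a network able to realize
# *any* truth table on N binary inputs must distinguish 2**(2**N) functions.
# Picking one of them requires 2**N bits, and each training example (one
# input pattern plus its desired output) supplies at most one bit, so a
# fully general network needs on the order of 2**N examples.

def generality_cost(n_inputs: int) -> dict:
    """Return the counts behind the information-theoretic argument."""
    n_patterns = 2 ** n_inputs        # distinct input patterns
    n_functions = 2 ** n_patterns     # distinct truth tables on those patterns
    bits_to_specify = n_patterns      # log2(n_functions)
    return {
        "input patterns": n_patterns,
        "implementable truth tables": n_functions,
        "bits needed to select one": bits_to_specify,
        "examples needed (at most 1 bit each)": bits_to_specify,
    }

if __name__ == "__main__":
    for n in range(1, 6):
        print(f"N = {n}: {generality_cost(n)}")
```

Even for N = 5 the network must be able to represent 2^32 truth tables, and pinning down an arbitrary one requires seeing essentially all 32 input patterns, which is why full generality and easy generalization from few examples cannot coexist.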