For Valid Generalization the Size of the Weights is More Important than the Size of the Network