Metaphors We Learn By
Gradient-based learning using error back-propagation ("backprop") is a well-known contributor to much of the recent progress in AI. A less obvious, but arguably equally important, ingredient is parameter sharing, most well known in the context of convolutional networks. In this essay we relate parameter sharing ("weight sharing") to analogy making and the school of thought of cognitive metaphor. We discuss how recurrent and auto-regressive models can be thought of as extending analogy making from static features to dynamic skills and procedures. We also discuss corollaries of this perspective, for example, how it can challenge the currently entrenched dichotomy between connectionist and "classic" rule-based views of computation.

It is well known that neural networks, regardless of whether training is supervised or self-supervised, require large amounts of training data to work well. To ensure generalization, one can maximize the number of training examples, minimize the number of tunable parameters, or do both. Parameter sharing is a common principle to reduce the number of tunable parameters without having to reduce the number of actual parameters (synaptic connections) in the network. In fact, it is hard to find any neural network architecture in the literature that does not make use of parameter sharing in some way.
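As a concrete illustration of this point (not taken from the paper), the following sketch in PyTorch compares the parameter counts of a fully connected layer and a convolutional layer applied to the same 32x32 single-channel input. Because the convolution reuses one small kernel at every spatial position, its number of tunable parameters stays tiny even though the number of actual connections to the input does not shrink; the layer sizes here are arbitrary choices for the example.

```python
# Illustrative sketch: parameter sharing in a convolutional layer versus a
# dense layer over the same 32x32 single-channel input.
import torch.nn as nn

dense = nn.Linear(32 * 32, 32 * 32)               # a separate weight for every input/output pair
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1)  # one 3x3 kernel reused at every position

def count_params(module: nn.Module) -> int:
    """Total number of tunable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

print(count_params(dense))  # 1_049_600 (1024*1024 weights + 1024 biases)
print(count_params(conv))   # 10 (9 shared kernel weights + 1 bias)
```

The convolution's parameter count is independent of the input size, which is exactly the sense in which weight sharing decouples tunable parameters from the number of connections.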
arXiv.org Artificial Intelligence
Nov-17-2022