Quickprop: an almost forgotten neural training algorithm • r/MachineLearning
You must define the global loss L and then consider only the gradient relative to the connection between neuron i and neuron j for each layer. For example: if you have 3 layers (0, 1, 2) and 3 is an output neuron, you must consider the total loss L (computed with the output) and for each update, let's say Layer1-Neuron3 and Layer 2-Neuron2, you have to compute the gradient dL/dW13-22 (relative to the synaptic weight that connects the neurons). I know it's difficult to write here, however the process is very simple.
Sep-15-2017, 17:15:07 GMT
- Technology: