### GEMINI: Gradient Estimation Through Matrix Inversion After Noise Injection

Learning procedures that measure how random perturbations of unit activities correlate with changes in reinforcement are inefficient but simple to implement in hardware. Procedures like back-propagation (Rumelhart, Hinton and Williams, 1986) which compute how changes in activities affect the output error are much more efficient, but require more complex hardware. GEMINI is a hybrid procedure for multilayer networks, which shares many of the implementation advantages of correlational reinforcement procedures but is more efficient. GEMINI injects noise only at the first hidden layer and measures the resultant effect on the output error. A linear network associated with each hidden layer iteratively inverts the matrix which relates the noise to the error change, thereby obtaining the error-derivatives. No back-propagation is involved, thus allowing unknown non-linearities in the system. Two simulations demonstrate the effectiveness of GEMINI.
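
The core idea, recovering error-derivatives from the measured effect of injected noise, can be illustrated with a small sketch. This is not the GEMINI procedure itself: it handles a single layer of activities, and it uses a normalized LMS (delta-rule) update as a stand-in for the linear network that iteratively inverts the noise-to-error relation. The names `estimate_gradient`, `error_fn`, `W`, `target` and `h` are assumptions made for this illustration, and the toy error function is a simple quadratic chosen so the estimate can be checked against an analytic gradient.

```python
import numpy as np


def estimate_gradient(error_fn, h, n_trials=2000, sigma=1e-3, step=0.5, seed=0):
    """Estimate dE/dh from noise injections alone (no back-propagation)."""
    rng = np.random.default_rng(seed)
    g = np.zeros_like(h)          # running estimate of the error derivatives
    base_error = error_fn(h)      # error measured without injected noise
    for _ in range(n_trials):
        noise = rng.normal(0.0, sigma, size=h.shape)
        delta_e = error_fn(h + noise) - base_error   # measured error change
        predicted = noise @ g                        # change the estimate predicts
        # Normalized LMS step: reduce the prediction error along the noise direction.
        g += step * (delta_e - predicted) * noise / (noise @ noise)
    return g


# Toy system above the noisy layer: squared error of a linear readout.
rng = np.random.default_rng(1)
W = rng.normal(size=(3, 5))       # hypothetical readout weights
target = rng.normal(size=3)       # hypothetical desired output
h = rng.normal(size=5)            # activities that receive the noise


def error_fn(hidden):
    return 0.5 * np.sum((W @ hidden - target) ** 2)


g_true = W.T @ (W @ h - target)   # analytic gradient, used only as a check
g_est = estimate_gradient(error_fn, h)
print("max abs difference:", np.max(np.abs(g_est - g_true)))
```

The estimator only ever calls `error_fn`, so it would work unchanged if the system above the noisy layer contained unknown non-linearities, provided the noise is small enough for the error change to be approximately linear in it.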

### Learning Representations by Recirculation

One criticism of back-propagation is that it requires a teacher to specify the desired output vectors. It is possible to dispense with the teacher in the case of "encoder" networks, in which the desired output vector is identical with the input vector (see Figure 1). The purpose of an encoder network is to learn good "codes" in the intermediate, hidden units. If, for example, there are fewer hidden units than input units, an encoder network will perform data-compression.
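
To make the encoder setup concrete, here is a minimal sketch of a network whose target is its own input and whose hidden layer is narrower than the input. It is trained with ordinary gradient descent on the squared reconstruction error, not with recirculation, and it uses linear units for brevity. The function name `train_linear_encoder`, the layer sizes and the synthetic data are all assumptions made for this illustration.

```python
import numpy as np


def train_linear_encoder(X, n_hidden, lr=0.01, n_steps=500, seed=0):
    """Train input -> hidden -> output weights so that the output copies the input."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W_enc = rng.normal(0.0, 0.1, size=(n_in, n_hidden))   # input -> hidden code
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, n_in))   # hidden code -> output
    losses = []
    for _ in range(n_steps):
        H = X @ W_enc                 # hidden codes
        X_hat = H @ W_dec             # reconstruction of the input
        err = X_hat - X               # the "teacher" is the input itself
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the squared error, averaged over examples
        # (equal to the gradient of the recorded loss up to a constant factor).
        grad_dec = H.T @ err / len(X)
        grad_enc = X.T @ (err @ W_dec.T) / len(X)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses


# Synthetic data: 8 input dimensions generated from 3 underlying factors,
# so a 3-unit hidden layer can in principle compress it well.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 8))

W_enc, W_dec, losses = train_linear_encoder(X, n_hidden=3)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

Each 8-dimensional input is represented by a 3-dimensional code `X @ W_enc`, which is the sense in which a narrow hidden layer performs data-compression.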

### A Deep Learning Tutorial: From Perceptrons to Deep Networks

We have some algorithm that's given a handful of labeled examples, say 10 images of dogs with the label 1 ("Dog") and 10 images of other things with the label 0 ("Not dog"). Note that we're mainly sticking to supervised, binary classification for this post. The algorithm "learns" to identify images of dogs and, when fed a new image, hopes to produce the correct label (1 if it's an image of a dog, and 0 otherwise). This setting is incredibly general: your data could be symptoms and your labels illnesses; or your data could be images of handwritten characters and your labels the actual characters they represent.
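
As a rough illustration of this setup, the sketch below trains a perceptron on 10 positive and 10 negative examples and then labels new inputs. Real images are replaced here by made-up 2-D feature vectors drawn from two well-separated clusters, and the names `train_perceptron` and `predict` are hypothetical helpers written for this example, not part of any library.

```python
import numpy as np


def predict(X, w, b):
    """Return label 1 where the weighted sum is positive, otherwise 0."""
    return (X @ w + b > 0).astype(int)


def train_perceptron(X, y, n_epochs=100, lr=1.0):
    """Classic perceptron rule: adjust the weights only on misclassified examples."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            pred = 1 if x_i @ w + b > 0 else 0
            update = lr * (y_i - pred)      # 0 if correct, +lr or -lr if wrong
            if update != 0:
                w += update * x_i
                b += update
                mistakes += 1
        if mistakes == 0:                   # every training example is correct
            break
    return w, b


# Toy stand-in for the image data: 10 "Dog" and 10 "Not dog" feature vectors.
rng = np.random.default_rng(0)
X_dog = rng.normal(loc=2.0, scale=0.5, size=(10, 2))
X_other = rng.normal(loc=-2.0, scale=0.5, size=(10, 2))
X = np.vstack([X_dog, X_other])
y = np.array([1] * 10 + [0] * 10)

w, b = train_perceptron(X, y)
print("training accuracy:", np.mean(predict(X, w, b) == y))
```

The perceptron rule is only guaranteed to reach zero training errors when the two classes are linearly separable, which is why the toy clusters are placed far apart.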