Attention-Gated Brain Propagation: How the brain can implement reward-based error backpropagation

Neural Information Processing Systems 

The network chooses an action by selecting a unit in the output layer and uses feedback connections to assign credit to the units in successively lower layers that are responsible for this action.
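
The mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-layer architecture, the ReLU hidden units, the epsilon-greedy selection rule, the reuse of the feedforward weights `W2` as feedback weights, and all function and parameter names are assumptions made for illustration. The sketch shows only the general idea that the selected output unit alone sends feedback to lower layers, and that a single reward prediction error scales every weight change.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, W1, W2):
    """Feedforward pass: input -> hidden -> one value estimate per action."""
    h_pre = W1 @ x
    h = relu(h_pre)
    q = W2 @ h
    return h_pre, h, q

def select_action(q, rng, epsilon=0.1):
    """Pick one output unit (assumed epsilon-greedy rule)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q)))
    return int(np.argmax(q))

def credit_and_update(x, W1, W2, action, reward, lr=0.1):
    """Assign credit via feedback from the selected output unit only."""
    h_pre, h, q = forward(x, W1, W2)
    # Reward prediction error for the chosen action (a single scalar).
    delta = reward - q[action]
    # Feedback from the selected unit to the hidden layer. Assumption: the
    # feedback weights equal the feedforward weights W2[action]. Hidden
    # units that were inactive (h_pre <= 0) receive no credit.
    fb_hidden = W2[action] * (h_pre > 0)
    # Only the selected output unit's incoming weights change.
    W2_new = W2.copy()
    W2_new[action] += lr * delta * h
    # Hidden weights change in proportion to feedback times input activity.
    W1_new = W1 + lr * delta * np.outer(fb_hidden, x)
    return W1_new, W2_new, delta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([1.0, 2.0])
    W1 = np.array([[0.5, -0.2], [0.1, 0.3]])
    W2 = np.array([[0.2, 0.4], [-0.3, 0.1]])
    _, _, q = forward(x, W1, W2)
    a = select_action(q, rng)
    W1, W2, delta = credit_and_update(x, W1, W2, a, reward=1.0)
    print("action:", a, "prediction error:", delta)
```

Under these assumptions the update coincides with a gradient step on the squared prediction error of the selected output unit, so repeated calls with the same input and reward move that unit's value estimate toward the reward while leaving the weights of unselected output units untouched.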