Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning

Neural Information Processing Systems 

Rectified linear unit (ReLU) activations can also be thought of as 'gates', which either pass or stop their pre-activation input depending on whether they are 'on' (the pre-activation input is positive) or 'off' (the pre-activation input is negative). A deep neural network (DNN) with ReLU activations has many gates, and the on/off state of each gate changes across input examples as well as network weights. For a given input example, only a subset of the gates is 'active', i.e., on, and the sub-network of weights connected to these active gates is responsible for producing the output. At randomised initialisation, the active sub-network corresponding to a given input example is random. During training, as the weights are learnt, the active sub-networks are also learnt, and could hold valuable information.
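
A minimal sketch (not from the paper's code) of the gating view, assuming a hypothetical two-hidden-layer ReLU network in NumPy: each gate is 1 when its pre-activation is positive and 0 otherwise, so multiplying the pre-activation by the gate reproduces ReLU, and fixing the gate pattern of a given input yields a linear 'active sub-network' with the same output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights at random initialisation (layer widths are illustrative).
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(8, 8))
W3 = rng.normal(size=(1, 8))

def forward_with_gates(x):
    """Return the network output and the on/off gate pattern of each layer."""
    q1 = W1 @ x                  # pre-activations, layer 1
    g1 = (q1 > 0).astype(float)  # gates: 1 if 'on', 0 if 'off'
    q2 = W2 @ (g1 * q1)          # g1 * q1 == ReLU(q1)
    g2 = (q2 > 0).astype(float)
    return W3 @ (g2 * q2), (g1, g2)

x = rng.normal(size=4)
y, (g1, g2) = forward_with_gates(x)

# The active sub-network for this input: holding the gates fixed, the ReLU
# network collapses to a linear map through only the 'on' units.
y_active = W3 @ np.diag(g2) @ W2 @ np.diag(g1) @ W1 @ x
assert np.allclose(y, y_active)
```

For a different input, the gate pattern (g1, g2), and hence the active sub-network, is generally different, which is the sense in which the gates carry input-dependent information.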