Globally Gated Deep Linear Networks

Neural Information Processing Systems

Interestingly, networks with a large number of gating units behave similarly to standard ReLU architectures.