Expected Gradients of Maxout Networks and Consequences to Parameter Initialization
