On the interplay of network structure and gradient convergence in deep learning
