Phase diagram of early training dynamics in deep networks: effect of the learning rate, depth, and width
