Generalization in Deep Networks: The Role of Distance from Initialization
