Can increasing depth serve to accelerate optimization?
"How does depth help?" is a fundamental question in the theory of deep learning. Conventional wisdom, backed by theoretical studies (e.g. Eldan & Shamir 2016; Raghu et al. 2017; Lee et al. 2017; Cohen et al. 2016; Daniely 2017), holds that adding layers increases expressive power. But often this expressive gain comes at a price –optimization is harder for deeper networks (viz., vanishing/exploding gradients). Recent works on "landscape characterization" implicitly adopt this worldview (e.g.
Mar-5-2018, 12:52:34 GMT