Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with a polynomial complexity with respect to the number of data samples, the number of neurons, and data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an $\ell_2$ norm regularized convex program. We then show that three-layer CNN training problems are equivalent to an $\ell_1$ regularized convex program that encourages sparsity in the spectral domain. We also extend these results to multi-layer CNN architectures including three-layer networks with two ReLU layers and deeper circular convolutions with a single ReLU layer. Furthermore, we present extensions of our approach to different pooling methods, which elucidates the implicit architectural bias as convex regularizers.
Oct-5-2020
- Country:
- North America
- United States
- New York (0.04)
- California > Santa Clara County
- Canada > Ontario
- Toronto (0.04)
- United States
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America
- Genre:
- Research Report (0.82)
- Technology: