Nonlinear Computation in Deep Linear Networks
We've shown that deep linear networks, as implemented using floating-point arithmetic, are not actually linear and can perform nonlinear computation. We used evolution strategies to find parameters in linear networks that exploit this trait, letting us solve non-trivial problems. Neural networks consist of stacks of linear layers, each followed by a nonlinearity such as tanh or a rectified linear unit. Without the nonlinearity, consecutive linear layers would in theory be mathematically equivalent to a single linear layer. So it is a surprise that floating-point arithmetic is nonlinear enough to yield trainable deep networks. Numbers used by computers aren't perfect mathematical objects, but approximate representations using a finite number of bits.
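One way to see the effect is a chain of scalar multiplications that would be the identity map in exact arithmetic. The sketch below is illustrative only and is not taken from the work summarized above: it assumes Python's standard 64-bit IEEE 754 floats with gradual underflow, and the names `SCALE`, `INV_SCALE`, and `deep_linear` are hypothetical. Pushing a value down to the smallest representable magnitude forces rounding, so the composed "linear" layers end up violating additivity.

```python
# Illustrative sketch (assumes IEEE 754 double precision, as used by Python floats).
# SCALE, INV_SCALE and deep_linear are hypothetical names chosen for this example.

SCALE = 2.0 ** -537      # SCALE * SCALE == 2**-1074, the smallest positive double
INV_SCALE = 2.0 ** 537   # exact inverse of SCALE


def deep_linear(x: float) -> float:
    """Four 'linear layers', each a multiplication by a constant.

    In exact arithmetic the constants cancel and the result equals x.
    In floating point, the second layer lands in the subnormal range,
    where the value is rounded to a whole multiple of 2**-1074.
    """
    h = x * SCALE        # exact: only the exponent changes
    h = h * SCALE        # rounding happens here (underflow to the subnormal grid)
    h = h * INV_SCALE    # exact: scaling a subnormal back up by a power of two
    h = h * INV_SCALE    # exact
    return h


if __name__ == "__main__":
    a = 0.4
    print("f(a) + f(a) =", deep_linear(a) + deep_linear(a))
    print("f(a + a)    =", deep_linear(a + a))
```

A truly linear function would give the same value on both lines. Here the small input is rounded away before it can be scaled back up, while the doubled input survives, which is the kind of behavior a parameter search could in principle exploit.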
Sep-30-2017, 14:25:23 GMT