projUNN: efficient method for training deep networks with unitary matrices

Oct-11-2024, 05:55:29 GMT, Neural Information Processing Systems

In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability. However, restricting network parameters to be unitary typically comes at the cost of expensive parameterizations or increased training runtime. We propose instead an efficient method based on rank-k updates, or their rank-k approximation, that maintains performance at a nearly optimal training runtime. We introduce two variants of this method, named Direct (projUNN-D) and Tangent (projUNN-T) projected Unitary Neural Networks, that can parameterize full N-dimensional unitary or orthogonal matrices with a training runtime scaling as O(kN^2). Our method either projects low-rank gradients onto the closest unitary matrix (projUNN-T) or transports unitary matrices in the direction of the low-rank gradient (projUNN-D).
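
The two kinds of update named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the tangent-space direction A = G_k W^H - W G_k^H is one standard choice assumed here, and everything is computed with dense O(N^3) factorizations for readability rather than with the O(kN^2) routines the paper describes.

```python
import numpy as np


def low_rank_approx(G, k):
    """Best rank-k approximation of G, via a truncated SVD."""
    U, s, Vh = np.linalg.svd(G, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vh[:k, :]


def closest_unitary(A):
    """Unitary matrix nearest to A in Frobenius norm: U V^H from A = U S V^H."""
    U, _, Vh = np.linalg.svd(A)
    return U @ Vh


def projected_step(W, G, lr, k):
    """Take a rank-k gradient step, then project the result back onto the unitaries."""
    return closest_unitary(W - lr * low_rank_approx(G, k))


def transport_step(W, G, lr, k):
    """Move W along the unitary manifold in the direction of the rank-k gradient.

    Assumption: the direction is the skew-Hermitian matrix
    A = G_k W^H - W G_k^H, and the update is W <- exp(-lr * A) W.
    """
    Gk = low_rank_approx(G, k)
    A = Gk @ W.conj().T - W @ Gk.conj().T
    # 1j * A is Hermitian, so exp(-lr * A) = V diag(exp(1j * lr * evals)) V^H.
    evals, V = np.linalg.eigh(1j * A)
    rotation = (V * np.exp(1j * lr * evals)) @ V.conj().T
    return rotation @ W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, k = 8, 2
    W = closest_unitary(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
    G = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    for step in (projected_step, transport_step):
        W_new = step(W, G, lr=0.1, k=k)
        # Deviation from unitarity; should be at the level of rounding error.
        print(step.__name__, np.linalg.norm(W_new.conj().T @ W_new - np.eye(N)))
```

Both steps return a matrix that stays unitary up to floating-point error, which is the property the method is designed to preserve during training. The sketch makes no attempt to exploit the low rank of the update, which is where the claimed runtime savings would come from.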

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)