Nuclear Norm Regularization for Deep Learning
Christopher Scarvelis, Justin Solomon
Penalizing the nuclear norm of a function's Jacobian encourages it to locally behave like a low-rank linear map. Such functions vary locally along only a handful of directions, making the Jacobian nuclear norm a natural regularizer for machine learning problems. However, this regularizer is intractable for high-dimensional problems, as it requires computing a large Jacobian matrix and taking its singular value decomposition. We show how to efficiently penalize the Jacobian nuclear norm using techniques tailor-made for deep learning. We prove that for functions parametrized as compositions $f = g \circ h$, one may equivalently penalize the average squared Frobenius norm of $Jg$ and $Jh$. We then propose a denoising-style approximation that avoids the Jacobian computations altogether. Our method is simple, efficient, and accurate, enabling Jacobian nuclear norm regularization to scale to high-dimensional deep learning problems. We complement our theory with an empirical study of our regularizer's performance and investigate applications to denoising and representation learning.
May-23-2024
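The compositional penalty described in the abstract rests on the variational characterization of the nuclear norm, $\|A\|_* = \min_{A = BC} \tfrac{1}{2}\left(\|B\|_F^2 + \|C\|_F^2\right)$; applied to the chain rule $Jf = Jg \, Jh$, it motivates penalizing the squared Frobenius norms of the factor Jacobians instead of the nuclear norm of $Jf$. The snippet below is a minimal sketch of that surrogate, not the authors' implementation: it assumes a JAX setup, estimates each squared Frobenius norm with a single-sample Hutchinson-style Jacobian-vector product, and the function names (`sq_frobenius_jac_estimate`, `nuclear_norm_surrogate`) and toy linear maps are illustrative only.

```python
import jax
import jax.numpy as jnp


def sq_frobenius_jac_estimate(f, x, key):
    # Single-sample Hutchinson-style estimate of ||Jf(x)||_F^2:
    # for v ~ N(0, I), E[||Jf(x) v||^2] = ||Jf(x)||_F^2.
    v = jax.random.normal(key, x.shape)
    _, jv = jax.jvp(f, (x,), (v,))  # Jacobian-vector product, no full Jacobian
    return jnp.sum(jv ** 2)


def nuclear_norm_surrogate(g, h, x, key):
    # Compositional surrogate for ||J(g ∘ h)(x)||_*:
    # (1/2) * ( ||Jh(x)||_F^2 + ||Jg(h(x))||_F^2 ).
    k1, k2 = jax.random.split(key)
    hx = h(x)
    pen_h = sq_frobenius_jac_estimate(h, x, k1)
    pen_g = sq_frobenius_jac_estimate(g, hx, k2)
    return 0.5 * (pen_h + pen_g)


if __name__ == "__main__":
    # Hypothetical linear maps standing in for the two halves of a network.
    B = jax.random.normal(jax.random.PRNGKey(0), (16, 16))
    A = jax.random.normal(jax.random.PRNGKey(1), (8, 16))
    h = lambda x: B @ x   # "inner" half
    g = lambda z: A @ z   # "outer" half
    x = jnp.ones(16)
    print(nuclear_norm_surrogate(g, h, x, jax.random.PRNGKey(2)))
```

In practice the single random probe would be averaged over samples or batches; the point of the sketch is that the penalty needs only Jacobian-vector products, never an explicit Jacobian or SVD.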