Spectral norm regularization



Spectral Norm Regularization of Orthonormal Representations for Graph Transduction

Shivanna, Rakesh, Chatterjee, Bibaswan K., Sankaran, Raman, Bhattacharyya, Chiranjib, Bach, Francis

Neural Information Processing Systems

Recent literature \cite{ando} suggests that embedding a graph on a unit sphere leads to better generalization for graph transduction. However, the choice of the optimal embedding, and an efficient algorithm to compute it, remain open. In this paper, we show that orthonormal representations, a class of unit-sphere graph embeddings, are PAC learnable. Existing PAC-based analyses do not apply, as the VC dimension of the function class is infinite. We propose an alternative PAC-based bound, which does not depend on the VC dimension of the underlying function class but is related to the famous Lovász $\vartheta$ function.
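For readers unfamiliar with the two objects this abstract leans on, here is a brief sketch of the standard definitions in Lovász's original formulation (a reminder of the classical notions, not a statement of this paper's bound):

```latex
% An orthonormal representation of a graph G = (V, E) assigns a unit
% vector u_i to each vertex i such that u_i^T u_j = 0 whenever {i, j}
% is NOT an edge of G.  The Lovász theta function minimizes over all
% such representations and a unit "handle" vector c:
\[
  \vartheta(G) \;=\; \min_{\{u_i\},\, c}\; \max_{i \in V}\;
    \frac{1}{\left(c^{\top} u_i\right)^{2}},
  \qquad \lVert c \rVert = \lVert u_i \rVert = 1 .
\]
```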


Adversarial Training Generalizes Data-dependent Spectral Norm Regularization

Roth, Kevin, Kilcher, Yannic, Hofmann, Thomas

arXiv.org Machine Learning

We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks. Specifically, we show that adversarial training is a data-dependent generalization of spectral norm regularization. This intriguing connection provides fundamental insights into the origin of adversarial vulnerability and hints at novel ways to robustify and defend against adversarial attacks. We provide extensive empirical evidence to support our theoretical results.
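The link the authors describe can be made concrete by recalling what adversarial training does in practice: training on inputs perturbed along the loss gradient penalizes the network's sensitivity around the observed data points, the data-dependent counterpart of penalizing weight spectral norms. Below is a minimal, generic FGSM-style training step in PyTorch; the function name and epsilon value are illustrative, and this sketches adversarial training in general, not the paper's derivation:

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, x, y, optimizer, eps=0.03):
    """One FGSM-style adversarial training step (illustrative sketch)."""
    # Craft the adversarial example with a single gradient-sign step.
    x_adv = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).detach()

    # Standard update, but on the perturbed batch.
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()
```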


Lipschitz-Margin Training: Scalable Certification of Perturbation Invariance for Deep Neural Networks

Tsuzuku, Yusuke, Sato, Issei, Sugiyama, Masashi

arXiv.org Machine Learning

This indicates that even protected networks can be unexpectedly vulnerable. This is a crucial problem for this line of research, because the primary concern of these studies is security threats. To tackle it, we aim to develop defense methods with theoretical guarantees. Our goal is to ensure, for each input, a lower bound on the size of adversarial perturbations below which the network can never be deceived. We refer to these lower bounds as certified invariant radii, or simply invariant radii. To make them available in broad applications, their calculation methods must satisfy two fundamental requirements: 1. minimal assumptions on network structures, and 2. computational tractability. However, many existing approaches require strong assumptions and massive computational costs. For example, with these approaches, perturbation invariance could not be ensured for some network structures, such as wide residual networks [42], which have been commonly used in evaluations of defense methods. This work tackles that problem: we provide a widely applicable yet highly scalable method that ensures large invariant radii. Our basic idea is to bound the size of adversarial perturbations under which networks can never be deceived. Although the concept of using the Lipschitz constant already appeared in Szegedy et al. [37], how strong a certification it can provide has not been well studied. We show that we can ensure significantly larger invariant radii than a recent computationally efficient counterpart [32]. However, the certified invariant radii can still be too small to be practically meaningful in some cases. We address this issue with a novel training procedure that further strengthens perturbation invariance.
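The certification idea rests on a standard margin/Lipschitz argument: if a classifier is L-Lipschitz and predicts with logit margin M at input x, then (in one common form of the bound) no L2 perturbation smaller than M / (sqrt(2) * L) can flip the prediction. The sketch below is a hypothetical helper, not the paper's code; it bounds L by the product of per-layer spectral norms, which is valid for compositions of linear maps and 1-Lipschitz activations such as ReLU, and flattening conv kernels to 2-D only approximates the true convolution operator norm:

```python
import torch

def certified_radius(weights, logits):
    """Sketch: per-example certified L2 invariant radius (hypothetical helper)."""
    # Global Lipschitz bound: product of per-layer spectral norms.
    lipschitz = 1.0
    for W in weights:
        lipschitz = lipschitz * torch.linalg.matrix_norm(W.flatten(1), ord=2)

    # Prediction margin: top logit minus runner-up, per example.
    top2 = logits.topk(2, dim=-1).values
    margin = top2[..., 0] - top2[..., 1]

    # Margin M under Lipschitz bound L certifies radius M / (sqrt(2) * L).
    return margin / (2.0 ** 0.5 * lipschitz)
```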


Spectral Norm Regularization for Improving the Generalizability of Deep Learning

Yoshida, Yuichi, Miyato, Takeru

arXiv.org Machine Learning

We investigate the generalizability of deep learning from the perspective of sensitivity to input perturbation. We hypothesize that high sensitivity to perturbed data degrades performance on it. To reduce this sensitivity, we propose a simple and effective regularization method, referred to as spectral norm regularization, which penalizes large spectral norms of the weight matrices in neural networks. We provide supporting evidence for this hypothesis by experimentally confirming that models trained with spectral norm regularization exhibit better generalizability than models trained with other baseline methods.
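A minimal sketch of the penalty described here, estimating each weight matrix's largest singular value with a few power-iteration steps (a standard trick; the function name, iteration count, regularization weight, and the flattening of conv kernels are assumptions for illustration, not the authors' released code):

```python
import torch
import torch.nn.functional as F

def spectral_norm_penalty(model, n_iter=2):
    """Approximate sum of squared spectral norms of the model's weight matrices."""
    penalty = 0.0
    for W in (p for p in model.parameters() if p.dim() >= 2):
        W2d = W.flatten(1)  # convs: (out_ch, in_ch * kh * kw)
        with torch.no_grad():  # power iteration for the top singular pair
            v = F.normalize(torch.randn(W2d.size(1), device=W.device), dim=0)
            for _ in range(n_iter):
                u = F.normalize(W2d @ v, dim=0)
                v = F.normalize(W2d.t() @ u, dim=0)
        # With u, v held fixed, d(sigma^2)/dW = 2 * sigma * u v^T.
        sigma = torch.dot(u, W2d @ v)
        penalty = penalty + sigma ** 2
    return penalty

# Usage sketch: loss = task_loss + 0.01 * spectral_norm_penalty(model)
```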