Kinoshita, Yuri
A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness
Kinoshita, Yuri, Toyoizumi, Taro
While neural networks enjoy outstanding flexibility and exhibit unprecedented performance, the mechanisms behind their behavior are still not well understood. To tackle this fundamental challenge, researchers have tried to restrict and manipulate some of their properties in order to gain new insights and better control over them. In particular, over the past few years, the concept of \emph{bi-Lipschitzness} has proved to be a beneficial inductive bias in many areas. However, due to its complexity, the design and control of bi-Lipschitz architectures lag behind: a model precisely designed for bi-Lipschitzness, offering direct and simple control of its constants together with solid theoretical analysis, is still lacking. In this work, we propose a novel framework for bi-Lipschitzness that achieves such clear and tight control, based on convex neural networks and the Legendre-Fenchel duality. We illustrate its desirable properties with concrete experiments and also apply this framework to uncertainty estimation and monotone problem settings to illustrate its broad range of applications.
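The following is a minimal sketch of the general principle this abstract builds on: the gradient of a potential that is mu-strongly convex and L-smooth is bi-Lipschitz with constants mu and L. It is not the paper's actual parameterization (which uses convex neural networks and the Legendre-Fenchel conjugate); the class name, the logsumexp potential, and all hyperparameters below are illustrative assumptions, written in PyTorch.

    import torch

    class BiLipschitzGradNet(torch.nn.Module):
        """Sketch (hypothetical, not the paper's model): a bi-Lipschitz map
        realized as the gradient of a potential
            F(x) = (mu/2)||x||^2 + (L - mu) * logsumexp(W x),  ||W||_2 <= 1,
        so that mu*I <= Hess F <= L*I and f = grad F satisfies
            mu*||x - y|| <= ||f(x) - f(y)|| <= L*||x - y||."""

        def __init__(self, dim, hidden, mu=0.5, L=2.0):
            super().__init__()
            self.mu, self.L = mu, L
            self.W = torch.nn.Parameter(torch.randn(hidden, dim) / dim ** 0.5)

        def potential(self, x):
            # Rescale W so its spectral norm is at most 1; the Hessian of
            # logsumexp(Wx) is W^T (diag(p) - p p^T) W, which then lies
            # between 0 and I, giving the smoothness bound above.
            W = self.W / torch.linalg.matrix_norm(self.W, ord=2).clamp(min=1.0)
            quad = 0.5 * self.mu * (x ** 2).sum(dim=-1)
            smooth = (self.L - self.mu) * torch.logsumexp(x @ W.T, dim=-1)
            return quad + smooth

        def forward(self, x):
            # f(x) = grad_x F(x), computed by autograd; create_graph=True
            # keeps the map differentiable so it can be trained end to end.
            x = x.detach().clone().requires_grad_(True)
            with torch.enable_grad():
                (f,) = torch.autograd.grad(self.potential(x).sum(), x,
                                           create_graph=True)
            return f

Under these assumptions, the ratio ||f(x) - f(y)|| / ||x - y|| is provably confined to [mu, L] for every pair of inputs, e.g.:

    net = BiLipschitzGradNet(dim=3, hidden=16, mu=0.5, L=2.0)
    x, y = torch.randn(8, 3), torch.randn(8, 3)
    ratio = (net(x) - net(y)).norm(dim=-1) / (x - y).norm(dim=-1)
    # every entry of ratio lies in [0.5, 2.0]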
Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network
Kinoshita, Yuri, Oono, Kenta, Fukumizu, Kenji, Yoshida, Yuichi, Maeda, Shin-ichi
While VAEs are nowadays omnipresent in the field of machine learning, in practice they suffer from a problem called posterior collapse, which occurs when the encoder coincides, or collapses, with the prior, taking no information from the latent structure of the input data into consideration. Also known as KL vanishing or over-pruning, this phenomenon makes VAEs incapable of producing pertinent representations and has reportedly been observed in many fields (e.g., Bowman et al. (2016); Fu et al. (2019); Wang & Ziyin (2022); Yeung et al. (2017)). In this work, we introduce an inverse Lipschitz neural network into the decoder and, based on this architecture, provide a new method that can control in a simple and clear manner the degree of posterior collapse for a wide range of VAE models, equipped with a concrete theoretical guarantee. We also illustrate the effectiveness of our method through several numerical experiments.
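As a rough illustration of the architectural idea named in this abstract, the sketch below constructs an m-inverse-Lipschitz map, i.e., one satisfying ||f(z1) - f(z2)|| >= m * ||z1 - z2||, via f(z) = m*z + grad G(z) with G convex: monotonicity of grad G gives <f(z1) - f(z2), z1 - z2> >= m * ||z1 - z2||^2, and Cauchy-Schwarz yields the bound. This is a generic construction under stated assumptions, not necessarily the paper's exact decoder; the class name and all parameters are hypothetical.

    import torch

    class InverseLipschitzDecoder(torch.nn.Module):
        """Sketch (hypothetical architecture): f(z) = m*z + grad G(z), where
        G(z) = sum_i a_i * softplus(w_i . z + b_i) with a_i >= 0 is convex,
        so f is m-inverse Lipschitz: distinct latent codes stay m-apart."""

        def __init__(self, latent_dim, hidden, m=0.1):
            super().__init__()
            self.m = m
            self.W = torch.nn.Parameter(
                torch.randn(hidden, latent_dim) / latent_dim ** 0.5)
            self.b = torch.nn.Parameter(torch.zeros(hidden))
            self.a_raw = torch.nn.Parameter(torch.zeros(hidden))

        def forward(self, z):
            # a >= 0 keeps G convex; grad G has the closed form
            # W^T (a * sigmoid(W z + b)), so no autograd pass is needed here.
            a = torch.nn.functional.softplus(self.a_raw)
            grad_G = (a * torch.sigmoid(z @ self.W.T + self.b)) @ self.W
            return self.m * z + grad_G

Because such a map sends distinct latent codes to outputs at least m apart, a decoder containing this component cannot ignore its latent input entirely, which is the intuition behind using an inverse Lipschitz constraint against posterior collapse; how the constrained map is wired into a full VAE decoder follows the paper and is not reproduced here.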