Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators
Takeshi Teshima, Isao Ishikawa, Koichi Tojo, Kenta Oono, Masahiro Ikeda, Masashi Sugiyama
Invertible neural networks based on coupling flows (CF-INNs) are neural network architectures that are invertible by design [1, 2]. Endowed with analytic-form invertibility and a tractable Jacobian, CF-INNs have demonstrated their usefulness in various machine learning tasks such as generative modeling [3-7], probabilistic inference [8-10], solving inverse problems [11], and feature extraction and manipulation [4, 12-14]. The attractive properties of CF-INNs come at the cost of potential restrictions on the set of functions they can approximate because they rely on carefully designed network layers. To circumvent this potential drawback, a variety of layer designs have been proposed to construct CF-INNs with high representation power, e.g., the affine coupling flow [3, 4, 15-17], the neural autoregressive flow [18-20], and the polynomial flow [21], each demonstrating enhanced empirical performance. Despite the diversity of layer designs [1, 2], the theoretical understanding of the representation power of CF-INNs has been limited. Indeed, the most basic property of a function approximator, namely the universal approximation property (or universality for short) [22], has not been elucidated for CF-INNs. Universality can be crucial when CF-INNs are used to learn an invertible transformation (e.g., for feature extraction [12] or independent component analysis [14]) because, informally speaking, a lack of universality implies that there exists an invertible transformation, even among well-behaved ones, that a CF-INN can never approximate, and this would render the model class unreliable for the task of function approximation.
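To make the two properties mentioned above (analytic-form invertibility and a tractable Jacobian) concrete, the following is a minimal sketch of a single affine coupling layer. It is illustrative only and not the paper's construction: the scale and shift functions s and t are hypothetical toy stand-ins for the arbitrary neural networks that a real affine coupling flow would learn.

import numpy as np

def s(x_a):  # toy "scale" network (stand-in for a learned network)
    return np.tanh(x_a)

def t(x_a):  # toy "shift" network (stand-in for a learned network)
    return np.sin(x_a)

def coupling_forward(x):
    """Affine coupling: y_a = x_a; y_b = x_b * exp(s(x_a)) + t(x_a)."""
    x_a, x_b = np.split(x, 2)
    y_b = x_b * np.exp(s(x_a)) + t(x_a)
    return np.concatenate([x_a, y_b])

def coupling_inverse(y):
    """Analytic-form inverse: x_b = (y_b - t(y_a)) * exp(-s(y_a))."""
    y_a, y_b = np.split(y, 2)
    x_b = (y_b - t(y_a)) * np.exp(-s(y_a))
    return np.concatenate([y_a, x_b])

def log_det_jacobian(x):
    """Tractable log |det J|: the Jacobian is triangular, so it is
    simply the sum of the scale outputs s(x_a)."""
    x_a, _ = np.split(x, 2)
    return np.sum(s(x_a))

x = np.random.randn(4)
y = coupling_forward(x)
assert np.allclose(coupling_inverse(y), x)  # invertible by design

Note that the layer is invertible regardless of what s and t compute, since the inverse only ever evaluates them in the forward direction; this is precisely why the choice of coupling design governs the representation power of the resulting CF-INN.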
Nov-3-2020