High-dimensional Asymptotics of Denoising Autoencoders
–arXiv.org Artificial Intelligence
Machine learning techniques have a long history of success in denoising tasks. The recent breakthrough of diffusion-based generation [1, 2] has further revived interest in denoising networks, demonstrating how they can be leveraged, beyond denoising, for generative tasks. However, this rapidly expanding range of applications stands in sharp contrast to the relatively scarce theoretical understanding of denoising neural networks, even for their simplest instance, namely Denoising Auto-Encoders (DAEs) [3]. Theoretical studies of autoencoders have hitherto focused almost exclusively on data compression tasks using Reconstruction Auto-Encoders (RAEs), where the goal is to learn a concise latent representation of the data. A majority of this body of work addresses linear autoencoders [4-7]. The authors of [8, 9] analyze the gradient-based training of non-linear autoencoders with online stochastic gradient descent or in population, thus implicitly assuming the availability of an infinite number of training samples. Furthermore, two-layer RAEs were shown to essentially learn to perform Principal Component Analysis (PCA) [10-12], i.e. to learn a linear model. Ref. [13] shows that this is also true for infinite-width architectures. Learning in DAEs has been the object of theoretical investigation only in the linear case [14], while the case of non-linear DAEs remains largely unexplored theoretically.
May-18-2023
- Country:
- North America > United States (0.14)
- Europe
- Switzerland > Vaud
- Lausanne (0.04)
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Genre:
- Research Report (0.50)
- Industry:
- Health & Medicine (0.45)