real nvp
8 SupplementaryMaterial
For the GLOW experiment we stacked three GLOW transformations at different scales eachwitheightaffinecoupling blocks spaced byactnorms andpermutations each parameterized byaCNN with twohidden layers with 512 filters each. In a recent arXiv submission, Arjovsky et al.[2] suggested that in the presence of an observable variability intheenvironmente(e.g. While this procedure workedondistributions that were very similar tobegin with, inthe majority of cases the log-likelihood fit toB did not provide informative gradients when evaluated on the transformed dataset, as the KL-divergence between distributions with disjoint supports is infinite. The code is available in lrmf_gradient_simulation.ipynb. LRMF objective(Eq 2) decreases over time and reaches zero when two datasets are aligned.
8 Supplementary Material
Attached IPython notebooks were tested to work as expected in Colab. On replacing the Gaussian prior with a learned density in normalizing flows. As mentioned in the main paper, FFJORD LRMF performed on par with Real NVP version. The dynamics can be found in the Figure 10. Rightmost column shows LRMF convergence.
Enhancing Obsolescence Forecasting with Deep Generative Data Augmentation: A Semi-Supervised Framework for Low-Data Industrial Applications
Saad, Elie, Besbes, Mariem, Zolghadri, Marc, Czmil, Victor, Baron, Claude, Bourgeois, Vincent
The challenge of electronic component obsolescence is particularly critical in systems with long life cycles. Various obsolescence management methods are employed to mitigate its impact, with obsolescence forecasting being a highly sought-after and prominent approach. As a result, numerous machine learning-based forecasting methods have been proposed. However, machine learning models require a substantial amount of relevant data to achieve high precision, which is lacking in the current obsolescence landscape in some situations. This work introduces a novel framework for obsolescence forecasting based on deep learning. The proposed framework solves the lack of available data through deep generative modeling, where new obsolescence cases are generated and used to augment the training dataset. The augmented dataset is then used to train a classical machine learning-based obsolescence forecasting model. To train classical forecasting models using augmented datasets, existing classical supervised-learning classifiers are adapted for semi-supervised learning within this framework. The proposed framework demonstrates state-of-the-art results on benchmarking datasets.
Masked Autoregressive Flow for Density Estimation
George Papamakarios, Iain Murray, Theo Pavlakou
Autoregressive models are among the best performing neural density estimators. We describe an approach for increasing the flexibility of an autoregressive model, based on modelling the random numbers that the model uses internally when generating data. By constructing a stack of autoregressive models, each modelling the random numbers of the next model in the stack, we obtain a type of normalizing flow suitable for density estimation, which we call Masked Autoregressive Flow. This type of flow is closely related to Inverse Autoregressive Flow and is a generalization of Real NVP. Masked Autoregressive Flow achieves state-of-the-art performance in a range of general-purpose density estimation tasks.
Deep Neural Networks with Symplectic Preservation Properties
We propose a deep neural network architecture designed such that its output forms an invertible symplectomorphism of the input. This design draws an analogy to the real-valued non-volume-preserving (real NVP) method used in normalizing flow techniques. Utilizing this neural network type allows for learning tasks on unknown Hamiltonian systems without breaking the inherent symplectic structure of the phase space.
Physics-Informed Real NVP for Satellite Power System Fault Detection
Cena, Carlo, Albertin, Umberto, Martini, Mauro, Bucci, Silvia, Chiaberge, Marcello
The unique challenges posed by the space environment, characterized by extreme conditions and limited accessibility, raise the need for robust and reliable techniques to identify and prevent satellite faults. Fault detection methods in the space sector are required to ensure mission success and to protect valuable assets. In this context, this paper proposes an Artificial Intelligence (AI) based fault detection methodology and evaluates its performance on ADAPT (Advanced Diagnostics and Prognostics Testbed), an Electrical Power System (EPS) dataset, crafted in laboratory by NASA. Our study focuses on the application of a physics-informed (PI) real-valued non-volume preserving (Real NVP) model for fault detection in space systems. The efficacy of this method is systematically compared against other AI approaches such as Gated Recurrent Unit (GRU) and Autoencoder-based techniques. Results show that our physics-informed approach outperforms existing methods of fault detection, demonstrating its suitability for addressing the unique challenges of satellite EPS sub-system faults. Furthermore, we unveil the competitive advantage of physics-informed loss in AI models to address specific space needs, namely robustness, reliability, and power constraints, crucial for space exploration and satellite missions.
Stable Training of Normalizing Flows for High-dimensional Variational Inference
Variational inference with normalizing flows (NFs) is an increasingly popular alternative to MCMC methods. In particular, NFs based on coupling layers (Real NVPs) are frequently used due to their good empirical performance. In theory, increasing the depth of normalizing flows should lead to more accurate posterior approximations. However, in practice, training deep normalizing flows for approximating high-dimensional posterior distributions is often infeasible due to the high variance of the stochastic gradients. In this work, we show that previous methods for stabilizing the variance of stochastic gradient descent can be insufficient to achieve stable training of Real NVPs. As the source of the problem, we identify that, during training, samples often exhibit unusual high values. As a remedy, we propose a combination of two methods: (1) soft-thresholding of the scale in Real NVPs, and (2) a bijective soft log transformation of the samples. We evaluate these and other previously proposed modification on several challenging target distributions, including a high-dimensional horseshoe logistic regression model. Our experiments show that with our modifications, stable training of Real NVPs for posteriors with several thousand dimensions is possible, allowing for more accurate marginal likelihood estimation via importance sampling. Moreover, we evaluate several common training techniques and architecture choices and provide practical advise for training NFs for high-dimensional variational inference.
Adaptive deep density approximation for Fokker-Planck equations
Tang, Kejun, Wan, Xiaoliang, Liao, Qifeng
In this paper we present a novel adaptive deep density approximation strategy based on KRnet (ADDA-KR) for solving the steady-state Fokker-Planck equation. It is known that this equation typically has high-dimensional spatial variables posed on unbounded domains, which limit the application of traditional grid based numerical methods. With the Knothe-Rosenblatt rearrangement, our newly proposed flow-based generative model, called KRnet, provides a family of probability density functions to serve as effective solution candidates of the Fokker-Planck equation, which have weaker dependence on dimensionality than traditional computational approaches. To result in effective stochastic collocation points for training KRnet, we develop an adaptive sampling procedure, where samples are generated iteratively using KRnet at each iteration. In addition, we give a detailed discussion of KRnet and show that it can efficiently estimate general high-dimensional density functions. We present a general mathematical framework of ADDA-KR, validate its accuracy and demonstrate its efficiency with numerical experiments.
Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment
Usman, Ben, Dufour, Nick, Sud, Avneesh, Saenko, Kate
Unsupervised distribution alignment has many applications in deep learning, including domain adaptation and unsupervised image-to-image translation. Most prior work on unsupervised distribution alignment relies either on minimizing simple non-parametric statistical distances such as maximum mean discrepancy, or on adversarial alignment. However, the former fails to capture the structure of complex real-world distributions, while the latter is difficult to train and does not provide any universal convergence guarantees or automatic quantitative validation procedures. In this paper we propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows. We show that, under certain assumptions, this combination yields a deep neural likelihood-based minimization objective that attains a known lower bound upon convergence. We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.