Appendix A Implementation details
Neural Information Processing Systems
A.1 Model implementation details

The Swap-VAE consists of an encoder and a symmetric decoder. The encoder contains three linear layers with output sizes [d, k, k]; all but the last layer are followed by batch normalization (eps = 0.00005, momentum = 0.1) and a ReLU activation. The decoder contains three linear layers with output sizes [k, k, d], where, as in the encoder, all but the last layer are followed by batch normalization and a ReLU activation. The last layer of the decoder is followed by a Softplus activation with beta = 1 and threshold = 20. For the synthetic experiments, all models are trained on an Nvidia Titan RTX GPU for 100,000 iterations using the Adam optimizer with a learning rate of 0.0001.
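The architecture above can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' released code; the concrete values of d (input dimension) and k (latent width) are placeholders chosen here for demonstration.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only: d = input dimension, k = latent width.
d, k = 128, 64

# Encoder: three linear layers with output sizes [d, k, k]; all but the last
# layer are followed by BatchNorm1d (eps=0.00005, momentum=0.1) and ReLU.
encoder = nn.Sequential(
    nn.Linear(d, d),
    nn.BatchNorm1d(d, eps=0.00005, momentum=0.1),
    nn.ReLU(),
    nn.Linear(d, k),
    nn.BatchNorm1d(k, eps=0.00005, momentum=0.1),
    nn.ReLU(),
    nn.Linear(k, k),
)

# Decoder mirrors the encoder with output sizes [k, k, d]; the final layer
# is followed by Softplus(beta=1, threshold=20), keeping outputs non-negative.
decoder = nn.Sequential(
    nn.Linear(k, k),
    nn.BatchNorm1d(k, eps=0.00005, momentum=0.1),
    nn.ReLU(),
    nn.Linear(k, k),
    nn.BatchNorm1d(k, eps=0.00005, momentum=0.1),
    nn.ReLU(),
    nn.Linear(k, d),
    nn.Softplus(beta=1, threshold=20),
)

# Adam optimizer with the stated learning rate of 0.0001.
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=0.0001
)

# Forward pass on a toy batch of 8 samples.
x = torch.rand(8, d)
z = encoder(x)
x_hat = decoder(z)
```

The Softplus output activation is a natural fit here because the reconstruction target (e.g. spike counts or firing rates) is non-negative.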