Appendix of "Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection"
Neural Information Processing Systems
The hyperparameters β and α are set to 0.5 and 0.01 for each layer.
CIFAR-10: The image size is set to 224 × 224, and the feature size is 14 × 14. The encoder and decoder each have 4 layers. The hyperparameters β and α are set to 0.5 and 0.01 for each layer.
The ELBO of our variational autoencoder should include both a reconstruction likelihood and a KL term. The KL term in the Evidence Lower Bound (ELBO) is constant, however, so the KL divergence can be ignored for training.
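The remark that the KL term is constant can be checked numerically with a small sketch. It assumes the standard vector-quantization setting, which this excerpt does not spell out: a uniform prior over the codebook and a one-hot (deterministic) posterior that selects a single code. Under those assumptions the KL divergence equals log K for a codebook of K entries, whichever code is selected. The function name and the codebook size `NUM_CODES` are hypothetical and are not taken from the paper.

```python
import math


def kl_onehot_vs_uniform(selected: int, num_codes: int) -> float:
    """KL(q || p) for a one-hot posterior q and a uniform prior p.

    Assumption (not stated in the excerpt): q puts all mass on `selected`,
    and p assigns probability 1 / num_codes to every codebook entry.
    """
    p = 1.0 / num_codes
    kl = 0.0
    for k in range(num_codes):
        q = 1.0 if k == selected else 0.0
        if q > 0.0:  # terms with q = 0 contribute 0 by convention
            kl += q * math.log(q / p)
    return kl


NUM_CODES = 512  # hypothetical codebook size, for illustration only

# The value does not depend on which code the encoder picks.
kl_values = [kl_onehot_vs_uniform(k, NUM_CODES) for k in (0, 17, NUM_CODES - 1)]
expected = math.log(NUM_CODES)
```

Because every entry of `kl_values` matches `expected`, the term contributes no gradient under these assumptions, which is consistent with dropping it from the training objective.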
Feb-8-2026, 12:25:22 GMT