Unconditional Image-Text Pair Generation with Multimodal Cross Quantizer
