SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
