Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think
