MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
Berman, William; Peysakhovich, Alexander
We train a model to generate images from multimodal prompts of interleaved text and images such as "a
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)