MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

Apr-29-2026, 13:58:00 GMT - Neural Information Processing Systems

The recent popularity of text-to-image diffusion models (DMs) can largely be attributed to the intuitive interface they provide to users. The intended generation can be expressed in natural language, with the model producing faithful interpretations of text prompts. However, expressing complex or nuanced ideas in text alone can be difficult.

artificial intelligence, machine learning, natural language

Country:
- Europe (0.28)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Natural Language (1.00)
    - Machine Learning > Neural Networks (0.86)

Authors: Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich
