AITopics | dreambench-v2


dreambench-v2


A Supplementary Material

Neural Information Processing Systems | Feb-12-2026, 14:23:35 GMT

These challenges have spawned the new task of 'Subject-Driven Text-to-Image Generation', which is the core task our paper aims to solve. Though the mined clusters already contain (image, alt-text) information, the alt-text's noise level is … For example, the generation model believes 'teapot' should contain a … 's in-context generation that demonstrates its skill set. Results generated from a single model. Subject (image, text) and editing keywords are annotated, with a detailed template in the Appendix. Such a manual modification process is time-consuming.

artificial intelligence, backpack, caption, (18 more...)

Neural Information Processing Systems

Industry:

Transportation > Ground > Rail (0.31)
Transportation > Infrastructure & Services (0.30)

Technology: Information Technology > Artificial Intelligence > Vision (0.89)


Subject-driven Text-to-Image Generation via Apprenticeship Learning. Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, William W. Cohen. Google DeepMind

Neural Information Processing Systems | Feb-12-2026, 14:23:31 GMT

Subject-driven image generation is related to text-driven image editing but often needs to perform more sophisticated transformations to source images (e.g., rotating the view, zooming in/out, changing the pose of

artificial intelligence, machine learning, suti, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Asia > Middle East > Israel (0.04)

Industry: Media (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


6091bf1542b118287db4088bc16be8d9-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 19:01:10 GMT

artificial intelligence, caption, dreambench-v2, (18 more...)

Neural Information Processing Systems

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)

Industry:

Transportation > Ground > Rail (0.31)
Transportation > Infrastructure & Services (0.30)

Technology: Information Technology > Artificial Intelligence > Vision (0.49)


Subject-driven Text-to-Image Generation via Apprenticeship Learning. Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, William W. Cohen. Google DeepMind

Neural Information Processing Systems | Oct-8-2025, 19:01:06 GMT

artificial intelligence, machine learning, suti, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Asia > Middle East > Israel (0.04)

Industry: Media (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Subject-driven Text-to-Image Generation via Apprenticeship Learning

Chen, Wenhu, Hu, Hexiang, Li, Yandong, Ruiz, Nataniel, Jia, Xuhui, Chang, Ming-Wei, Cohen, William W.

arXiv.org Artificial Intelligence | Oct-2-2023

Recent text-to-image generation models like DreamBooth have made remarkable progress in generating highly customized images of a target subject by fine-tuning an "expert model" for a given subject from a few examples. However, this process is expensive, since a new expert model must be learned for each subject. In this paper, we present SuTI, a Subject-driven Text-to-Image generator that replaces subject-specific fine-tuning with in-context learning. Given a few demonstrations of a new subject, SuTI can instantly generate novel renditions of the subject in different scenes, without any subject-specific optimization. SuTI is powered by apprenticeship learning, where a single apprentice model is learned from data generated by a massive number of subject-specific expert models. Specifically, we mine millions of image clusters from the Internet, each centered around a specific visual subject. We adopt these clusters to train a massive number of expert models, each specializing in a different subject. The apprentice model SuTI then learns to imitate the behavior of these fine-tuned experts. SuTI can generate high-quality and customized subject-specific images 20x faster than optimization-based SoTA methods. On the challenging DreamBench and DreamBench-v2, our human evaluation shows that SuTI significantly outperforms existing models like InstructPix2Pix, Textual Inversion, Imagic, Prompt2Prompt, Re-Imagen and DreamBooth, especially on the subject and text alignment aspects.

dreambooth, expert model, suti, (14 more...)

arXiv.org Artificial Intelligence

arXiv:2304.00186

Country:

Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.40)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
