sdxl-turbo
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
Controlling Latent Diffusion Using Latent CLIP
Becker, Jason, Wendler, Chris, Baylies, Peter, West, Robert, Wressnegger, Christian
Instead of performing text-conditioned denoising in the image domain, latent diffusion models (LDMs) operate in the latent space of a variational autoencoder (VAE), enabling more efficient processing at reduced computational cost. However, while the diffusion process has moved to the latent space, the contrastive language-image pre-training (CLIP) models used in many image processing tasks still operate in pixel space. Applying them to latent images therefore requires costly VAE decoding before processing. In this paper, we introduce Latent-CLIP, a CLIP model that operates directly in the latent space. We train Latent-CLIP on 2.7B pairs of latent images and descriptive texts, and show that it matches the zero-shot classification performance of similarly sized CLIP models on both the ImageNet benchmark and an LDM-generated version of it, demonstrating its effectiveness in assessing both real and generated content. Furthermore, we construct Latent-CLIP rewards for reward-based noise optimization (ReNO) and show that they match the performance of their CLIP counterparts on GenEval and T2I-CompBench while cutting the cost of the total pipeline by 21%. Finally, we use Latent-CLIP to guide generation away from harmful content, achieving strong performance on the inappropriate image prompts (I2P) benchmark and a custom evaluation, without ever requiring the costly step of decoding intermediate images.
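As a rough illustration of the pipeline this abstract describes, the sketch below encodes images once with the SDXL VAE (via the diffusers library) and scores the resulting latents against normalized text embeddings, CLIP-style, without ever decoding back to pixels. `LatentVisionTower` is a hypothetical stand-in for the trained Latent-CLIP image encoder, not a released checkpoint, and the random inputs merely stand in for real images and class prompts.

```python
# Minimal sketch of latent-space zero-shot classification, assuming a
# Latent-CLIP-style encoder over VAE latents. LatentVisionTower is a
# hypothetical placeholder, not the authors' released model.
import torch
import torch.nn as nn
import torch.nn.functional as F
from diffusers import AutoencoderKL

class LatentVisionTower(nn.Module):
    """Hypothetical CLIP-style vision tower over 4-channel VAE latents."""
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, latents: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(latents), dim=-1)

@torch.no_grad()
def classify_in_latent_space(pixels, text_embeds, vae, tower):
    """pixels: (B, 3, H, W) in [-1, 1]; text_embeds: (C, D), L2-normalized."""
    # Encode once with the VAE; no decoding step is ever needed.
    latents = vae.encode(pixels).latent_dist.mean * vae.config.scaling_factor
    image_embeds = tower(latents)                   # (B, D)
    logits = 100.0 * image_embeds @ text_embeds.T   # CLIP-style similarity
    return logits.softmax(dim=-1)

vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae").eval()
tower = LatentVisionTower().eval()  # would load trained Latent-CLIP weights
probs = classify_in_latent_space(
    torch.randn(2, 3, 256, 256).clamp(-1, 1),   # dummy images
    F.normalize(torch.randn(10, 512), dim=-1),  # e.g. 10 class prompts
    vae, tower,
)
```

The reported 21% pipeline saving comes precisely from skipping the VAE decode in loops like the reward evaluation above, where intermediate latents would otherwise be decoded to pixels on every step.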
- North America > United States > Indiana > Marion County > Lawrence (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Empowering Backbone Models for Visual Text Generation with Input Granularity Control and Glyph-Aware Training
Li, Wenbo, Li, Guohao, Lan, Zhibin, Xu, Xue, Zhuang, Wanru, Liu, Jiachen, Xiao, Xinyan, Su, Jinsong
Diffusion-based text-to-image models have achieved impressive diversity and aesthetics but struggle to generate images with legible visual text. Existing backbone models exhibit limitations such as misspelled words, failure to render text at all, and a lack of support for Chinese text, yet they show promising potential. In this paper, we propose a series of methods aiming to empower backbone models to generate visual text in English and Chinese. We first conduct a preliminary study revealing that Byte Pair Encoding (BPE) tokenization and insufficient learning of the cross-attention modules restrict the performance of the backbone models. Based on these observations, we make the following improvements: (1) we design a mixed-granularity input strategy to provide more suitable text representations; (2) we augment the conventional training objective with three glyph-aware training losses, which enhance the learning of the cross-attention modules and encourage the model to focus on visual text. Through experiments, we demonstrate that our methods effectively empower backbone models to generate semantically relevant, aesthetically appealing, and accurate visual text images, while maintaining their fundamental image generation quality.
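To make the "glyph-aware training losses" idea concrete, the sketch below augments the standard noise-prediction objective with one mask-weighted term that penalizes errors inside text regions more heavily. The abstract names three such losses without specifying them, so this single re-weighted reconstruction term is an illustrative assumption, not the authors' exact formulation; `glyph_mask` is a hypothetical input marking latent positions covered by rendered glyphs.

```python
# Sketch of a glyph-aware auxiliary loss added to the conventional
# diffusion objective. Illustrative assumption only: the paper proposes
# three losses whose exact forms are not given in the abstract.
import torch
import torch.nn.functional as F

def glyph_aware_loss(eps_pred, eps_true, glyph_mask, lambda_glyph=1.0):
    """eps_*: (B, C, h, w) noise tensors; glyph_mask: (B, 1, h, w) in {0, 1}."""
    # Conventional objective: predict the added noise everywhere.
    base = F.mse_loss(eps_pred, eps_true)
    # Glyph-aware term: re-weight the same residual inside text regions,
    # so errors on glyph pixels contribute extra penalty.
    masked_sq_err = ((eps_pred - eps_true) ** 2) * glyph_mask
    glyph = masked_sq_err.sum() / glyph_mask.sum().clamp(min=1.0)
    return base + lambda_glyph * glyph

# Toy usage with random tensors standing in for a UNet's prediction.
eps_true = torch.randn(2, 4, 32, 32)
eps_pred = eps_true + 0.1 * torch.randn_like(eps_true)
mask = (torch.rand(2, 1, 32, 32) > 0.8).float()
loss = glyph_aware_loss(eps_pred, eps_true, mask)
```

Because the extra term reuses the same residual, it steers gradient signal (and hence the cross-attention modules) toward text regions without changing the objective elsewhere.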
- Asia > China > Fujian Province > Xiamen (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
- Information Technology > Sensing and Signal Processing > Image Processing (0.89)