AITopics

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Jun-22-2026, 23:38:20 GMT

Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch

We present an ultra-efficient post-training method for shortcutting large-scale pretrained flow matching diffusion models into efficient few-step samplers, enabled by novel velocity field self-distillation. While shortcutting in flow matching, originally introduced by shortcut models, offers flexible trajectory-skipping capabilities, it requires a specialized step-size embedding incompatible with existing models unless retrained from scratch, a process nearly as costly as pretraining itself. Our key contribution is thus imparting a more aggressive shortcut mechanism to standard flow matching models (e.g., Flux), leveraging a unique distillation principle that obviates the need for step-size embedding.

artificial intelligence, distillation, machine learning

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing Systems, Jun-15-2026, 13:14:56 GMT

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Pre-training equips text-to-image (T2I) models with broad world knowledge, but this alone is often insufficient to achieve high aesthetic quality and alignment. Consequently, supervised fine-tuning (SFT) is crucial for further refinement. However, its effectiveness highly depends on the quality of the fine-tuning dataset. Existing public SFT datasets frequently target narrow domains (e.g., anime or specific art styles), and the creation of high-quality, general-purpose SFT datasets remains a significant challenge. Current curation methods are often costly and struggle to identify truly impactful samples.

large language model, machine learning, natural language

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Transportation (0.46)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Neural Information Processing Systems, Jun-10-2026, 15:03:39 GMT

Fuse2Match: Training-Free Fusion of Flow, Diffusion, and Contrastive Models for Zero-Shot Semantic Matching

Recent work shows that features from Stable Diffusion (SD) and contrastively pretrained models like DINO can be directly used for zero-shot semantic correspondence via naive feature concatenation. In this paper, we explore the stronger potential of Stable Diffusion 3 (SD3), a rectified flow-based model with a multimodal transformer backbone (MM-DiT). We show that semantic signals in SD3 are scattered across multiple timesteps and transformer layers, and propose a multi-level fusion scheme to extract discriminative features. Moreover, we identify that naive fusion across models suffers from inconsistent distributions, thus leading to suboptimal performance. To address this, we propose a simple yet effective confidence-aware feature fusion strategy that re-weights each model's contribution based on prediction confidence scores derived from their matching uncertainties. Notably, this fusion approach is not only training-free but also enables per-pixel adaptive integration of heterogeneous features. The resulting representation, Fuse2Match, significantly outperforms strong baselines on SPair-71k, PF-Pascal, and PSC6K, validating the benefit of combining SD3, SD, and DINO through our proposed confidence-aware feature fusion.
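The confidence-aware re-weighting described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract only says that confidence is derived from matching uncertainty, so the entropy-based confidence, the `temperature` value, and all function names here are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def confidence_from_similarity(sim, temperature=0.05):
    """Per-query confidence in [0, 1]: 1 minus the normalized entropy of the
    match distribution (an assumed proxy for matching uncertainty)."""
    p = softmax(sim / temperature, axis=-1)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return 1.0 - entropy / np.log(sim.shape[-1])

def fuse_similarities(sims, temperature=0.05):
    """sims: list of (num_query, num_target) similarity maps, one per model.
    Returns the fused map and the per-model, per-query weights."""
    conf = np.stack([confidence_from_similarity(s, temperature) for s in sims])  # (M, Q)
    weights = conf / (conf.sum(axis=0, keepdims=True) + 1e-12)                   # (M, Q)
    fused = (weights[..., None] * np.stack(sims)).sum(axis=0)                    # (Q, T)
    return fused, weights

# One query pixel, four candidate targets: model A is peaked, model B is flat.
sim_a = np.array([[0.9, 0.1, 0.1, 0.1]])
sim_b = np.array([[0.5, 0.5, 0.5, 0.5]])
fused, weights = fuse_similarities([sim_a, sim_b])
```

Because the weights are computed per query location, a model that is uncertain at one pixel can still dominate at another, which is the per-pixel adaptive behavior the abstract refers to.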

large language model, machine learning, natural language

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.63)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)

Neural Information Processing Systems, Feb-9-2026, 06:33:38 GMT

Appendix for Softmax Deep Double Deterministic Policy Gradients

Ling Pan

We demonstrate the smoothing effect of SD3 on the optimization landscape in this section, where the experimental setup is the same as in Section 4.1 in the text for the comparative study of SD2 and [...] Experimental details can be found in Section B.2. The performance comparison of SD3 and TD3 is shown in Figure 1(a), where SD3 significantly outperforms TD3. So far, we have demonstrated the smoothing effect of SD3 over TD3. Hyperparameters of DDPG and SD2 are summarized in Table 1. Assume that the actor is a local maximizer with respect to the critic.

artificial intelligence, machine learning, sd3

Country:

North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)

Neural Information Processing Systems, Feb-9-2026, 06:33:31 GMT

884d247c6f65a96a7da4d1105d584ddd-Paper.pdf

DDPG [24] extends Q-learning to continuous control based on the Deterministic Policy Gradient [31] algorithm, which learns a deterministic policy π(s; φ) parameterized by φ to maximize the Q-function to approximate the max operator.
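The idea that a learned deterministic policy stands in for the max operator can be shown on a toy problem. The sketch below is an assumption-laden illustration, not the paper's algorithm: the quadratic critic, the linear policy, and the names `q_value`, `policy`, and `train_policy` are all hypothetical, and the critic is fixed rather than learned.

```python
import numpy as np

def q_value(state, action):
    """Toy critic (assumed for illustration): maximized at action == state."""
    return -(action - state) ** 2

def policy(state, phi):
    """Deterministic linear policy pi(s; phi) = phi * s."""
    return phi * state

def train_policy(states, phi=0.0, lr=0.1, steps=200):
    """Gradient ascent on the mean of Q(s, pi(s; phi)) with respect to phi."""
    for _ in range(steps):
        actions = policy(states, phi)
        dq_da = -2.0 * (actions - states)     # dQ/da at a = pi(s; phi)
        da_dphi = states                      # d pi / d phi
        phi += lr * np.mean(dq_da * da_dphi)  # chain rule, as in the deterministic policy gradient
    return phi

states = np.linspace(-1.0, 1.0, 11)
phi = train_policy(states)

# Compare Q(s, pi(s)) with a brute-force max over a dense action grid.
grid = np.linspace(-2.0, 2.0, 401)
max_q = np.array([q_value(s, grid).max() for s in states])
policy_q = q_value(states, policy(states, phi))
```

In continuous action spaces the brute-force grid search is not practical, which is why the policy output is used in place of the max when forming the bootstrapped target.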

artificial intelligence, machine learning, reinforcement learning

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

arXiv.org Artificial Intelligence, Dec-4-2025

Score Distillation of Flow Matching Models

Zhou, Mingyuan, Gu, Yi, Zheng, Huangjie, Song, Liangchen, He, Guande, Zhang, Yizhe, Hu, Wenze, Yang, Yinfei

Diffusion models achieve high-quality image generation but are limited by slow iterative sampling. Distillation methods alleviate this by enabling one- or few-step generation. Flow matching, originally introduced as a distinct framework, has since been shown to be theoretically equivalent to diffusion under Gaussian assumptions, raising the question of whether distillation techniques such as score distillation transfer directly. We provide a simple derivation -- based on Bayes' rule and conditional expectations -- that unifies Gaussian diffusion and flow matching without relying on ODE/SDE formulations. Building on this view, we extend Score identity Distillation (SiD) to pretrained text-to-image flow-matching models, including SANA, SD3-Medium, SD3.5-Medium/Large, and FLUX.1-dev, all with DiT backbones. Experiments show that, with only modest flow-matching- and DiT-specific adjustments, SiD works out of the box across these models, in both data-free and data-aided settings, without requiring teacher finetuning or architectural changes. This provides the first systematic evidence that score distillation applies broadly to text-to-image flow matching models, resolving prior concerns about stability and soundness and unifying acceleration techniques across diffusion- and flow-based generators. A project page is available at https://yigu1008.github.io/SiD-DiT.
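One concrete face of the diffusion/flow-matching correspondence is that a velocity prediction can be converted into a clean-sample or noise estimate. The sketch below assumes the linear interpolation convention x_t = (1 - t) x0 + t eps with velocity target v = eps - x0, which is a common rectified-flow convention but is not stated in the abstract; it illustrates the algebra only and is not the paper's derivation.

```python
import numpy as np

def interpolate(x0, eps, t):
    """Assumed forward process: x_t = (1 - t) * x0 + t * eps."""
    return (1.0 - t) * x0 + t * eps

def velocity_target(x0, eps):
    """Velocity along the straight path from data to noise."""
    return eps - x0

def x0_from_velocity(x_t, v, t):
    """Recover the clean sample: x0 = x_t - t * v."""
    return x_t - t * v

def eps_from_velocity(x_t, v, t):
    """Recover the noise: eps = x_t + (1 - t) * v."""
    return x_t + (1.0 - t) * v

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 3))
eps = rng.normal(size=(4, 3))
t = 0.3
x_t = interpolate(x0, eps, t)
v = velocity_target(x0, eps)
```

Since both conversions are linear in v, the same identities hold for conditional expectations given x_t, so a network trained to predict velocity implicitly provides denoised-sample and noise estimates as well.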

artificial intelligence, distillation, machine learning

2509.25127

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report (0.64)

Industry: Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

arXiv.org Artificial Intelligence, Nov-21-2025

Parameter-aware high-fidelity microstructure generation using stable diffusion

Phan, Hoang Cuong, Tran, Minh Tien, Lee, Chihun, Kim, Hoheok, Oh, Sehyeok, Kim, Dong-Kyu, Lee, Ho Won

Synthesizing realistic microstructure images conditioned on processing parameters is crucial for understanding process-structure relationships in materials design. However, this task remains challenging due to limited training micrographs and the continuous nature of processing variables. To overcome these challenges, we present a novel process-aware generative modeling approach based on Stable Diffusion 3.5 Large (SD3.5-Large), a state-of-the-art text-to-image diffusion model adapted for microstructure generation. Our method introduces numeric-aware embeddings that encode continuous variables (annealing temperature, time, and magnification) directly into the model's conditioning, enabling controlled image generation under specified process conditions and capturing process-driven microstructural variations. To address data scarcity and computational constraints, we fine-tune only a small fraction of the model's weights via DreamBooth and Low-Rank Adaptation (LoRA), efficiently transferring the pre-trained model to the materials domain. We validate realism using a semantic segmentation model based on a fine-tuned U-Net with a VGG16 encoder on 24 labeled micrographs. It achieves 97.1% accuracy and 85.7% mean IoU, outperforming previous methods. Quantitative analyses using physical descriptors and spatial statistics show strong agreement between synthetic and real microstructures. Specifically, two-point correlation and lineal-path errors remain below 2.1% and 0.6%, respectively. Our method represents the first adaptation of SD3.5-Large for process-aware microstructure generation, offering a scalable approach for data-driven materials design.
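To make concrete why LoRA touches only a small fraction of the weights, here is a minimal NumPy sketch of a low-rank adapted linear layer. It is a generic illustration of the LoRA idea, not the paper's training setup; the class name `LoRALinear`, the rank, and the `alpha` scaling value are assumptions.

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                    # frozen pre-trained weight
        self.A = rng.normal(scale=0.01, size=(rank, in_dim))    # trainable
        self.B = np.zeros((out_dim, rank))                      # trainable, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x):
        base = x @ self.weight.T
        update = self.scale * (x @ self.A.T) @ self.B.T
        return base + update

    def trainable_fraction(self):
        lora = self.A.size + self.B.size
        return lora / (lora + self.weight.size)

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 64))
y = layer(x)
```

With B initialized to zero the adapted layer starts out identical to the pre-trained one, and for this 64 x 64 example the adapter holds 512 of 4608 parameters, roughly 11%; the fraction shrinks further as the layer dimensions grow relative to the rank.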

artificial intelligence, deep learning, machine learning

doi: 10.1016/j.aei.2025.104080

2507.00459

Country: Asia > South Korea (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Materials > Metals & Mining > Steel (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Oct-28-2025

Flow-GRPO: Training Flow Matching Models via Online RL

Liu, Jie, Liu, Gongye, Liang, Jiajun, Li, Yangguang, Liu, Jiaheng, Wang, Xintao, Wan, Pengfei, Zhang, Di, Ouyang, Wanli

We propose Flow-GRPO, the first method to integrate online policy gradient reinforcement learning (RL) into flow matching models. Our approach uses two key strategies: (1) an ODE-to-SDE conversion that transforms a deterministic Ordinary Differential Equation (ODE) into an equivalent Stochastic Differential Equation (SDE) that matches the original model's marginal distribution at all timesteps, enabling statistical sampling for RL exploration; and (2) a Denoising Reduction strategy that reduces training denoising steps while retaining the original number of inference steps, significantly improving sampling efficiency without sacrificing performance. Empirically, Flow-GRPO is effective across multiple text-to-image tasks. For compositional generation, RL-tuned SD3.5-M generates nearly perfect object counts, spatial relations, and fine-grained attributes, increasing GenEval accuracy from 63% to 95%. In visual text rendering, accuracy improves from 59% to 92%, greatly enhancing text generation. Flow-GRPO also achieves substantial gains in human preference alignment. Notably, very little reward hacking occurred, meaning rewards did not increase at the cost of appreciable image quality or diversity degradation.
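The abstract does not spell out the GRPO update itself. As GRPO is commonly described, each sample is scored relative to the other samples generated for the same prompt, which is why stochastic (SDE) sampling is needed to obtain a diverse group. The sketch below shows only that group-relative advantage step; the function name and the `eps` stabilizer are assumptions, and the reward values are made up for illustration.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """rewards: array of shape (num_prompts, group_size).
    Each reward is standardized against its own group's mean and std."""
    rewards = np.asarray(rewards, dtype=float)
    mean = rewards.mean(axis=-1, keepdims=True)
    std = rewards.std(axis=-1, keepdims=True)
    return (rewards - mean) / (std + eps)

# Two prompts, four sampled images each (illustrative reward values).
rewards = [[1.0, 0.0, 0.0, 1.0],
           [0.2, 0.4, 0.6, 0.8]]
advantages = group_relative_advantages(rewards)
```

Samples that beat their group average receive positive advantages and are reinforced, while the normalization keeps the scale comparable across prompts of different difficulty.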

arxiv preprint arxiv, large language model, machine learning

2505.0547

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Oct-22-2025

Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch

Cai, Xu, Wu, Yang, Chen, Qianli, Wu, Haoran, Xiang, Lichuan, Wen, Hongkai

We present an ultra-efficient post-training method for shortcutting large-scale pre-trained flow matching diffusion models into efficient few-step samplers, enabled by novel velocity field self-distillation. While shortcutting in flow matching, originally introduced by shortcut models, offers flexible trajectory-skipping capabilities, it requires a specialized step-size embedding incompatible with existing models unless retrained from scratch, a process nearly as costly as pretraining itself. Our key contribution is thus imparting a more aggressive shortcut mechanism to standard flow matching models (e.g., Flux), leveraging a unique distillation principle that obviates the need for step-size embedding. Working on the velocity field rather than sample space and learning rapidly from self-guided distillation in an online manner, our approach trains efficiently, e.g., producing a 3-step Flux in less than one A100 day. Beyond distillation, our method can be incorporated into the pretraining stage itself, yielding models that inherently learn efficient, few-step flows without compromising quality. This capability also enables, to our knowledge, the first few-shot distillation method (e.g., 10 text-image pairs) for dozen-billion-parameter diffusion models, delivering state-of-the-art performance at almost free cost.
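Why the shape of the velocity field matters for few-step sampling can be seen with a plain Euler sampler on toy fields. This sketch does not implement the paper's self-distillation; the sampler, the time convention (t = 1 is noise, t = 0 is data), and both toy velocity fields are assumptions chosen for illustration.

```python
import numpy as np

def euler_sample(velocity_fn, x_start, num_steps):
    """Integrate dx/dt = v(x, t) from t = 1 down to t = 0 with uniform Euler steps."""
    x = np.array(x_start, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * dt          # never reaches 0 inside the loop
        x = x - dt * velocity_fn(x, t)
    return x

def straight_velocity(target):
    """Field whose trajectories are straight lines ending at `target`."""
    def v(x, t):
        return (x - target) / t
    return v

def curved_velocity(x, t):
    """Field with curved (exponential) trajectories: dx/dt = x."""
    return x

target = np.array([2.0, -1.0])
noise = np.array([0.5, 0.25])

one_step = euler_sample(straight_velocity(target), noise, 1)   # lands on target
exact_curved = noise * np.exp(-1.0)                            # closed-form solution at t = 0
curved_errors = [np.abs(euler_sample(curved_velocity, noise, n) - exact_curved).max()
                 for n in (1, 4, 64)]
```

On the straight field a single Euler step is already exact, whereas the curved field needs many steps for the error to shrink, which is the gap that few-step distillation methods aim to close.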

artificial intelligence, distillation, machine learning

2510.17858

Country: Europe > Switzerland (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)