A Proofs

To make the GAN objectives amenable to gradient-based optimization, we first rewrite them in forms whose derivatives are easy to calculate.

Proposition 1. For any fixed generator, given a data [...]

Proposition 2. For any continuous and differentiable function f whose domain is X, we have: [...] Readers are encouraged to refer to the original proof in [57] for more details.

Theorem 2. Given the optimal classifier [...] Please see Appendix A.2 for details.

Theorem 3. The objective function for the generator of SSGAN-LA, given the optimal label-augmented discriminator, boils down to: [...]

Theorem 4. At the equilibrium point of DAGAN, the optimal generator implies [...] We first prove the first claim of this theorem, and then the second.
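For orientation, the single-discriminator analogue of Theorem 3 is the classical result of Goodfellow et al. (2014), which can be stated precisely; note this is the standard GAN derivation, not the SSGAN-LA-specific objective, so the label-augmented quantities are deliberately left out:

```latex
% Classical GAN analogue of Theorem 3 (Goodfellow et al., 2014), for reference.
% For a fixed generator G, the optimal discriminator is
%     D^*(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)},
% and substituting D^* back into the value function gives
\min_G \; V(G, D^*)
    = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D^*(x)\bigr]
    + \mathbb{E}_{x \sim p_g}\bigl[\log\bigl(1 - D^*(x)\bigr)\bigr]
    = 2\,\mathrm{JSD}\bigl(p_{\mathrm{data}} \,\Vert\, p_g\bigr) - \log 4,
% so the optimal generator minimizes the Jensen-Shannon divergence between
% the real and generated distributions.
```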
Weak-to-Strong Diffusion with Reflection
Bai, Lichen, Sugiyama, Masashi, Xie, Zeke
The goal of diffusion generative models is to align the learned distribution with the real data distribution through gradient score matching. However, inherent limitations in training data quality, modeling strategies, and architectural design lead to an inevitable gap between generated outputs and real data. To reduce this gap, we propose Weak-to-Strong Diffusion (W2SD), a novel framework that utilizes the estimated difference between existing weak and strong models (i.e., the weak-to-strong difference) to approximate the gap between an ideal model and a strong model. By employing a reflective operation that alternates between denoising and inversion with the weak-to-strong difference, we show theoretically that W2SD steers latent variables along sampling trajectories toward regions of the real data distribution. W2SD is highly flexible and broadly applicable, enabling diverse improvements through the strategic selection of weak-to-strong model pairs (e.g., DreamShaper vs. SD1.5, good experts vs. bad experts in MoE). Extensive experiments demonstrate that W2SD significantly improves human preference, aesthetic quality, and prompt adherence, achieving SOTA performance across various modalities (e.g., image, video), architectures (e.g., UNet-based, DiT-based, MoE), and benchmarks. For example, Juggernaut-XL with W2SD achieves an HPSv2 winning rate of up to 90% over the original results. Moreover, the performance gains achieved by W2SD markedly outweigh its additional computational overhead, while the cumulative improvements from different weak-to-strong differences further solidify its practical utility and deployability.
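To make the reflective operation concrete, here is a minimal sketch of one W2SD-style sampling loop, inferred only from the abstract: a strong-model denoising step followed by a weak-model inversion step keeps the latent at the same noise level while injecting the weak-to-strong difference. The functions `strong_denoise_step` and `weak_invert_step` are hypothetical one-step DDIM denoise/inversion operators, not the authors' API.

```python
import torch

def w2sd_reflect(latent: torch.Tensor, t: int,
                 strong_denoise_step, weak_invert_step) -> torch.Tensor:
    """One reflection: denoise t -> t-1 with the strong model, then
    invert t-1 -> t with the weak model. The composed map returns to
    noise level t while nudging the latent by the weak-to-strong
    difference (a sketch, assuming DDIM-style one-step operators)."""
    x_prev = strong_denoise_step(latent, t)   # strong model: x_t -> x_{t-1}
    x_back = weak_invert_step(x_prev, t - 1)  # weak model:   x_{t-1} -> x_t
    return x_back

def w2sd_sample(x_T: torch.Tensor, timesteps,
                strong_denoise_step, weak_invert_step,
                n_reflections: int = 1) -> torch.Tensor:
    """Ordinary strong-model sampling with a reflection at each step."""
    x = x_T
    for t in timesteps:
        for _ in range(n_reflections):        # reflect: stays at level t
            x = w2sd_reflect(x, t, strong_denoise_step, weak_invert_step)
        x = strong_denoise_step(x, t)         # then take the denoising move
    return x
```

The design point the abstract emphasizes is that the two operators come from different models; if the weak and strong models were identical, each reflection would be (approximately) the identity and the trajectory would be unchanged.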
DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models
Zhou, Ying, Wang, Xinyao, Niu, Yulei, Shen, Yaojie, Tang, Lexin, Chen, Fan, He, Ben, Sun, Le, Wen, Longyin
Recent advancements in large language models (LLMs) have significantly enhanced their knowledge and generative capabilities, leading to a surge of interest in leveraging LLMs for high-quality data synthesis. However, synthetic data generation via prompting LLMs remains challenging due to LLMs' limited understanding of target data distributions and the complexity of prompt engineering, especially for structured formatted data. To address these issues, we introduce DiffLM, a controllable data synthesis framework based on a variational autoencoder (VAE), which further (1) leverages diffusion models to preserve more information about the original distribution and format structure in the learned latent distribution and (2) decouples the learning of target distribution knowledge from the LLM's generative objectives via a plug-and-play latent feature injection module. As we observed significant discrepancies between the VAE's latent representations and the real data distribution, the latent diffusion module is introduced into our framework to learn a fully expressive latent distribution. Evaluations on seven real-world datasets with structured formatted data (i.e., Tabular, Code and Tool data) demonstrate that DiffLM generates high-quality data, with performance on downstream tasks surpassing that of real data by 2%-7% in certain cases. The data and code will be publicly available upon completion of internal review.

Data synthesis has become an indispensable technique in current machine learning research, enabling rapid generation and modification of datasets (Bauer et al., 2024) and allowing researchers to experiment with various scenarios and model architectures without the extensive processes associated with real-world data collection. Meanwhile, with the rapid advancements in large language models (LLMs), recent research in natural language processing (NLP) has increasingly focused on leveraging LLMs for synthetic data generation. Early efforts attempted to fine-tune LLMs to align with real data distributions (Keskar et al., 2019; Anaby-Tavor et al., 2020; Borisov et al., 2023). As the in-context learning capabilities of LLMs have improved, some studies have explored zero-shot or few-shot prompting of LLMs to generate synthetic data (Ye et al., 2022a; Wei et al., 2024).
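The three-stage pipeline the abstract describes (VAE encoding, latent diffusion, latent injection into a frozen LLM) can be sketched as below. All names here (`vae`, `latent_diffusion`, `llm`, and their methods) are hypothetical placeholders standing in for the components the abstract names, not the authors' code.

```python
import torch

def difflm_synthesize(records, vae, latent_diffusion, llm, n_samples: int):
    """A minimal sketch of the DiffLM flow described in the abstract:
    (1) encode real structured records into a VAE latent space;
    (2) fit a diffusion model on those latents, since the abstract notes
        the VAE's latent representations diverge from the real data
        distribution and diffusion bridges that gap;
    (3) inject sampled latents into a frozen LLM decoder through a
        plug-and-play feature-injection module."""
    # Step 1: latents of the real data (the VAE is trained beforehand).
    with torch.no_grad():
        z_real = vae.encode(records)

    # Step 2: learn a fully expressive latent distribution, then sample.
    latent_diffusion.fit(z_real)
    z_synth = latent_diffusion.sample(n_samples)

    # Step 3: the injection module keeps the LLM frozen, decoupling
    # distribution learning from the LLM's generative objective.
    return [llm.decode(latent=z) for z in z_synth]
```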