Goto

Collaborating Authors

 ld3m


Unlocking Dataset Distillation with Diffusion Models

Neural Information Processing Systems

Dataset distillation seeks to condense datasets into smaller but highly representative synthetic samples. While diffusion models now lead all generative benchmarks, current distillation methods avoid them and rely instead on GANs or autoencoders, or, at best, sampling from a fixed diffusion prior. This trend arises because naive backpropagation through the long denoising chain leads to vanishing gradients, which prevents effective synthetic sample optimization. To address this limitation, we introduce Latent Dataset Distillation with Diffusion Models (LD3M), the first method to learn gradient-based distilled latents and class embeddings endto-end through a pre-trained latent diffusion model. A linearly decaying skip connection, injected from the initial noisy state into every reverse step, preserves the gradient signal across dozens of timesteps without requiring diffusion weight fine-tuning. Across multiple ImageNet subsets at 128 128and 256 256, LD3M improves downstream accuracy by up to 4.8 percentage points (1 IPC) and 4.2 points (10 IPC) over the prior state-of-the-art.


Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning

arXiv.org Artificial Intelligence

Dataset distillation has gained significant interest in recent years, yet existing approaches typically distill from the entire dataset, potentially including non-beneficial samples. We introduce a novel "Prune First, Distill After" framework that systematically prunes datasets via loss-based sampling prior to distillation. By leveraging pruning before classical distillation techniques and generative priors, we create a representative core-set that leads to enhanced generalization for unseen architectures - a significant challenge of current distillation methods. More specifically, our proposed framework significantly boosts distilled quality, achieving up to a 5.2 percentage points accuracy increase even with substantial dataset pruning, i.e., removing 80% of the original dataset prior to distillation. Overall, our experimental results highlight the advantages of our easy-sample prioritization and cross-architecture robustness, paving the way for more effective and high-quality dataset distillation.


Latent Dataset Distillation with Diffusion Models

arXiv.org Artificial Intelligence

Machine learning traditionally relies on increasingly larger datasets. Yet, such datasets pose major storage challenges and usually contain non-influential samples, which could be ignored during training without negatively impacting the training quality. In response, the idea of distilling a dataset into a condensed set of synthetic samples, i.e., a distilled dataset, emerged. One key aspect is the selected architecture, usually ConvNet, for linking the original and synthetic datasets. However, the final accuracy is lower if the employed model architecture differs from that used during distillation. Another challenge is the generation of high-resolution images (128x128 and higher). To address both challenges, this paper proposes Latent Dataset Distillation with Diffusion Models (LD3M) that combine diffusion in latent space with dataset distillation. Our novel diffusion process is tailored for this task and significantly improves the gradient flow for distillation. By adjusting the number of diffusion steps, LD3M also offers a convenient way of controlling the trade-off between distillation speed and dataset quality. Overall, LD3M consistently outperforms state-of-the-art methods by up to 4.8 p.p. and 4.2 p.p. for 1 and 10 images per class, respectively, and on several ImageNet subsets and high resolutions (128x128 and 256x256).