AITopics | perceptual loss

Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator

Neural Information Processing Systems | Jun-22-2026, 16:02:17 GMT

Diffusion models have demonstrated excellent performance for real-world image super-resolution (Real-ISR), albeit at high computational costs. Most existing methods are trying to derive one-step diffusion models from multi-step counterparts through knowledge distillation (KD) or variational score distillation (VSD). However, these methods are limited by the capabilities of the teacher model, especially if the teacher model itself is not sufficiently strong. To tackle these issues, we propose a new One-Step Diffusion model with a larger-scale Diffusion Discriminator for SR, called D3SR. Our discriminator is able to distill noisy features from any time step of diffusion models in the latent space. In this way, our diffusion discriminator breaks through the potential limitations imposed by the presence of a teacher model. Additionally, we improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details. Our experiments demonstrate that, compared with previous diffusion-based methods requiring dozens or even hundreds of steps, our D3SR attains comparable or even superior results in both quantitative metrics and qualitative evaluations. Moreover, compared with other methods, D3SR achieves at least 3× faster inference speed and reduces parameters by at least 30%.
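The abstract does not spell out how EA-DISTS is computed. As a rough illustration of what an edge-aware perceptual loss can look like, the sketch below adds a distance on Sobel edge maps to the same distance on the images themselves. This formulation is an assumption, mean squared error stands in for DISTS, and every function name here is hypothetical rather than taken from the paper.

```python
import numpy as np

# Sobel kernels for horizontal and vertical intensity gradients.
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def filter2d(image, kernel):
    """Valid-mode 2D cross-correlation of a grayscale image with a small kernel."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out.shape[0], j:j + out.shape[1]]
    return out


def sobel_edges(image):
    """Gradient-magnitude edge map of a grayscale image."""
    gx = filter2d(image, SOBEL_X)
    gy = filter2d(image, SOBEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2)


def stand_in_distance(a, b):
    """Placeholder for a perceptual metric such as DISTS (assumption: plain MSE)."""
    return float(np.mean((a - b) ** 2))


def edge_aware_distance(pred, target, distance=stand_in_distance):
    """Distance on the images plus the same distance on their edge maps."""
    return distance(pred, target) + distance(sobel_edges(pred), sobel_edges(target))
```

Because the Sobel kernels sum to zero, a uniform brightness shift leaves the edge term unchanged, so that term responds only to differences in local structure. In practice the `distance` argument would be replaced by a learned perceptual metric.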

artificial intelligence, diffusion model, machine learning, (17 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

39d0a8908fbe6c18039ea8227f827023-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 12:23:56 GMT

artificial intelligence, participant, sketch, (16 more...)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.47)

Industry: Leisure & Entertainment > Games (0.47)

Technology: Information Technology > Artificial Intelligence (0.70)

Learning to Draw: Emergent Communication through Sketching

Neural Information Processing Systems | Apr-25-2026, 12:23:49 GMT

Evidence that visual communication preceded written language and provided a basis for it goes back to prehistory, in forms such as cave and rock paintings depicting traces of our distant ancestors. Emergent communication research has sought to explore how agents can learn to communicate in order to collaboratively solve tasks. Existing research has focused on language, with a learned communication channel transmitting sequences of discrete tokens between the agents. In this work, we explore a visual communication channel between agents that are allowed to draw with simple strokes. Our agents are parameterised by deep neural networks, and the drawing procedure is differentiable, allowing for end-to-end training. In the framework of a referential communication game, we demonstrate that agents can not only successfully learn to communicate by drawing, but with appropriate inductive biases, can do so in a fashion that humans can interpret. We hope to encourage future research to consider visual communication as a more flexible and directly interpretable alternative for training collaborative agents.
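The abstract states that the drawing procedure is differentiable but does not describe the rasteriser. One common relaxation, assumed here purely for illustration, renders a stroke as a smooth function of each pixel's distance to the line segment, so pixel intensities change continuously as the endpoints move. The function name and its parameters are hypothetical.

```python
import numpy as np


def soft_line_raster(p0, p1, size=16, sigma=1.0):
    """Render the segment p0 -> p1 as a smooth image of shape (size, size).

    Each pixel gets intensity exp(-d^2 / sigma^2), where d is its Euclidean
    distance to the segment. Points are given in (x, y) order; the returned
    array is indexed as image[y, x].
    """
    p0 = np.asarray(p0, dtype=float)
    p1 = np.asarray(p1, dtype=float)
    ys, xs = np.mgrid[0:size, 0:size]
    pixels = np.stack([xs, ys], axis=-1).astype(float)  # shape (size, size, 2)
    seg = p1 - p0
    seg_len_sq = max(float(seg @ seg), 1e-12)  # guard against zero-length strokes
    # Projection parameter of each pixel onto the segment, clamped to [0, 1].
    t = np.clip(((pixels - p0) @ seg) / seg_len_sq, 0.0, 1.0)
    closest = p0 + t[..., None] * seg
    dist_sq = np.sum((pixels - closest) ** 2, axis=-1)
    return np.exp(-dist_sq / sigma ** 2)
```

Calling `soft_line_raster((2, 2), (12, 2))` yields a horizontal stroke whose brightness decays smoothly away from the line. Written in an automatic-differentiation framework, the same expression would let gradients flow from the image back to the stroke endpoints.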

artificial intelligence, deep learning, machine learning, (18 more...)

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

FINALLY: fast and universal speech enhancement with studio-like quality

Neural Information Processing Systems | Mar-17-2026, 19:56:09 GMT

In this paper, we address the challenge of speech enhancement in real-world recordings, which often contain various forms of distortion, such as background noise, reverberation, and microphone artifacts. We revisit the use of Generative Adversarial Networks (GANs) for speech enhancement and theoretically show that GANs are naturally inclined to seek the point of maximum density within the conditional clean speech distribution, which, as we argue, is essential for the speech enhancement task. We study various feature extractors for perceptual loss to facilitate the stability of adversarial training, developing a methodology for probing the structure of the feature space. This leads us to integrate a WavLM-based perceptual loss into the MS-STFT adversarial training pipeline, creating an effective and stable training procedure for the speech enhancement model. The resulting speech enhancement model, which we refer to as FINALLY, builds upon the HiFi++ architecture, augmented with a WavLM encoder and a novel training pipeline. Empirical results on various datasets confirm our model's ability to produce clear, high-quality speech at 48 kHz, achieving state-of-the-art performance in the field of speech enhancement.

artificial intelligence, machine learning, proceedings, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Input Similarity from the Neural Network Perspective

Guillaume Charpiat, Nicolas Girard, Loris Felardos, Yuliya Tarabalka

Neural Information Processing Systems | Feb-14-2026, 01:47:32 GMT

Neural Information Processing Systems http://nips.cc/

dataset, neural network, similarity, (15 more...)

Country:

North America > Canada (0.04)
Europe > France > Provence-Alpes-Côte d'Azur (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Unsupervised Learning of Object Landmarks through Conditional Image Generation

Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi

Neural Information Processing Systems | Feb-12-2026, 09:42:51 GMT

Neural Information Processing Systems http://nips.cc/

keypoint, landmark, proc, (17 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

f95ec3de395b4bce25b39ef6138da871-Supplemental.pdf

Neural Information Processing Systems | Feb-12-2026, 00:20:54 GMT

ners, novel-view synthesis, synthesis, (16 more...)

Country: Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Controllable Text-to-Image Generation

Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip Torr

Neural Information Processing Systems | Feb-11-2026, 15:36:35 GMT

Also, a word-level discriminator is proposed to provide fine-grained supervisory feedback by correlating words with image regions, facilitating training an effective generator which is able to manipulate specific visual attributes without affecting the generation of other content. Furthermore, perceptual loss is adopted to reduce the randomness involved in the image generation, and to encourage the generator to manipulate specific attributes required in the modified text.

artificial intelligence, arxivpreprintarxiv, machine learning, (18 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: