 reno




ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

Neural Information Processing Systems

Text-to-Image (T2I) models have made significant advancements in recent years, but they still struggle to accurately capture intricate details specified in complex compositional prompts. While fine-tuning T2I models with reward objectives has shown promise, it suffers from reward hacking and may not generalize well to unseen prompt distributions. In this work, we propose Reward-based Noise Optimization (ReNO), a novel approach that enhances T2I models at inference by optimizing the initial noise based on the signal from one or multiple human preference reward models. Remarkably, solving this optimization problem with gradient ascent for 50 iterations yields impressive results on four different one-step models across two competitive benchmarks, T2I-CompBench and GenEval. Within a computational budget of 20-50 seconds, ReNO-enhanced one-step models consistently surpass the performance of all current open-source Text-to-Image models. Extensive user studies demonstrate that our model is preferred nearly twice as often compared to the popular SDXL model and is on par with the proprietary Stable Diffusion 3 with 8B parameters. Moreover, given the same computational resources, a ReNO-optimized one-step model outperforms widely-used open-source models such as SDXL and PixArt-alpha, highlighting the efficiency and effectiveness of ReNO in enhancing T2I model performance at inference time.
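The core idea in this abstract, gradient ascent on the initial noise with the generator held fixed, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the `tanh` "generator", the squared-distance "reward", and the names `generate`, `reward`, and `optimize_noise` are hypothetical stand-ins, not the paper's models or code. Only the 50-iteration count comes from the abstract.

```python
import numpy as np

def generate(W, z):
    """Toy stand-in for a frozen one-step generator: output = tanh(W @ z)."""
    return np.tanh(W @ z)

def reward(x, target):
    """Toy stand-in for a reward model: higher when x is closer to target."""
    return -np.sum((x - target) ** 2)

def reward_grad_wrt_noise(W, z, target):
    """Analytic gradient of reward(generate(W, z), target) with respect to z."""
    x = np.tanh(W @ z)
    d_reward_dx = -2.0 * (x - target)   # derivative of the toy reward
    d_x_dpre = 1.0 - x ** 2             # derivative of tanh
    return W.T @ (d_reward_dx * d_x_dpre)

def optimize_noise(W, z0, target, steps=50, lr=0.02):
    """Plain gradient ascent on the noise; the generator weights W never change."""
    z = z0.copy()
    for _ in range(steps):
        z = z + lr * reward_grad_wrt_noise(W, z, target)
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4)) / 2.0           # frozen toy generator weights
target = np.tanh(W @ rng.normal(size=4))    # an output the generator can reach
z0 = rng.normal(size=4)                     # initial noise sample
z_opt = optimize_noise(W, z0, target)

reward_before = reward(generate(W, z0), target)
reward_after = reward(generate(W, z_opt), target)
print(f"reward before: {reward_before:.4f}, after: {reward_after:.4f}")
```

In a real setting the gradient would come from automatic differentiation through the generator and the reward models rather than from a hand-derived formula.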




Controlling Latent Diffusion Using Latent CLIP

Becker, Jason, Wendler, Chris, Baylies, Peter, West, Robert, Wressnegger, Christian

arXiv.org Machine Learning

Instead of performing text-conditioned denoising in the image domain, latent diffusion models (LDMs) operate in the latent space of a variational autoencoder (VAE), enabling more efficient processing at reduced computational cost. However, while the diffusion process has moved to the latent space, the contrastive language-image pre-training (CLIP) models used in many image processing tasks still operate in pixel space. Doing so requires costly VAE-decoding of latent images before they can be processed. In this paper, we introduce Latent-CLIP, a CLIP model that operates directly in the latent space. We train Latent-CLIP on 2.7B pairs of latent images and descriptive texts, and show that it matches the zero-shot classification performance of similarly sized CLIP models on both the ImageNet benchmark and an LDM-generated version of it, demonstrating its effectiveness in assessing both real and generated content. Furthermore, we construct Latent-CLIP rewards for reward-based noise optimization (ReNO) and show that they match the performance of their CLIP counterparts on GenEval and T2I-CompBench while cutting the cost of the total pipeline by 21%. Finally, we use Latent-CLIP to guide generation away from harmful content, achieving strong performance on the inappropriate image prompts (I2P) benchmark and a custom evaluation, without ever requiring the costly step of decoding intermediate images.


Representation Equivalent Neural Operators: a Framework for Alias-free Operator Learning

Bartolucci, Francesca, de Bézenac, Emmanuel, Raonić, Bogdan, Molinaro, Roberto, Mishra, Siddhartha, Alaifari, Rima

arXiv.org Artificial Intelligence

Recently, operator learning, or learning mappings between infinite-dimensional function spaces, has garnered significant attention, notably in relation to learning partial differential equations from data. Conceptually clear when outlined on paper, neural operators necessitate discretization in the transition to computer implementations. This step can compromise their integrity, often causing them to deviate from the underlying operators. This research offers a fresh take on neural operators with a framework, Representation equivalent Neural Operators (ReNO), designed to address these issues. At its core is the concept of operator aliasing, which measures inconsistency between neural operators and their discrete representations. We explore this for widely-used operator learning techniques. Our findings detail how aliasing introduces errors when handling different discretizations and grids, and how it leads to the loss of crucial continuous structures. More generally, this framework not only sheds light on existing challenges but, given its constructive and broad nature, also potentially offers tools for developing new neural operators.
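As background for the term "aliasing", the following sketch shows the classical signal-processing effect: two different continuous functions become indistinguishable once sampled on a coarse grid. This illustrates only the textbook phenomenon and is not the paper's definition of operator aliasing; the grid size and frequencies are arbitrary choices made for the example.

```python
import numpy as np

n_samples = 8                                # coarse grid with 8 points on [0, 1)
x = np.arange(n_samples) / n_samples

low_freq = np.sin(2 * np.pi * 1 * x)         # 1 cycle per unit interval
high_freq = np.sin(2 * np.pi * 9 * x)        # 9 cycles: above what 8 samples can resolve

# On this grid the two functions produce identical samples, so any method
# that only sees the discrete values cannot tell them apart.
samples_match = np.allclose(low_freq, high_freq)

# On a finer grid the same two functions are clearly different.
x_fine = np.arange(64) / 64
differ_on_fine_grid = not np.allclose(
    np.sin(2 * np.pi * 1 * x_fine), np.sin(2 * np.pi * 9 * x_fine)
)
print(samples_match, differ_on_fine_grid)
```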


Iris Automation BVLOS Approval Metropolis of Reno - Channel969

#artificialintelligence

On behalf of the City of Reno and the Reno Fire Department (RFD), Iris Automation has been granted approval from the Federal Aviation Administration (FAA) to fly a small drone autonomously beyond the pilot's visual line of sight (BVLOS), without the aid of any observers or additional ground-based detection equipment. Testing will begin over unpopulated areas before moving to urban areas. The BVLOS waiver covers a rural, unpopulated area south of Reno and was submitted by Iris Automation for use of its Casia X detect-and-avoid solution. "This is an exciting project, working with the BEYOND program and the latest technologies to open the skies both for our community and the broader public," said Reno Mayor Hillary Schieve. "It's a novel teaming of public and private interests to achieve breakthrough operations for a range of cost-effective, public-facing services. Autonomous flying will benefit every member of our community and drive long-term economic benefits including job creation, cost savings and more efficient services. We intend this to be our first of many waivers as part of this collaboration. We're proud to be leading the way in this incredible space, and with a local BEYOND participant too, and excited to see our partners moving to this next step in the process."


Mesa Air Moves Into Drone Food Delivery

WSJ.com: WSJD - Technology

Makers of new air transport technology such as drones and air taxis are joining with established aviation companies including airlines and helicopter operators to help secure backing from regulators. They face similar challenges, notably how they can be operated safely over urban areas. Proponents maintain they are cheaper and more environmentally friendly than cars and taxis, even if routine consumer deliveries and rides remain years away. "We don't know what's going to work and what's not," said Mesa Chief Executive Jonathan Ornstein. Phoenix-based Mesa plans to start with four drones made by Flirtey Inc. of Reno, Nev., with options on another 500 over the next four years to expand the service in the U.S. and to New Zealand.