 high fidelity


Saddle-Free Guidance: Improved On-Manifold Sampling without Labels or Additional Training

Yeats, Eric, Hannan, Darryl, Fearn, Wilson, Doster, Timothy, Kvinge, Henry, Mahan, Scott

arXiv.org Machine Learning

Score-based generative models require guidance in order to generate plausible, on-manifold samples. The most popular guidance method, Classifier-Free Guidance (CFG), is only applicable in settings with labeled data and requires training an additional unconditional score-based model. More recently, Auto-Guidance adopts a smaller, less capable version of the original model to guide generation. While each method effectively promotes the fidelity of generated data, each requires labeled data or the training of additional models, making it challenging to guide score-based models when (labeled) training data are not available or training new models is not feasible. We make the surprising discovery that the positive curvature of log density estimates in saddle regions provides strong guidance for score-based models. Motivated by this, we develop saddle-free guidance (SFG) which maintains estimates of maximal positive curvature of the log density to guide individual score-based models. SFG has the same computational cost as classifier-free guidance, does not require additional training, and works with off-the-shelf diffusion and flow matching models. Our experiments indicate that SFG achieves state-of-the-art FID and FD-DINOv2 metrics in single-model unconditional ImageNet-512 generation. When SFG is combined with Auto-Guidance, its unconditional samples achieve general state-of-the-art in FD-DINOv2 score. Our experiments with FLUX.1-dev and Stable Diffusion v3.5 indicate that SFG boosts the diversity of output images compared to CFG while maintaining excellent prompt adherence and image fidelity.
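
For context, the CFG baseline mentioned in this abstract is commonly written as a linear extrapolation from the unconditional score toward the conditional score. The snippet below is a minimal sketch of that standard combination only; it does not implement SFG, and the function name `cfg_score` and the guidance weight `w` are illustrative names, not taken from the paper.

```python
import numpy as np

def cfg_score(score_uncond, score_cond, w):
    """Combine scores the way CFG is usually stated (illustrative helper).

    w = 0 returns the unconditional score, w = 1 the conditional score,
    and w > 1 extrapolates past the conditional score.
    """
    score_uncond = np.asarray(score_uncond, dtype=float)
    score_cond = np.asarray(score_cond, dtype=float)
    return score_uncond + w * (score_cond - score_uncond)

# Toy score vectors (placeholders, not model outputs).
s_u = np.array([0.0, 0.0])
s_c = np.array([1.0, 2.0])
guided = cfg_score(s_u, s_c, w=2.0)
```

Each guided step needs both a conditional and an unconditional score evaluation, which is the cost that the abstract says SFG matches.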




Exploring Quantum Control Landscape and Solution Space Complexity through Dimensionality Reduction & Optimization Algorithms

Fentaw, Haftu W., Campbell, Steve, Caton, Simon

arXiv.org Artificial Intelligence

Understanding the quantum control landscape (QCL) is important for designing effective quantum control strategies. In this study, we analyze the QCL for a single two-level quantum system (qubit) using various control strategies. We employ Principal Component Analysis (PCA) to visualize and analyze the QCL for higher dimensional control parameters. Our results indicate that dimensionality reduction techniques such as PCA can play an important role in understanding the complex nature of quantum control in higher dimensions. Evaluations of traditional control techniques and machine learning algorithms reveal that Genetic Algorithms (GA) outperform Stochastic Gradient Descent (SGD), while Q-learning (QL) shows great promise compared to Deep Q-Networks (DQN) and Proximal Policy Optimization (PPO). Additionally, our experiments highlight the importance of reward function design in DQN and PPO, demonstrating that immediate rewards yield better performance than delayed rewards for systems with short time steps. A study of solution space complexity was conducted using the Cluster Density Index (CDI) as a key metric for analyzing the density of optimal solutions in the landscape. The CDI reflects cluster quality and helps determine whether a given algorithm generates regions of high fidelity or not. Our results provide insights into effective quantum control strategies, emphasizing the significance of parameter selection and algorithm optimization.
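
As a rough illustration of the PCA step described in this abstract, the sketch below projects high-dimensional control-parameter vectors onto their first two principal components using an SVD. The array `controls` is random placeholder data, and `pca_project` is a hypothetical helper; neither comes from the paper.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each parameter
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # principal directions
    explained = (S ** 2) / np.sum(S ** 2)        # explained-variance ratios
    return Xc @ components.T, explained[:n_components]

# Placeholder data: 200 control pulses with 10 parameters each.
rng = np.random.default_rng(0)
controls = rng.normal(size=(200, 10))
coords, ratio = pca_project(controls)
```

Plotting `coords` colored by the fidelity of each control would give a two-dimensional view of the landscape, under the assumption that a fidelity value is available for every row.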


Review for NeurIPS paper: Compositional Visual Generation with Energy Based Models

Neural Information Processing Systems

Weaknesses: * The visual quality/fidelity of the generated images is quite low. Making sure that the visual fidelity on common metrics such as FID matches or is at least close enough to GAN models will be useful to validate that the approach supports high fidelity (as otherwise it may be the case that it achieves compositionality at the expense of lower potential for fine details or high fidelity, as is the case in e.g. Given that there have been many works that explore combinations of properties for CelebA images with GANs, showing that the proposed approach can compete with them is especially important. Showing learning plots as well compared to other types of generative models will be useful. However, note that the motivation and goals of the model -- to achieve compositional generation through logical combination of concepts learned through data subsets, is similar to a prior VAE paper.


Wax Heads, the record-shop video game that channels High Fidelity

The Guardian

Every time I go through a breakup, I'm compelled to rewatch the noughties classic High Fidelity, in which OG softboi John Cusack mournfully chronicles a "top 10 list" of his all-time worst breakups, soundtracked by the albums that accompanied them. A sanctuary for a hurting Cusack, this battered boutique becomes a refuge for Chicago's other lost souls, giving its perennially hungover proprietor and a gaggle of local music nerds a place to lick their wounds. It's this kind of DIY community spirit that spills out of the screen as I dive into Wax Heads, a narrative game about managing a struggling record shop. A self-described "cosy-punk life sim", this colourful comic-book-esque caper channels everything great about High Fidelity, as the player learns the ropes during a chaotic first shift at the fictional Repeater Records. As I design posters for a local punk gig between slacking off on a legally distinct knock-off of a Tamagotchi, it's clear that Wax Heads sees the local vinyl shop as a musical mecca, a place where you spin tunes and befriend its weird and wonderful customers.


Datasets and Benchmarks for Nanophotonic Structure and Parametric Design Simulations

Kim, Jungtaek, Li, Mingxuan, Hinder, Oliver, Leu, Paul W.

arXiv.org Machine Learning

Nanophotonic structures have versatile applications including solar cells, anti-reflective coatings, electromagnetic interference shielding, optical filters, and light emitting diodes. To design and understand these nanophotonic structures, electrodynamic simulations are essential. These simulations enable us to model electromagnetic fields over time and calculate optical properties. In this work, we introduce frameworks and benchmarks to evaluate nanophotonic structures in the context of parametric structure design problems. The benchmarks are instrumental in assessing the performance of optimization algorithms and identifying an optimal structure based on target optical properties. Moreover, we explore the impact of varying grid sizes in electrodynamic simulations, shedding light on how evaluation fidelity can be strategically leveraged in enhancing structure designs.


Perception-Distortion Trade-off in the SR Space Spanned by Flow Models

Korkmaz, Cansu, Tekalp, A. Murat, Dogan, Zafer, Erdem, Erkut, Erdem, Aykut

arXiv.org Artificial Intelligence

Flow-based generative super-resolution (SR) models learn to produce a diverse set of feasible SR solutions, called the SR space. Diversity of SR solutions increases with the temperature ($\tau$) of latent variables, which introduces random variations of texture among sample solutions, resulting in visual artifacts and low fidelity. In this paper, we present a simple but effective image ensembling/fusion approach to obtain a single SR image eliminating random artifacts and improving fidelity without significantly compromising perceptual quality. We achieve this by benefiting from a diverse set of feasible photo-realistic solutions in the SR space spanned by flow models. We propose different image ensembling and fusion strategies which offer multiple paths to move sample solutions in the SR space to more desired destinations in the perception-distortion plane in a controllable manner depending on the fidelity vs. perceptual quality requirements of the task at hand. Experimental results demonstrate that our image ensembling/fusion strategy achieves more promising perception-distortion trade-off compared to sample SR images produced by flow models and adversarially trained models in terms of both quantitative metrics and visual quality.
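
The intuition behind ensembling can be sketched with the simplest possible fusion rule, a pixel-wise mean over several samples. This is an assumption made for illustration and not necessarily one of the authors' strategies; the "samples" below are a synthetic image plus independent noise, standing in for SR outputs drawn at a nonzero temperature.

```python
import numpy as np

def ensemble_mean(samples):
    """Fuse a stack of images with shape (N, H, W) by pixel-wise averaging."""
    return np.mean(np.asarray(samples, dtype=float), axis=0)

def mse(a, b):
    """Mean squared error between two images of the same shape."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

# Synthetic stand-in for a ground-truth image and 16 noisy sample solutions.
rng = np.random.default_rng(0)
target = rng.uniform(size=(32, 32))
samples = target + rng.normal(scale=0.1, size=(16, 32, 32))

fused = ensemble_mean(samples)
single_error = mse(samples[0], target)
fused_error = mse(fused, target)
```

Averaging independent variations lowers distortion (here `fused_error` is smaller than `single_error`), but it also smooths texture, which is the fidelity versus perceptual quality trade-off the abstract refers to.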


Beyond screens

USATODAY - Tech Top Stories

You might not know this, but the metaverse is coming. Where you can live a life that expands on reality and can approach hyperrealism. I just took my next big step toward embracing it – and soon you may, too. Within a matter of minutes, I was reborn as a holographic avatar – a digital version of me – with the help of the Avatar Dimension technicians in northern Virginia, just west of the nation's capital. My virtual doppelgänger is ready to embark on digital adventures, be inserted into a video game, a movie, or virtual reality. And it's ready for the metaverse, the persistent alternate reality in cyberspace author Neal Stephenson envisioned in his 1992 science fiction novel "Snow Crash."


From Minecraft to Zoom calls, we've all spent much of the pandemic on our screens. But are we ready for the metaverse?

USATODAY - Tech Top Stories

You might not know this, but the metaverse is coming. Where you can live a life that expands on reality and can approach hyperrealism. I just took my next big step toward embracing it – and soon you may, too. Within a matter of minutes, I was reborn as a holographic avatar – a digital version of me – with the help of the Avatar Dimension technicians in northern Virginia, just west of the nation's capital. My virtual doppelgänger is ready to embark on digital adventures, be inserted into a video game, a movie or virtual reality. And it's ready for the metaverse, the persistent alternate reality in cyberspace author Neal Stephenson envisioned in his 1992 science fiction novel "Snow Crash."