 Photography


New Winxvideo AI – One-stop Video/Image Enhancer & Toolkit

PCWorld

We seem to have more video footage and still images than ever before, thanks to smartphones, GoPro cameras and the backlog of older ones collected across a lifetime. Managing all these formats, as well as making sure they look their best, can be a frightening proposition. Thankfully, Winxvideo AI is a powerful all-in-one solution that not only uses advanced Artificial Intelligence software to upgrade the quality of your content but can rescue old photos and footage too. The newly updated version 4.0 also brings huge improvements to speed, plus a special price offer, so you can save both time and money while you upgrade your photo and video library. Winxvideo AI comes with an impressive array of features that can turn tired, old, blurry videos into something far more professional.


Get 50% off a dual-camera drone that's great for beginners

Mashable

TL;DR: Become a drone photographer with this ideal model for newbies, the 4K Dual-Camera Drone for Beginners with Intelligent Obstacle Avoidance, now on sale for 50% off at just 59.99. Looking to spend more time outdoors this spring? If you're looking for a hobby that helps you take advantage of the gorgeous weather, drone photography is a great option. And right now, this 4K Dual-Camera Drone for Beginners with Intelligent Obstacle Avoidance can be yours for only 59.99 (reg. …). Drone photography can be a fun hobby that helps you spend more time in the great outdoors.


Google Pixel 9a review: Engaging AI features and mighty battery life give Apple's 'budget' iPhone a run for its money

Daily Mail - Science & tech

Apple released its latest 'budget' phone, the 599 iPhone 16e, back in February after months of feverish anticipation. But not to be outdone, rival tech giant Google has released its own handset at an 'unbeatable' price – the Pixel 9a. The device – which at 499 is 100 cheaper than Apple's equivalent – has a 6.3-inch display, two rear cameras and more than 30 hours of battery life on a single charge. It's packed with 'helpful' AI tools such as Gemini – Google's chatbot which was built to rival OpenAI's ChatGPT, now on Apple phones. MailOnline tests the new Google handset, described as a more accessible alternative to the firm's flagship Pixel 9 (799).


Self-Supervised Image Restoration with Blurry and Noisy Pairs

Neural Information Processing Systems

When taking photos in an environment with insufficient light, the exposure time and the sensor gain usually need to be chosen carefully to obtain images with satisfying visual quality. For example, images taken at high ISO usually have inescapable noise, while long-exposure images may be blurry due to camera shake or object motion. Existing solutions generally seek a balance between noise and blur, and learn denoising or deblurring models under either full or self-supervision. However, real-world training pairs are difficult to collect, and self-supervised methods that rely merely on blurry or noisy images are limited in performance. In this work, we tackle this problem by jointly leveraging the short-exposure noisy image and the long-exposure blurry image for better image restoration. Such a setting is practically feasible because short-exposure and long-exposure images can be either acquired by two individual cameras or synthesized from a long burst of images.


The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes

Neural Information Processing Systems

Estimating camera motion in deformable scenes poses a complex and open research challenge. Most existing non-rigid structure from motion techniques assume that static scene parts are observed alongside the deforming ones in order to establish an anchoring reference. However, this assumption does not hold in certain relevant application cases such as endoscopies. Deformable odometry and SLAM pipelines, which tackle the most challenging scenario of exploratory trajectories, suffer from a lack of robustness and of proper quantitative evaluation methodologies. To tackle this issue with a common benchmark, we introduce the Drunkard's Dataset, a challenging collection of synthetic data targeting visual navigation and reconstruction in deformable environments. This dataset is the first large set of exploratory camera trajectories with ground truth inside 3D scenes where every surface exhibits non-rigid deformations over time. Simulations in realistic 3D buildings let us obtain a vast amount of data and ground truth labels, including camera poses, RGB images and depth, optical flow and normal maps at high resolution and quality. We further present a novel deformable odometry method, dubbed the Drunkard's Odometry, which decomposes optical flow estimates into rigid-body camera motion and non-rigid scene deformations. In order to validate our data, our work contains an evaluation of several baselines as well as a novel tracking error metric which does not require ground truth data.
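The decomposition named in the abstract can be illustrated with a toy geometric sketch. This is not the Drunkard's Odometry itself, which the abstract describes only at a high level; it assumes a pinhole camera, a known depth, and a pure camera translation, and all function and parameter names are hypothetical. The rigid part of the flow is what a static point would produce under the camera motion, and the remainder is attributed to scene deformation.

```python
def rigid_flow(u, v, depth, t, fx, fy, cx, cy):
    """Flow at pixel (u, v) caused only by camera motion.

    Assumption: `t` = (tx, ty, tz) is the translation applied to scene
    points expressed in the camera frame; rotation is ignored for brevity.
    """
    # Back-project the pixel to a 3D point using its depth.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    # Move the point into the next camera frame.
    x2, y2, z2 = x + t[0], y + t[1], depth + t[2]
    # Re-project with the same pinhole intrinsics.
    u2 = fx * x2 / z2 + cx
    v2 = fy * y2 / z2 + cy
    return (u2 - u, v2 - v)


def nonrigid_residual(total_flow, rigid):
    """Part of the observed flow not explained by camera motion."""
    return (total_flow[0] - rigid[0], total_flow[1] - rigid[1])


rigid = rigid_flow(u=320.0, v=240.0, depth=2.0, t=(0.5, 0.0, 0.0),
                   fx=100.0, fy=100.0, cx=320.0, cy=240.0)
deformation = nonrigid_residual((28.0, -1.0), rigid)
```

In this toy setting the observed flow of (28, -1) pixels splits into a camera-induced component and a residual that would be assigned to surface deformation.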


Spatially Sparse Inference for Generative Image Editing Supplementary Material

Neural Information Processing Systems

For all models, we use block size 6 for 3×3 convolutions and block size 4 for 1×1 convolutions. We omit the element-wise operations for simplicity and follow the notations in Section 3. As the kernel sizes of the convolution in the shortcut branch and main branch are different, their reduced active block indices are different (Indices and Shortcut Indices). To reduce the tensor copying overheads in Scatter, we fuse Scatter and the following Gather into Scatter-Gather and fuse the Scatter in the shortcut, main branch and residual addition into Scatter with Block Residual. As mentioned in Section 3.2, we fuse Scatter and the following Gather into a Scatter-Gather. Note that the pre-computation is cheap and only needs to be done once for each resolution. Scatter weighs more in the shortcut branch.
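A minimal sketch of the Gather and Scatter steps may make the block-sparse idea concrete. It is an illustration under simplifying assumptions, not the authors' fused kernels: the feature map is a single-channel 2D list, blocks do not overlap, `op` is an element-wise stand-in for the convolution, and `cached` plays the role of a previously computed output that inactive regions reuse. All names are hypothetical.

```python
def gather(feature, active, block):
    """Copy the active block x block tiles out of a 2D feature map.

    `active` lists the (row, col) top-left corner of each active block.
    """
    return [[row[c:c + block] for row in feature[r:r + block]]
            for r, c in active]


def scatter(base, tiles, active, block):
    """Write processed tiles back over a copy of `base`."""
    out = [row[:] for row in base]
    for (r, c), tile in zip(active, tiles):
        for i, tile_row in enumerate(tile):
            out[r + i][c:c + block] = tile_row
    return out


def sparse_apply(feature, cached, active, block, op):
    """Run `op` only on active blocks and reuse `cached` elsewhere."""
    tiles = gather(feature, active, block)
    tiles = [[[op(x) for x in row] for row in tile] for tile in tiles]
    return scatter(cached, tiles, active, block)


feature = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
cached = [[0] * 4 for _ in range(4)]
result = sparse_apply(feature, cached, active=[(0, 2)], block=2,
                      op=lambda x: x * 10)
```

Only the top-right 2×2 block is recomputed here; every other position keeps its cached value, which is where the savings would come from when few blocks are active.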


Rethinking No-reference Image Exposure Assessment from Holism to Pixel: Models, Datasets and Benchmarks

Neural Information Processing Systems

The past decade has witnessed an increasing demand for enhancing image quality through exposure, and as a crucial prerequisite in this endeavor, Image Exposure Assessment (IEA) is now being accorded serious attention. However, IEA encounters two persistent challenges that remain unresolved over the long term: the accuracy and generalizability of No-reference IEA are inadequate for practical applications; the scope of IEA is confined to qualitative and quantitative analysis of the entire image or subimage, such as providing only a score to evaluate the exposure level, thereby lacking intuitive and precise fine-grained evaluation for complex exposure conditions. The objective of this paper is to address the persistent bottleneck challenges from three perspectives: model, dataset, and benchmark.


GS-Blur: A 3D Scene-Based Dataset for Realistic Image Deblurring. Dongwoo Lee, Joonkyu Park

Neural Information Processing Systems

To train a deblurring network, an appropriate dataset with paired blurry and sharp images is essential. Existing datasets collect blurry images either synthetically by aggregating consecutive sharp frames or using sophisticated camera systems to capture real blur. However, these methods offer limited diversity in blur types (blur trajectories) or require extensive human effort to reconstruct large-scale datasets, failing to fully reflect real-world blur scenarios. To address this, we propose GS-Blur, a dataset of synthesized realistic blurry images created using a novel approach. To this end, we first reconstruct 3D scenes from multi-view images using 3D Gaussian Splatting (3DGS), then render blurry images by moving the camera view along the randomly generated motion trajectories. By adopting various camera trajectories in reconstructing our GS-Blur, our dataset contains realistic and diverse types of blur, offering a large-scale dataset that generalizes well to real-world blur. Using GS-Blur with various deblurring methods, we demonstrate its ability to generalize effectively compared to previous synthetic or real blur datasets, showing significant improvements in deblurring performance.
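The idea of rendering blur by moving the camera along a trajectory can be sketched in one dimension. This is a toy illustration, not the GS-Blur pipeline: `render` is a hypothetical stand-in for a 3DGS renderer that simply shifts a 1D scene by an integer camera offset, and the blurry image is taken to be the average of the sharp renders along the trajectory.

```python
def render(scene, offset):
    """Hypothetical renderer: view a 1D scene from a shifted camera."""
    n = len(scene)
    return [scene[i - offset] if 0 <= i - offset < n else 0.0
            for i in range(n)]


def blur_along_trajectory(scene, trajectory):
    """Average the sharp renders taken at each camera pose."""
    frames = [render(scene, offset) for offset in trajectory]
    return [sum(pixels) / len(frames) for pixels in zip(*frames)]


sharp = [0.0, 0.0, 1.0, 0.0, 0.0]
blurry = blur_along_trajectory(sharp, trajectory=[-1, 0, 1])
```

A single bright point is smeared evenly over three pixels, and the pair (`sharp`, `blurry`) is the kind of training example a deblurring network would consume. Changing the trajectory changes the blur shape, which is the source of diversity the abstract refers to.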


EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Neural Information Processing Systems

Recent advancements in generation models have showcased remarkable capabilities in generating fantastic content. However, most of them are trained on proprietary high-quality data, and some models withhold their parameters and only provide accessible application programming interfaces (APIs), limiting their benefits for downstream tasks. To explore the feasibility of training a text-to-image generation model comparable to advanced models using publicly available resources, we introduce EvolveDirector. This framework interacts with advanced models through their public APIs to obtain text-image data pairs to train a base model. Our experiments with extensive data indicate that the model trained on generated data of the advanced model can approximate its generation capability.


Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing. Jiahao Wang

Neural Information Processing Systems

Text-guided diffusion models have significantly advanced image editing, enabling high-quality and diverse modifications driven by text prompts. However, effective editing requires inverting the source image into a latent space, a process often hindered by prediction errors inherent in DDIM inversion. These errors accumulate during the diffusion process, resulting in inferior content preservation and edit fidelity, especially with conditional inputs. We address these challenges by investigating the primary contributors to error accumulation in DDIM inversion and identify the singularity problem in traditional noise schedules as a key issue. To resolve this, we introduce the Logistic Schedule, a novel noise schedule designed to eliminate singularities, improve inversion stability, and provide a better noise space for image editing. This schedule reduces noise prediction errors, enabling more faithful editing that preserves the original content of the source image. Our approach requires no additional retraining and is compatible with various existing editing methods. Experiments across eight editing tasks demonstrate the Logistic Schedule's superior performance in content preservation and edit fidelity compared to traditional noise schedules, highlighting its adaptability and effectiveness.
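The inversion error mentioned in the abstract can be seen in a scalar toy example. The sketch below uses the standard deterministic DDIM update, which is general background rather than anything stated in this abstract, and it does not implement the Logistic Schedule. The noise predictor `toy_eps` and the cumulative alpha values are hypothetical. Inversion reuses the noise predicted at the current state when stepping to a noisier one, so the round trip is exact only if the prediction does not change between steps.

```python
import math


def predict_x0(x_t, eps, a_t):
    """Clean-sample estimate implied by a noise prediction."""
    return (x_t - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)


def ddim_step(x_t, eps, a_t, a_target):
    """Deterministic DDIM update from cumulative alpha a_t to a_target.

    a_target < a_t moves toward noise (inversion);
    a_target > a_t moves toward the image (sampling).
    """
    x0 = predict_x0(x_t, eps, a_t)
    return math.sqrt(a_target) * x0 + math.sqrt(1.0 - a_target) * eps


def toy_eps(x, a):
    """Hypothetical state-dependent noise predictor."""
    return 0.5 * x


a_clean, a_noisy = 0.99, 0.5
x_start = 1.0

# Invert with the noise predicted at the clean state, then sample back
# with the noise predicted at the inverted state.
x_inverted = ddim_step(x_start, toy_eps(x_start, a_clean), a_clean, a_noisy)
x_back = ddim_step(x_inverted, toy_eps(x_inverted, a_noisy), a_noisy, a_clean)
round_trip_error = abs(x_back - x_start)

# With one fixed noise value the same two steps cancel exactly.
fixed = 0.3
x_fixed = ddim_step(ddim_step(x_start, fixed, a_clean, a_noisy),
                    fixed, a_noisy, a_clean)
```

With the state-dependent predictor the reconstruction drifts away from `x_start`, while the fixed-noise case returns to it. Over many steps such mismatches add up, which is the accumulation the abstract describes.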