video enhancement
Zero-TIG: Temporal Consistency-Aware Zero-Shot Illumination-Guided Low-light Video Enhancement
Yini Li, Nantheera Anantrasirichai
Low-light and underwater videos suffer from poor visibility, low contrast, and high noise, necessitating enhancement of their visual quality. However, existing approaches typically rely on paired ground truth, which limits their practicality, and they often fail to maintain temporal consistency. To overcome these obstacles, this paper introduces a novel zero-shot learning approach named Zero-TIG, leveraging Retinex theory and optical flow techniques. The proposed network consists of an enhancement module and a temporal feedback module. The enhancement module comprises three subnetworks: low-light image denoising, illumination estimation, and reflection denoising. The temporal feedback module ensures temporal consistency by incorporating histogram equalization, optical flow computation, and image warping to align the enhanced previous frame with the current frame, thereby maintaining continuity. Additionally, we address color distortion in underwater data by adaptively balancing the RGB channels. The experimental results demonstrate that our method achieves low-light video enhancement without the need for paired training data, making it a practical method for real-world enhancement scenarios.
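The two pipeline steps described above — warping the enhanced previous frame onto the current one, and adaptive RGB balancing — can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the function names are invented, the flow field is assumed given (the paper computes it with an optical-flow method), the warp uses nearest-neighbour sampling for brevity, and a gray-world rule stands in for whatever adaptive balancing Zero-TIG actually uses.

```python
import numpy as np

def warp_previous_frame(prev_enhanced, flow):
    """Warp the enhanced previous frame toward the current frame.

    `flow` is a per-pixel (H, W, 2) field of (dy, dx) offsets mapping
    current-frame pixels back to previous-frame positions. Nearest-
    neighbour sampling keeps the sketch dependency-free; a real
    implementation would interpolate bilinearly.
    """
    h, w = prev_enhanced.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.round(ys - flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs - flow[..., 1]).astype(int), 0, w - 1)
    return prev_enhanced[src_y, src_x]

def gray_world_balance(img):
    """Adaptively balance RGB channels under a gray-world assumption:
    scale each channel so its mean matches the global mean intensity.
    Pixel values are assumed to lie in [0, 1]."""
    means = img.reshape(-1, 3).mean(axis=0)
    scale = means.mean() / np.maximum(means, 1e-6)
    return np.clip(img * scale, 0.0, 1.0)
```

The warped previous frame can then be compared against the current enhanced frame (e.g. via an L1 consistency loss) to penalize flicker between frames.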
Crisper, Clearer, and Faster: Real-Time Super-Resolution with a Recurrent Bottleneck Mixer Network (ReBotNet) - MarkTechPost
Videos have become omnipresent, from streaming our favorite movies and TV shows to participating in video conferences and calls. With the increasing use of smartphones and other capture devices, video quality has risen in importance. However, due to factors like low light, digital noise, or simply low acquisition quality, the videos captured by these devices are often far from perfect. In these situations, video enhancement techniques come into play, aiming to improve resolution and visual features. Over the years, a range of video enhancement techniques have been developed, culminating in complex machine learning algorithms that remove noise and improve image quality.
Learnt Deep Hyperparameter selection in Adversarial Training for compressed video enhancement with perceptual critic
Darren Ramsook, Anil Kokaram
Image-based Deep Feature Quality Metrics (DFQMs) have been shown to correlate better with subjective perceptual scores than traditional metrics. The fundamental idea behind these DFQMs is to exploit internal representations from a large-scale classification network as the metric feature space. Previously, no attention has been given to the problem of identifying which layers are most perceptually relevant. In this paper we present a new method for selecting perceptually relevant layers from such a network, based on a neuroscience interpretation of layer behaviour. The selected layers are treated as a hyperparameter of the critic network in a W-GAN. The critic uses the output from these layers in its preliminary stages to extract perceptual information. A video enhancement network is trained adversarially with this critic. Our results show that introducing these selected features into the critic yields up to 10% (FID) and 15% (KID) performance gains over critic networks that do not exploit optimised feature selection.
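The core mechanism — treating a set of selected layer indices as a hyperparameter and feeding the corresponding activations to the critic — can be sketched as follows. Everything here is illustrative: the activations are random stand-ins for a pretrained classifier's feature maps, the function name is invented, and the paper's neuroscience-based selection procedure is not reproduced.

```python
import numpy as np

def select_critic_features(layer_activations, selected_layers):
    """Build the critic's perceptual input by global-average-pooling
    the activations of the selected layers and concatenating them.

    `layer_activations` maps layer index -> feature map of shape
    (C, H, W); `selected_layers` is the hyperparameter chosen by the
    layer-selection procedure.
    """
    pooled = [layer_activations[i].mean(axis=(1, 2))  # (C,) per layer
              for i in selected_layers]
    return np.concatenate(pooled)

# Hypothetical activations from three layers of a classifier
acts = {0: np.random.rand(8, 16, 16),
        1: np.random.rand(16, 8, 8),
        2: np.random.rand(32, 4, 4)}
# Selecting layers 0 and 2 yields an 8 + 32 = 40-dim feature vector
features = select_critic_features(acts, selected_layers=[0, 2])
```

In the adversarial setup, this feature vector would feed the early stages of the W-GAN critic, so the critic scores enhanced frames in a perceptually weighted feature space rather than raw pixels.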
SuperTran: Reference Based Video Transformer for Enhancing Low Bitrate Streams in Real Time
Tejas Khot, Nataliya Shapovalova, Silviu Andrei, Walterio Mayol-Cuevas
This work focuses on low-bitrate video streaming scenarios (e.g. 50 - 200Kbps) where video quality is severely compromised. We present a family of novel deep generative models for enhancing the perceptual video quality of such streams by performing super-resolution while also removing compression artifacts. Our model, which we call SuperTran, consumes as input a single high-quality, high-resolution reference image in addition to the low-quality, low-resolution video stream. The model thus learns how to borrow or copy visual elements like textures from the reference image and fill in the remaining details from the low-resolution stream in order to produce perceptually enhanced output video. The reference frame can be sent once at the start of the video session or retrieved from a gallery. Importantly, the resulting output has substantially better detail than what has otherwise been possible with methods that use only a low-resolution input, such as the SuperVEGAN method. SuperTran works in real time (up to 30 frames/sec) on the cloud alongside standard pipelines.
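The "borrow textures from the reference" idea can be made concrete with a toy patch-matching sketch: for each patch of the upscaled low-resolution frame, find the closest reference patch and blend its detail in. SuperTran learns this mapping with a transformer; the exhaustive L2 patch search, function name, and blend weight below are purely illustrative assumptions.

```python
import numpy as np

def borrow_texture(lr_up, reference, patch=4, alpha=0.5):
    """Toy reference-based enhancement: replace each non-overlapping
    patch of the upscaled low-res frame `lr_up` with a blend of itself
    and its nearest (L2) patch from the high-quality `reference` image.
    Both inputs are float arrays of shape (H, W, 3)."""
    # Collect all non-overlapping reference patches: (N, p, p, 3)
    ref_patches = np.stack([
        reference[y:y + patch, x:x + patch]
        for y in range(0, reference.shape[0] - patch + 1, patch)
        for x in range(0, reference.shape[1] - patch + 1, patch)
    ])
    out = lr_up.copy()
    h, w = lr_up.shape[:2]
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            q = lr_up[y:y + patch, x:x + patch]
            d = ((ref_patches - q) ** 2).sum(axis=(1, 2, 3))
            best = ref_patches[d.argmin()]  # closest reference texture
            out[y:y + patch, x:x + patch] = (1 - alpha) * q + alpha * best
    return out
```

A learned model replaces the brute-force search with attention over reference features, which is what makes the real-time (up to 30 frames/sec) operation described above feasible.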