 super-resolution


2D Representation for Unguided Single-View 3D Super-Resolution in Real-Time

Mas, Ignasi, Huerta, Ivan, Morros, Ramon, Ruiz-Hidalgo, Javier

arXiv.org Artificial Intelligence

We introduce 2Dto3D-SR, a versatile framework for real-time single-view 3D super-resolution that eliminates the need for high-resolution RGB guidance. Our framework encodes 3D data from a single viewpoint into a structured 2D representation, enabling the direct application of existing 2D image super-resolution architectures. We utilize the Projected Normalized Coordinate Code (PNCC) to represent 3D geometry from a visible surface as a regular image, thereby circumventing the complexities of 3D point-based or RGB-guided methods. This design supports lightweight and fast models adaptable to various deployment environments. We evaluate 2Dto3D-SR with two implementations: one using Swin Transformers for high accuracy, and another using Vision Mamba for high efficiency. Experiments show the Swin Transformer model achieves state-of-the-art accuracy on standard benchmarks, while the Vision Mamba model delivers competitive results at real-time speeds. This establishes our geometry-guided pipeline as a surprisingly simple yet viable and practical solution for real-world scenarios, especially where high-resolution RGB data is inaccessible.


Super-Resolution Off the Grid

Qingqing Huang, Sham M. Kakade

Neural Information Processing Systems

Super-resolution is the problem of recovering a superposition of point sources using bandlimited measurements, which may be corrupted with noise. This signal processing problem arises in numerous imaging problems, ranging from astronomy to biology to spectroscopy, where it is common to take (coarse) Fourier measurements of an object. Of particular interest is obtaining estimation procedures which are robust to noise, with the following desirable statistical and computational properties: we seek to use coarse Fourier measurements (bounded by some cutoff frequency); we hope to take a (quantifiably) small number of measurements; we desire our algorithm to run quickly. Suppose we have k point sources in d dimensions, where the points are separated by at least Δ from each other (in Euclidean distance). This work provides an algorithm with the following favorable guarantees: the algorithm uses Fourier measurements whose frequencies are bounded by O(1/Δ) (up to log factors).


Ultra-High-Definition Reference-Based Landmark Image Super-Resolution with Generative Diffusion Prior

Shi, Zhenning, Yan, Zizheng, Yu, Yuhang, Xue, Clara, Zhuang, Jingyu, Zhang, Qi, Chen, Jinwei, Li, Tao, Fan, Qingnan

arXiv.org Artificial Intelligence

Reference-based Image Super-Resolution (RefSR) aims to restore a low-resolution (LR) image by utilizing the semantic and texture information from an additional reference high-resolution (reference HR) image. Existing diffusion-based RefSR methods are typically built upon ControlNet, which struggles to effectively align the information between the LR image and the reference HR image. Moreover, current RefSR datasets suffer from limited resolution and poor image quality, resulting in the reference images lacking sufficient fine-grained details to support high-quality restoration. To overcome the limitations above, we propose TriFlowSR, a novel framework that explicitly achieves pattern matching between the LR image and the reference HR image. Meanwhile, we introduce Landmark-4K, the first RefSR dataset for Ultra-High-Definition (UHD) landmark scenarios. Considering the UHD scenarios with real-world degradation, in TriFlowSR, we design a Reference Matching Strategy to effectively match the LR image with the reference HR image. Experimental results show that our approach can better utilize the semantic and texture information of the reference HR image compared to previous methods. To the best of our knowledge, we propose the first diffusion-based RefSR pipeline for ultra-high definition landmark scenarios under real-world degradation.


AnyTSR: Any-Scale Thermal Super-Resolution for UAV

Li, Mengyuan, Fu, Changhong, Lu, Ziyu, Zhang, Zijie, Zuo, Haobo, Yao, Liangliang

arXiv.org Artificial Intelligence

Thermal imaging can greatly enhance the application of intelligent unmanned aerial vehicles (UAV) in challenging environments. However, the inherent low resolution of thermal sensors leads to insufficient details and blurred boundaries. Super-resolution (SR) offers a promising solution to address this issue, while most existing SR methods are designed for fixed-scale SR. They are computationally expensive and inflexible in practical applications. To address the above issues, this work proposes a novel any-scale thermal SR method (AnyTSR) for UAV within a single model. Specifically, a new image encoder is proposed to explicitly assign specific feature code to enable more accurate and flexible representation. Additionally, by effectively embedding coordinate offset information into the local feature ensemble, an innovative any-scale upsampler is proposed to better understand spatial relationships and reduce artifacts. Moreover, a novel dataset (UAV-TSR), covering both land and water scenes, is constructed for thermal SR tasks. Experimental results demonstrate that the proposed method consistently outperforms state-of-the-art methods across all scaling factors and generates more accurate and detailed high-resolution images.


Learning to See More: UAS-Guided Super-Resolution of Satellite Imagery for Precision Agriculture

Masrur, Arif, Olsen, Peder A., Adler, Paul R., Jackson, Carlan, Myers, Matthew W., Sedghi, Nathan, Weil, Ray R.

arXiv.org Artificial Intelligence

Unmanned Aircraft Systems (UAS) and satellites are key data sources for precision agriculture, yet each presents trade-offs. Satellite data offer broad spatial, temporal, and spectral coverage but lack the resolution needed for many precision farming applications, while UAS provide high spatial detail but are limited by coverage and cost, especially for hyperspectral data. This study presents a novel framework that fuses satellite and UAS imagery using super-resolution methods. By integrating data across spatial, spectral, and temporal domains, we leverage the strengths of both platforms cost-effectively. We use estimation of cover crop biomass and nitrogen (N) as a case study to evaluate our approach. By spectrally extending UAS RGB data to the vegetation red edge and near-infrared regions, we generate high-resolution Sentinel-2 imagery and improve biomass and N estimation accuracy by 18% and 31%, respectively. Our results show that UAS data need only be collected from a subset of fields and time points. Farmers can then 1) enhance the spectral detail of UAS RGB imagery; 2) increase the spatial resolution by using satellite data; and 3) extend these enhancements spatially and across the growing season at the frequency of the satellite flights. Our SRCNN-based spectral extension model shows considerable promise for model transferability over other cropping systems in the Upper and Lower Chesapeake Bay regions. Additionally, it remains effective even when cloud-free satellite data are unavailable, relying solely on the UAS RGB input. The spatial extension model produces better biomass and N predictions than models built on raw UAS RGB images. Once trained with targeted UAS RGB data, the spatial extension model allows farmers to stop repeated UAS flights. While we introduce super-resolution advances, the core contribution is a lightweight and scalable system for affordable on-farm use.


Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach

Ramunno, Francesco Pio, Massa, Paolo, Kinakh, Vitaliy, Panos, Brandon, Csillaghy, André, Voloshynovskiy, Slava

arXiv.org Artificial Intelligence

The spatial properties of the solar magnetic field are crucial to decoding the physical processes in the solar interior and their interplanetary effects. However, observations from older instruments, such as the Michelson Doppler Imager (MDI), have limited spatial or temporal resolution, which hinders the ability to study small-scale solar features in detail. Super resolving these older datasets is essential for uniform analysis across different solar cycles, enabling better characterization of solar flares, active regions, and magnetic network dynamics. In this work, we introduce a novel diffusion model approach for Super-Resolution and we apply it to MDI magnetograms to match the higher-resolution capabilities of the Helioseismic and Magnetic Imager (HMI). By training a Latent Diffusion Model (LDM) with residuals on downscaled HMI data and fine-tuning it with paired MDI/HMI data, we can enhance the resolution of MDI observations from 2"/pixel to 0.5"/pixel. We evaluate the quality of the reconstructed images by means of classical metrics (e.g., PSNR, SSIM, FID and LPIPS) and we check if physical properties, such as the unsigned magnetic flux or the size of an active region, are preserved. We compare our model with different variations of LDM and Denoising Diffusion Probabilistic models (DDPMs), but also with two deterministic architectures already used in the past for performing the Super-Resolution task. Furthermore, we show with an analysis in the Fourier domain that the LDM with residuals can resolve features smaller than 2", and due to the probabilistic nature of the LDM, we can assess their reliability, in contrast with the deterministic models. Future studies aim to super-resolve the temporal scale of the solar MDI instrument so that we can also have a better overview of the dynamics of the old events.
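For readers unfamiliar with the first of the classical metrics named in this abstract, the following is a minimal sketch of PSNR in plain Python. It is not the authors' evaluation code: the function name `psnr`, the nested-list image format, and the default `max_value` of 255 are all assumptions made for illustration (magnetogram values would need a data range suited to their units).

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D lists.

    `max_value` is the assumed dynamic range of the data (hypothetical
    default of 255, as for 8-bit images).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        if len(ref_row) != len(est_row):
            raise ValueError("rows must have the same length")
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the reconstruction is closer to the reference. PSNR measures pixel-wise fidelity only, which is consistent with the abstract pairing it with perceptual metrics and physical checks.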


A Lightweight Image Super-Resolution Transformer Trained on Low-Resolution Images Only

Möller, Björn, Görnhardt, Lucas, Fingscheidt, Tim

arXiv.org Artificial Intelligence

Transformer architectures prominently lead single-image super-resolution (SISR) benchmarks, reconstructing high-resolution (HR) images from their low-resolution (LR) counterparts. Their strong representative power, however, comes with a higher demand for training data compared to convolutional neural networks (CNNs). For many real-world SR applications, the availability of high-quality HR training images is not given, sparking interest in LR-only training methods. The LR-only SISR benchmark mimics this condition by allowing only low-resolution (LR) images for model training. For a 4x super-resolution, this effectively reduces the amount of available training data to 6.25% of the HR image pixels, which puts the employment of a data-hungry transformer model into question. In this work, we are the first to utilize a lightweight vision transformer model with LR-only training methods addressing the unsupervised SISR LR-only benchmark. We adopt and configure a recent LR-only training method from microscopy image super-resolution to macroscopic real-world data, resulting in our multi-scale training method for bicubic degradation (MSTbic). Furthermore, we compare it with reference methods and prove its effectiveness both for a transformer and a CNN model. We evaluate on the classic SR benchmark datasets Set5, Set14, BSD100, Urban100, and Manga109, and show superior performance over state-of-the-art (so far: CNN-based) LR-only SISR methods. The code is available on GitHub: https://github.com/ifnspaml/SuperResolutionMultiscaleTraining.
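The 6.25% figure quoted in this abstract follows from the scale factor applying along both image axes: a 4x LR image holds 1/(4*4) of the HR pixels. The short sketch below reproduces that arithmetic; the function name `lr_pixel_fraction` is hypothetical and not taken from the paper or its repository.

```python
def lr_pixel_fraction(scale: int) -> float:
    """Fraction of HR pixels present in an LR image downscaled by `scale` per axis."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    # Downscaling shrinks width and height, so the pixel count falls by scale squared.
    return 1.0 / (scale * scale)


if __name__ == "__main__":
    for s in (2, 3, 4):
        print(f"x{s}: {lr_pixel_fraction(s):.2%} of the HR pixels")
```

For a scale of 4 this gives 0.0625, matching the 6.25% stated in the abstract.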


Super-Resolution for Interferometric Imaging: Model Comparisons and Performance Analysis

Abdioglu, Hasan Berkay, Gursoy, Rana, Isik, Yagmur, Balci, Ibrahim Cem, Unal, Taha, Bayer, Kerem, Inal, Mustafa Ismail, Serin, Nehir, Kosar, Muhammed Furkan, Esmer, Gokhan Bora, Uvet, Huseyin

arXiv.org Artificial Intelligence

This study investigates the application of Super-Resolution techniques in holographic microscopy to enhance quantitative phase imaging. An off-axis Mach-Zehnder interferometric setup was employed to capture interferograms. The study evaluates two Super-Resolution models, RCAN and Real-ESRGAN, for their effectiveness in reconstructing high-resolution interferograms from a microparticle-based dataset. The models were assessed using two primary approaches: image-based analysis for structural detail enhancement and morphological evaluation for maintaining sample integrity and phase map accuracy. The results demonstrate that RCAN achieves superior numerical precision, making it ideal for applications requiring highly accurate phase map reconstruction, while Real-ESRGAN enhances visual quality and structural coherence, making it suitable for visualization-focused applications. This study highlights the potential of Super-Resolution models in overcoming diffraction-imposed resolution limitations in holographic microscopy, opening the way for improved imaging techniques in biomedical diagnostics, materials science, and other high-precision fields.


Data-driven Super-Resolution of Flood Inundation Maps using Synthetic Simulations

Aravamudan, Akshay, Rasheed, Zimeena, Zhang, Xi, Scarpignato, Kira E., Nikolopoulos, Efthymios I., Krajewski, Witold F., Anagnostopoulos, Georgios C.

arXiv.org Artificial Intelligence

The frequency of extreme flood events is increasing throughout the world. Daily, high-resolution (30m) Flood Inundation Maps (FIM) observed from space play a key role in informing mitigation and preparedness efforts to counter these extreme events. However, the temporal frequency of publicly available high-resolution FIMs, e.g., from Landsat, is on the order of two weeks, thus limiting the effective monitoring of flood inundation dynamics. Conversely, global, low-resolution (~300m) Water Fraction Maps (WFM) are publicly available from NOAA VIIRS daily. Motivated by the recent successes of deep learning methods for single image super-resolution, we explore the effectiveness and limitations of similar data-driven approaches to downscaling low-resolution WFMs to high-resolution FIMs. To overcome the scarcity of high-resolution FIMs, we train our models with high-quality synthetic data obtained through physics-based simulations. We evaluate our models on real-world data from flood events in the state of Iowa. The study indicates that data-driven approaches exhibit superior reconstruction accuracy over non-data-driven alternatives and that the use of synthetic data is a viable proxy for training purposes. Additionally, we show that our trained models can exhibit superior zero-shot performance when transferred to regions with hydroclimatological similarity to the U.S. Midwest.
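One way to picture the relation between the two map products is to assume that each ~300m water fraction cell is the mean of the 10 x 10 block of binary 30m inundation cells it covers. That averaging model is an assumption made here for illustration and is not stated in the abstract; the function name `water_fraction` and the nested-list grid format are likewise hypothetical. Super-resolution then amounts to inverting this lossy step.

```python
def water_fraction(fim, factor=10):
    """Block-average a binary high-resolution grid into a coarse fraction grid.

    `fim` is a 2D list of 0/1 values (1 = flooded). `factor` is the assumed
    ratio of coarse to fine cell size (about 300m / 30m = 10).
    """
    if not fim or not fim[0]:
        raise ValueError("grid must not be empty")
    rows, cols = len(fim), len(fim[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")

    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Count flooded fine cells inside this coarse cell.
            flooded = sum(
                fim[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            )
            coarse_row.append(flooded / (factor * factor))
        coarse.append(coarse_row)
    return coarse
```

Many different fine grids average to the same coarse grid, which is why the inverse problem needs a learned or physics-based prior.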


DC-VSR: Spatially and Temporally Consistent Video Super-Resolution with Video Diffusion Prior

Han, Janghyeok, Sim, Gyujin, Kim, Geonung, Lee, Hyunseung, Choi, Kyuha, Han, Youngseok, Cho, Sunghyun

arXiv.org Artificial Intelligence

Video super-resolution (VSR) aims to reconstruct a high-resolution (HR) video from a low-resolution (LR) counterpart. Achieving successful VSR requires producing realistic HR details and ensuring both spatial and temporal consistency. To restore realistic details, diffusion-based VSR approaches have recently been proposed. However, the inherent randomness of diffusion, combined with their tile-based approach, often leads to spatio-temporal inconsistencies. In this paper, we propose DC-VSR, a novel VSR approach to produce spatially and temporally consistent VSR results with realistic textures. To achieve spatial and temporal consistency, DC-VSR adopts a novel Spatial Attention Propagation (SAP) scheme and a Temporal Attention Propagation (TAP) scheme that propagate information across spatio-temporal tiles based on the self-attention mechanism. To enhance high-frequency details, we also introduce Detail-Suppression Self-Attention Guidance (DSSAG), a novel diffusion guidance scheme. Comprehensive experiments demonstrate that DC-VSR achieves spatially and temporally consistent, high-quality VSR results, outperforming previous approaches.