ESSR: An 8K@30FPS Super-Resolution Accelerator With Edge Selective Network
Chih-Chia Hsu, Tian-Sheuan Chang
arXiv.org Artificial Intelligence
Deep learning-based super-resolution (SR) is challenging to implement on resource-constrained edge devices at resolutions beyond full HD due to its high computational complexity and memory bandwidth requirements. This paper introduces an 8K@30FPS SR accelerator with edge-selective dynamic input processing. Dynamic processing chooses the appropriate subnet for each patch based on a simple input edge criterion, achieving a 50% MAC reduction with only a 0.1dB PSNR decrease. Reconstruction quality is preserved and its potential maximized through resource-adaptive model switching, even under resource constraints. In conjunction with hardware-specific refinements, the model size is reduced by 84% to 51K parameters, with a PSNR decrease of less than 0.6dB. Additionally, to support dynamic processing with high utilization, the design incorporates configurable group-of-layer mapping that synergizes with a structure-friendly fusion block, resulting in 77% hardware utilization and up to a 79% reduction in feature SRAM access. The implementation, in the TSMC 28nm process, achieves 8K@30FPS throughput at 800MHz with a gate count of 2749K, 0.2075W power consumption, and 4797Mpixels/J energy efficiency, exceeding previous work.

Deep learning-based super-resolution (SR) has gained prominence in recent years due to its exceptional performance. The growing demand for high-resolution (HD), ultra-HD, or even 8K images in various edge device applications, including surveillance, medical imaging, virtual reality, and digital entertainment, underscores its importance. Consequently, there is a pressing need for efficient hardware accelerators. Various hardware accelerators have been proposed in recent years [1]-[5] for HD applications.
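The edge-selective idea described above can be sketched in a few lines: classify each patch with a cheap edge measure, then route flat patches to a lighter subnet. The gradient-based score, the threshold value, and the subnet names below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def edge_score(patch: np.ndarray) -> float:
    """Mean gradient magnitude as a simple, cheap edge criterion."""
    gy, gx = np.gradient(patch.astype(np.float64))
    return float(np.mean(np.abs(gx) + np.abs(gy)))

def select_subnet(patch: np.ndarray, threshold: float = 5.0) -> str:
    """Route edge-rich patches to the full subnet, flat ones to a light subnet.

    The threshold is a hypothetical tuning knob; in practice it would be
    chosen to hit a target MAC reduction at acceptable PSNR loss.
    """
    return "full" if edge_score(patch) >= threshold else "light"

# Toy demonstration: a flat patch vs. a patch with a sharp vertical edge.
flat = np.full((32, 32), 128.0)
edged = np.zeros((32, 32))
edged[:, 16:] = 255.0

print(select_subnet(flat))   # flat patch routes to the light subnet
print(select_subnet(edged))  # edge-rich patch routes to the full subnet
```

Routing roughly half the patches to a cheaper subnet is how a per-patch scheme of this kind can halve average MACs while concentrating capacity on the edge regions that dominate perceived sharpness.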
However, due to the extensive computational demands and significant memory bandwidth requirements, many existing super-resolution accelerators adopt simplistic, extremely lightweight models such as FSRCNN [6] or 1-D convolution [2] as their backbone, which often compromises both performance and perceptual quality.

This work was supported by the National Science and Technology Council, Taiwan, under Grants 111-2622-8-A49-018-SB, 110-2221-E-A49-148-MY3, and 110-2218-E-A49-015-MBK.
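The headline throughput and energy-efficiency figures in the abstract are mutually consistent, which a quick arithmetic check confirms (assuming 8K means 7680x4320):

```python
# Cross-check the reported 4797 Mpixels/J at 8K@30FPS and 0.2075 W.
pixels_per_second = 7680 * 4320 * 30              # ~995.3 Mpixels/s throughput
mpixels_per_joule = pixels_per_second / 0.2075 / 1e6

print(round(mpixels_per_joule))                    # ~4797, matching the paper
```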
Mar-26-2025