High-Frequency Enhanced Hybrid Neural Representation for Video Compression
Li Yu, Zhihui Li, Jimin Xiao, Moncef Gabbouj
arXiv.org Artificial Intelligence
According to statistics, in 2023 more than 65% of total Internet traffic was video content (Corporation, 2023), and this percentage is expected to continue increasing. In the past, video compression was usually achieved by traditional codecs such as H.264/AVC (Wiegand et al., 2003), H.265/HEVC (Sullivan et al., 2012), H.266/VVC (Bross et al., 2021), and AVS (Zhang et al., 2019). However, the handcrafted algorithms in these traditional codecs limit their compression efficiency. With the rise of deep learning, many neural video codec (NVC) techniques have been proposed (Lu et al., 2019; Li et al., 2021; Agustsson et al., 2020; Wang et al., 2024b). These approaches replace handcrafted components with deep learning modules, achieving impressive rate-distortion performance. Nevertheless, NVCs have not yet achieved widespread adoption in practical applications. One reason is that they often require a large network to achieve generalized compression over the entire data distribution, which is computationally intensive and frequently leads to slower decoding than traditional codecs. Moreover, the generalization capability of the network depends on the dataset used for training, leading to poor performance on out-of-distribution (OOD) data from different domains (Zhang et al., 2021a), and even when the resolution changes. To overcome these challenges associated with NVCs, researchers have turned to implicit neural representations (INRs) as a promising alternative.
Nov-10-2024