Transform Network Architectures for Deep Learning based End-to-End Image/Video Coding in Subsampled Color Spaces
Egilmez, Hilmi E., Singh, Ankitesh K., Coban, Muhammed, Karczewicz, Marta, Zhu, Yinhao, Yang, Yang, Said, Amir, Cohen, Taco S.
arXiv.org Artificial Intelligence
Most existing deep learning based end-to-end image/video coding (DLEC) architectures are designed for the non-subsampled RGB color format. However, to achieve superior coding performance, many state-of-the-art block-based compression standards, such as High Efficiency Video Coding (HEVC/H.265) and Versatile Video Coding (VVC/H.266), are designed primarily for the YUV 4:2:0 format, where the U and V components are subsampled based on properties of the human visual system. This paper investigates various DLEC designs that support the YUV 4:2:0 format by comparing their performance against the main profiles of the HEVC and VVC standards under a common evaluation framework. Moreover, a new transform network architecture is proposed to improve the efficiency of coding YUV 4:2:0 data. The experimental results on YUV 4:2:0 datasets show that the proposed architecture significantly outperforms naive extensions of existing architectures designed for the RGB format and achieves about 10% average BD-rate improvement over intra-frame coding in HEVC.
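To make the 4:2:0 subsampling concrete, the following minimal Python sketch halves a chroma plane in both dimensions and counts the samples per frame. The helper names `subsample_420` and `sample_count_420` are hypothetical, and the 2x2 averaging filter is an assumption chosen for simplicity; real codecs and datasets may use other downsampling filters and chroma sample positions. This is not the paper's method.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane given as a list of rows.

    Assumption: simple 2x2 averaging; actual pipelines may use other filters.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even plane dimensions")
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_count_420(height, width):
    """Samples per 4:2:0 frame: full-resolution Y plus two half-by-half chroma planes."""
    return height * width + 2 * (height // 2) * (width // 2)


# A 4x4 U plane shrinks to 2x2 after subsampling.
u_plane = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
u_420 = subsample_420(u_plane)

# A 4:4:4 frame (such as RGB) carries three full-resolution planes.
samples_444 = 3 * 4 * 4
samples_420 = sample_count_420(4, 4)
```

Under these assumptions, a 4:2:0 frame holds half as many samples as a non-subsampled three-plane frame of the same resolution, which is why a transform network built for RGB input cannot take YUV 4:2:0 planes as a single three-channel tensor without some adaptation.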
Feb-27-2021
- Country:
  - Asia > Macao (0.04)
  - Europe
    - Netherlands > North Holland
      - Amsterdam (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
  - North America > United States
    - California > San Diego County
      - San Diego (0.04)
    - Massachusetts
      - Middlesex County > Cambridge (0.04)
      - Suffolk County > Boston (0.04)
- Genre:
  - Research Report (1.00)
- Technology: