SIAT VIDEO TEAM
Journal Paper
Deep Learning-Based Intra Mode Derivation for Versatile Video Coding      [PDF]

Linwei Zhu, Yun Zhang (*Corresponding Author), Na Li, Gangyi Jiang, Sam Kwong

ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM), 2022, Accepted.

In intra coding, Rate Distortion Optimization (RDO) is performed to select the optimal intra mode from a pre-defined candidate list. Besides the residual signal, the optimal intra mode must also be encoded and transmitted to the decoder side, which consumes many coding bits. To further improve the performance of intra coding in Versatile Video Coding (VVC), an intelligent intra mode derivation method is proposed in this paper, termed Deep Learning based Intra Mode Derivation (DLIMD). Specifically, the process of intra mode derivation is formulated as a multi-class classification task, which aims to skip the module of intra mode signaling for coding bits reduction....

Texture-Aware Spherical Rotation for High Efficiency Omnidirectional Intra Video Coding      [PDF]

Jinyong Pi, Yun Zhang (*Corresponding Author), Linwei Zhu, Jinzhi Lin, and Yo-Sung Ho

IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2022, Accepted.

To adapt to the existing video coding standards, omnidirectional videos are usually projected from a Three-Dimensional (3D) sphere to a Two-Dimensional (2D) plane. However, this projection causes geometrical stretching distortion and boundary discontinuity, which may degrade coding efficiency. In this paper, we present a Spherical Rotation based Omnidirectional Video Coding (SROVC) method, which exploits the textural properties of omnidirectional videos with spherical rotation. Firstly, the SROVC framework is presented and Full-traversal Spherical Rotation (FSR) is developed to derive the optimal rotation angle with frame-level Rate Distortion Optimization (RDO)....

Joint Source-Channel Decoding of Polar Codes for HEVC-Based Video Streaming      [PDF]

Jinzhi Lin, Yun Zhang (*Corresponding Author), Na Li, and Hongling Jiang

ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM), 2022, Accepted.

Ultra High-Definition (UHD) and Virtual Reality (VR) video streaming over 5G networks are emerging, in which High-Efficiency Video Coding (HEVC) is used as source coding to compress videos more efficiently and polar codes are used as channel coding to transmit the bitstream reliably over an error-prone channel. In this article, a novel Joint Source-Channel Decoding (JSCD) of polar codes for HEVC-based video streaming is presented to improve the streaming reliability and visual quality. Firstly, a Kernel Density Estimation (KDE) fitting approach is proposed to estimate the positions of erroneous channel-decoded bits...

Deep Learning-based Perceptual Video Quality Enhancement for 3D Synthesized View      [PDF]

Huan Zhang, Yun Zhang (*Corresponding Author), Linwei Zhu, Weisi Lin

IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2022, Accepted.

Due to occlusion among views and temporal inconsistency in depth video, spatio-temporal distortion occurs in 3D synthesized video with depth image-based rendering. In this paper, we propose a deep Convolutional Neural Network (CNN)-based synthesized video denoising algorithm to reduce temporal flicker distortion and improve perceptual quality of 3D synthesized video. First, we analyze the spatio-temporal distortion, and model the elimination of spatio-temporal distortion as a perceptual video denoising problem. Then, a deep learning-based synthesized video denoising network is proposed, in which a CNN-friendly spatio-temporal...

High Efficiency Intra Video Coding Based on Data-driven Transform      [PDF]

Na Li, Yun Zhang (*Corresponding Author), C.C. Jay Kuo

IEEE Transactions on Broadcasting (IEEE T-BC), 2022, Accepted.

In this work, we propose a high efficiency intra video coding method based on a data-driven transform, which is able to learn the source distributions of intra prediction residuals and improve the subsequent transform coding efficiency. Firstly, we model learning-based transform design as an optimization problem of maximizing energy compaction or decorrelation. A data-driven Subspace Approximation with Adjusted Bias (Saab) transform is analyzed and compared with the mainstream Discrete Cosine Transform (DCT) in terms of energy compaction and decorrelation capabilities. Secondly, we propose a Saab transform-based intra video coding framework with...