INTRODUCTION

Welcome to the SIAT Video Team (SVT), whose members come from the High Performance Computing Center (HPCC), Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS). We have long worked on multimedia communications and visual signal processing for 2D/3D and VR/AR video, including video coding, visual signal pre/post-processing, and computational visual perception. We are also pursuing challenging problems in emerging areas such as VR/AR and AI.
Join Us      

RECENT PUBLICATIONS
Deep Learning-based Perceptual Video Quality Enhancement for 3D Synthesized View

IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2022.

Huan Zhang, Yun Zhang (*Corresponding Author), Linwei Zhu, Weisi Lin

Full-Text     
High Efficiency Intra Video Coding Based on Data-driven Transform

IEEE Transactions on Broadcasting (IEEE T-BC), 2021 .

Na Li, Yun Zhang (*Corresponding Author), C.C. Jay Kuo

Full-Text     
Joint Source-Channel Decoding of Polar Codes for HEVC based Video Streaming

ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM), 2021 .

Jinzhi Lin, Yun Zhang (*Corresponding Author), Na Li, and Hongling Jiang

Full-Text     
Subjective Quality Database and Objective Study of Compressed Point Clouds with 6DoF Head-mounted Display

IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2021 .

Xinju Wu, Yun Zhang, Chunling Fan, Junhui Hou, Sam Kwong

Full-Text           Database          Convideo     
Deep Learning Based Just Noticeable Difference and Perceptual Quality Prediction Models for Compressed Video

IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2021.

Yun Zhang, Huanhua Liu, You Yang, Xiaoping Fan, Sam Kwong, C.C. Jay Kuo

Full-Text     
Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion

IEEE Transactions on Image Processing (IEEE T-IP), 2021 .

Yun Zhang*, Linwei Zhu, Raouf Hamzaoui, Sam Kwong, Yo-Sung Ho

Full-Text     
Projection Invariant Feature and Visual Saliency-Based Stereoscopic Omnidirectional Image Quality Assessment     

IEEE Transactions on Broadcasting (IEEE T-BC), 2021 .

Xuemei Zhou, Yun Zhang*, Na Li, Xu Wang, Yang Zhou and Yo-Sung Ho

Full-Text     
Learning-based Satisfied User Ratio Prediction for Symmetrically and Asymmetrically Compressed Stereoscopic Images     

IEEE Multimedia (IEEE MM), 2021 .

Chunling Fan, Yun Zhang*, Raouf Hamzaoui, Qingshan Jiang, Djemel Ziou

Full-Text     
Deep Learning-Based Chroma Prediction for Intra Versatile Video Coding     

IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), 2020

Linwei Zhu, Yun Zhang*, Shiqi Wang, Sam Kwong, Xin Jin, and Yu Qiao

Full-Text     
Machine Learning Based Video Coding Optimizations: A Survey     

Information Sciences (INS), Elsevier, 2020.

Yun Zhang, Sam Kwong*, Shiqi Wang

Full-Text     
Deep Learning Based Picture-Wise Just Noticeable Distortion Prediction Model for Image Compression     

IEEE Transactions on Image Processing (IEEE T-IP), 2020.

Huanhua Liu, Yun Zhang*, Huan Zhang, Chunling Fan, Sam Kwong, C.C. Jay Kuo, and Xiaoping Fan

Full-Text     
Sparse Representation based Video Quality Assessment for Synthesized 3D Videos     

IEEE Transactions on Image Processing (IEEE T-IP), 2020.

Yun Zhang*, Huan Zhang, Mei Yu, Sam Kwong, and Yo-Sung Ho

Full-Text     
Generative Adversarial Network Based Intra Prediction for Video Coding     

IEEE Transactions on Multimedia (IEEE T-MM), 2020.

Linwei Zhu, Sam Kwong, Yun Zhang, Shiqi Wang, and Xu Wang

Full-Text     
DATABASE
Video-based Crowd Counting Dataset in Compression Scenario         Download         

We built a Video-based Crowd Counting Dataset in Compression Scenario (VCCD-CS) for evaluating video crowd counting methods on crowd videos with different levels of compression distortion, controlled by the quantization parameter (QP). The testing set of the Fudan-ShanghaiTech dataset (FDST) is used as the source of reference videos. FDST is a dataset for video crowd counting. It contains 15K frames with about 394K annotated heads captured from 13 different scenes. The training set consists of 60 videos (9000 frames) and the testing set contains the remaining 40 videos (6000 frames). We encoded the frames of the 40 video sequences in the FDST testing set with QP∈{0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50} using the HEVC test model version 16.20 (HM16.20) under the Low Delay P configuration. The “rec.yuv” files are the distorted output videos, which can be split into frames with “yuvtobmp.exe” for later analysis. Extraction code: z50m
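For readers who prefer not to use “yuvtobmp.exe”, the following is a minimal Python sketch of how a “rec.yuv” file might be split into frames. It assumes the reconstructed video is stored as raw 8-bit 4:2:0 planar YUV (I420), which is not stated in the dataset description and should be verified. The width and height must be supplied by the user, and the names `QPS`, `frame_size_i420`, and `iter_frames` are hypothetical helpers introduced only for illustration.

```python
import io
from typing import BinaryIO, Iterator, Tuple

# QP values listed in the dataset description: 0, 2, ..., 50 (26 values).
QPS = list(range(0, 51, 2))


def frame_size_i420(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 planar frame (assumed storage format)."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma


def iter_frames(
    stream: BinaryIO, width: int, height: int
) -> Iterator[Tuple[bytes, bytes, bytes]]:
    """Yield (Y, U, V) planes for each complete frame in the stream."""
    luma = width * height
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    size = luma + 2 * chroma
    while True:
        buf = stream.read(size)
        if len(buf) < size:
            # End of file, or a truncated trailing frame that is skipped.
            break
        yield buf[:luma], buf[luma:luma + chroma], buf[luma + chroma:]
```

A file would be read with `open("rec.yuv", "rb")` and passed to `iter_frames` together with the sequence resolution. With 40 test sequences and 26 QP values, the dataset should contain 40 × 26 = 1040 distorted videos.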

Subjective Point Cloud Quality Database With 6DoF Head-Mounted Display      Project Page     

We focus on subjective and objective Point Cloud Quality Assessment (PCQA) in an immersive environment and study the effect of geometry and texture attributes in compression distortion. Using a Head-Mounted Display (HMD) with six degrees of freedom, we establish a subjective PCQA database named SIAT Point Cloud Quality Database (SIAT-PCQD). Our database consists of 340 distorted point clouds compressed by the MPEG point cloud encoder with the combination of 20 sequences and 17 pairs of geometry and texture quantization parameters.

SIAT Synthesized Video Quality Database Project Page     

We developed a synthesized video quality database which includes ten different MVD sequences and 140 synthesized videos with resolutions of 1024×768 and 1920×1088. For each sequence, 14 different texture/depth quantization parameter combinations were used to generate the texture/depth view pairs with compression distortion. 56 subjects participated in the experiment. Each synthesized sequence was rated by 40 subjects using a single-stimulus paradigm with continuous scoring. The Difference Mean Opinion Scores (DMOS) are provided.
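As a rough illustration of what a DMOS value represents, the sketch below computes one common form of it: the mean, over subjects, of the difference between the score given to a reference video and the score given to the distorted video. This is an assumption for illustration only. The database's actual processing (for example subject screening or score normalization) is not described here, and the function name `dmos` and the subject identifiers are hypothetical.

```python
from statistics import mean
from typing import Dict


def dmos(ref_scores: Dict[str, float], test_scores: Dict[str, float]) -> float:
    """Mean over subjects of (reference score - distorted score).

    Only subjects who rated both videos are included.
    """
    common = sorted(set(ref_scores) & set(test_scores))
    if not common:
        raise ValueError("no subject rated both videos")
    return mean(ref_scores[s] - test_scores[s] for s in common)


# Two hypothetical subjects rating one reference/distorted pair.
example = dmos({"s1": 80, "s2": 90}, {"s1": 60, "s2": 80})
```

Under this convention a larger DMOS means the distorted video was judged further below its reference.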

SIAT Depth Quality Database      Project Page     

We developed a stereoscopic video depth quality database which includes ten different stereoscopic sequences and 160 distorted stereo videos with a resolution of 1920×1080. The ten sequences are from the Nantes-Madrid-3D-Stereoscopic-V1 (NAMA3DS1) database. The NAMA3DS1 database covers four categories of impairments: H.264 coding, JPEG2000 coding, down-sampling, and sharpening. However, it considers only symmetric distortions. Since both symmetric and asymmetric distortions need to be studied, we generated additional stereoscopic videos with asymmetric distortion. There are 90 symmetrically distorted video pairs and 70 asymmetrically distorted video pairs. 30 subjects (24 male, 6 female) participated in the symmetric distortion experiment and 24 subjects (19 male, 5 female) participated in the asymmetric distortion experiment.

Picture-level JND Database (Symmetric & Asymmetric)      Project Page     

We study the Picture-level Just Noticeable Difference (PJND) of symmetrically and asymmetrically compressed stereoscopic images, where the impairments are JPEG2000 and H.265 intra coding. We conduct interactive subjective quality assessment tests to determine the PJND point using both a pristine image and a distorted image as the reference. We generate two PJND-based stereo image datasets: the Shenzhen Institute of Advanced Technology-picture-level Just noticeable difference-based Symmetric Stereo Image dataset (SIAT-JSSI) and the Shenzhen Institute of Advanced Technology-picture-level Just noticeable difference-based Asymmetric Stereo Image dataset (SIAT-JASI). Each dataset includes ten source images. The PJNDPRI and PJNDDRI are provided. PJNDPRI reveals the minimum distortion against a pristine image. PJNDDRI reveals the minimum distortion against a distorted image.

VOD SYSTEM
Ultra-High Definition 3D Video Live System Project Page     

The 3D video live and on-demand system aims to solve problems in processing, storage, transmission, and quality evaluation, providing a realistic and immersive viewing experience. It can be widely applied in film and television production, video games, remote control, cultural relic protection, military simulation, and other fields.

VR/360° Video Projection Conversion Software Project Page     

Projection is an essential step in virtual reality/panoramic video technology. The projection format affects the compression efficiency of panoramic video. Selecting a projection format suited to the application scenario can effectively reduce video transmission bandwidth and provide customers with a high-quality virtual reality experience.

VR Video Live System Project Page     

The immersive virtual reality video live system lets customers watch 4K ultra-high-definition videos on demand or via live broadcast, and provides a high-quality, realistic, and interactive visual experience from a 360-degree viewpoint.

JND Prediction Software Project Page     

The distortion perceptron software was developed with the PW-JND (Picture-Wise Just Noticeable Difference) prediction model, which uses deep learning. It can be applied in VR image/video compression to maximize compression efficiency without perceptible quality degradation.

Note: These resources shall not be used for commercial purposes. If you have any questions, please contact us (yun.zhang@siat.ac.cn).

NEWS

Apr
2022

Prof. Yun Zhang will host a webinar on advances in Deep-Learning-Based Sensing, Imaging, and Video Processing on April 28. More     

Feb
2022

Congratulations to Dr. Huan Zhang, whose paper “Deep Learning-based Perceptual Video Quality Enhancement for 3D Synthesized View” was accepted by IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT). More

Dec
2021

Prof. Yun Zhang organized the Special Issue "Advances in Deep-Learning-Based Sensing, Imaging, and Video Processing". More

Dec
2021

Congratulations to Dr. Na Li, whose paper “High Efficiency Intra Video Coding Based on Data-driven Transform” was accepted by IEEE Transactions on Broadcasting (IEEE T-BC). More

Dec
2021

Congratulations to Dr. Jinzhi Lin, whose paper “Joint Source-Channel Decoding of Polar Codes for HEVC based Video Streaming” was accepted by ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM). More

July
2021

Congratulations to Ms. Xinju Wu, whose paper “Subjective Quality Database and Objective Study of Compressed Point Clouds with 6DoF Head-mounted Display” was accepted by IEEE Transactions on Circuits and Systems for Video Technology. More

May
2021

Congratulations to Mr. Jinyong Pi, Ms. Xinju Wu, and Ms. Xuemei Zhou on successfully defending their Master's theses. More

Apr
2021

Prof. Yun Zhang’s paper “Deep Learning Based Just Noticeable Difference and Perceptual Quality Prediction Models for Compressed Video” was accepted by IEEE Transactions on Circuits and Systems for Video Technology. More     

Apr
2021

The book "Three Dimensional Video Processing", authored by Prof. Gangyi Jiang, Prof. Yun Zhang, Prof. Mei Yu, and Prof. Zongju Peng, has been published by Science Press, 2020. More

Mar
2021

Congratulations! Our project, "High-efficiency Computational Theory and Method of Video Coding", has received the Natural Science Award from the Ministry of Education (MOE), China. The project was completed by Tongji University, CityU, NUIST, and SIAT, CAS. More

Jan
2021

Prof. Yun Zhang organized the Special Issue "Advances in Deep-Learning-Based Sensing, Imaging, and Video Processing". More

Dec
2020

Congratulations to Ms. Xuemei Zhou, whose paper “Projection Invariant Feature and Visual Saliency Based Stereoscopic Omnidirectional Image Quality Assessment” was accepted by IEEE Transactions on Broadcasting. More

Dec
2020

Congratulations to Ms. Xinju Wu, whose Audio Video coding Standard (AVS) input documents M6064 "Response to N2591" and M6065 "Subjective quality evaluation dataset of point clouds for human vision tasks" were accepted by the Quality Assessment Group of AVS. More

Dec
2020

Congratulations to Dr. Linwei Zhu, whose paper “Circular Intra Prediction for 360 Degree Video Coding” was accepted by the Journal of Visual Communication and Image Representation. More

Dec
2020

Dr. Linwei Zhu and Mr. Jinyong Pi gave oral presentations at the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). More

Dec
2020

Congratulations to Dr. Na Li on her promotion to Associate Professor at SIAT, CAS, Shenzhen, China. More

Nov
2020

Congratulations to Ms. Huan Zhang on successfully defending her Ph.D. thesis. More

Nov
2020

Congratulations to Prof. Yun Zhang on being invited to give a talk entitled "Machine Learning based Video Coding Optimizations". More

Oct
2020

Congratulations to Dr. Linwei Zhu, whose paper “Deep Learning-Based Chroma Prediction for Intra Versatile Video Coding” was accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT). More

Oct
2020

Congratulations to Prof. Yun Zhang, whose paper “Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion” was accepted by IEEE Transactions on Image Processing (TIP). More

Aug
2020

Congratulations to Mr. Jinyong Pi, whose paper “Content-aware Hybrid Equi-angular Cubemap Projection for Omnidirectional Video Coding” was accepted by the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). More

Aug
2020

Congratulations to Dr. Linwei Zhu, whose paper “Sparse Representation-Based Intra Prediction for Lossless/Near Lossless Video Coding” was accepted by the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). More

May
2020

Congratulations to Ms. Xiaoyan Liu on successfully defending her Master's thesis. More