SIAT VIDEO TEAM

INTRODUCTION

Welcome to the SIAT Video Team (SVT), made up of members from the High Performance Computing Center (HPCC), Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS). We have long been engaged in multimedia communications and visual signal processing for 2D/3D and VR/AR videos, including video coding, visual signal pre/post-processing, and computational visual perception. We are also pursuing challenging problems in emerging areas such as VR/AR and AI.
Join Us

Video Coding and Communication

VR/AR Technology

Computational Visual Perception

Intelligent Video Analytics

RECENT PUBLICATIONS

	Deep Learning-based Perceptual Video Quality Enhancement for 3D Synthesized View
	*IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT),* 2022. Huan Zhang, Yun Zhang (*Corresponding Author), Linwei Zhu, Weisi Lin Full-Text
	High Efficiency Intra Video Coding Based on Data-driven Transform
	*IEEE Transactions on Broadcasting (IEEE T-BC),* 2021. Na Li, Yun Zhang (*Corresponding Author), C.C. Jay Kuo Full-Text
	Joint Source-Channel Decoding of Polar Codes for HEVC based Video Streaming
	*ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM),* 2021. Jinzhi Lin, Yun Zhang (*Corresponding Author), Na Li, and Hongling Jiang Full-Text
	Subjective Quality Database and Objective Study of Compressed Point Clouds with 6DoF Head-mounted Display
	*IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT),* 2021. Xinju Wu, Yun Zhang, Chunling Fan, Junhui Hou, Sam Kwong Full-Text Database Convideo
	Deep Learning Based Just Noticeable Difference and Perceptual Quality Prediction Models for Compressed Video
	*IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT),* 2021. Yun Zhang, Huanhua Liu, You Yang, Xiaoping Fan, Sam Kwong, C.C. Jay Kuo Full-Text
	Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion
	*IEEE Transactions on Image Processing (IEEE T-IP),* 2021. Yun Zhang*, Linwei Zhu, Raouf Hamzaoui, Sam Kwong, Yo-Sung Ho Full-Text
	Projection Invariant Feature and Visual Saliency-Based Stereoscopic Omnidirectional Image Quality Assessment
	*IEEE Transactions on Broadcasting (IEEE T-BC),* 2021. Xuemei Zhou, Yun Zhang*, Na Li, Xu Wang, Yang Zhou, and Yo-Sung Ho Full-Text
	Learning-based Satisfied User Ratio Prediction for Symmetrically and Asymmetrically Compressed Stereoscopic Images
	*IEEE Multimedia (IEEE MM),* 2021. Chunling Fan, Yun Zhang*, Raouf Hamzaoui, Qingshan Jiang, Djemel Ziou Full-Text
	Deep Learning-Based Chroma Prediction for Intra Versatile Video Coding
	*IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT),* 2020. Linwei Zhu, Yun Zhang*, Shiqi Wang, Sam Kwong, Xin Jin, and Yu Qiao Full-Text
	Machine Learning Based Video Coding Optimizations: A Survey
	*Information Sciences (INS),* Elsevier, 2020. Yun Zhang, Sam Kwong*, Shiqi Wang Full-Text
	Deep Learning Based Picture-Wise Just Noticeable Distortion Prediction Model for Image Compression
	*IEEE Transactions on Image Processing (IEEE T-IP),* 2020. Huanhua Liu, Yun Zhang*, Huan Zhang, Chunling Fan, Sam Kwong, C.C. Jay Kuo, and Xiaoping Fan Full-Text
	Sparse Representation based Video Quality Assessment for Synthesized 3D Videos
	*IEEE Transactions on Image Processing (IEEE T-IP),* 2020. Yun Zhang*, Huan Zhang, Mei Yu, Sam Kwong, and Yo-Sung Ho Full-Text
	Generative Adversarial Network Based Intra Prediction for Video Coding
	*IEEE Transactions on Multimedia (IEEE T-MM),* 2020. Linwei Zhu, Sam Kwong, Yun Zhang, Shiqi Wang, and Xu Wang Full-Text

DATABASE

	Video-based Crowd Counting Dataset in Compression Scenario Download
	We built a Video-based Crowd Counting Dataset in Compression Scenario (VCCD-CS) for evaluating video crowd counting methods on crowd videos with different levels of compression distortion, controlled by the quantization parameter (QP). The testing set of the Fudan-ShanghaiTech dataset (FDST) is selected as the source of reference videos. FDST is a dataset for video crowd counting. It contains 15K frames with about 394K annotated heads captured from 13 different scenes. The training set consists of 60 videos (9000 frames) and the testing set contains the remaining 40 videos (6000 frames). We encoded the frames of the 40 video sequences in the FDST testing set with QP∈{0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50} using the HEVC test model version 16.20 (HM16.20) under the Low Delay P configuration. The “rec.yuv” files are the output distorted videos, which can be decomposed into frames with “yuvtobmp.exe” for convenient analysis. Extraction code: z50m
	Subjective Point Cloud Quality Database With 6DoF Head-Mounted Display Project Page
	We focus on subjective and objective Point Cloud Quality Assessment (PCQA) in an immersive environment and study the effect of geometry and texture attributes in compression distortion. Using a Head-Mounted Display (HMD) with six degrees of freedom, we establish a subjective PCQA database named SIAT Point Cloud Quality Database (SIAT-PCQD). Our database consists of 340 distorted point clouds compressed by the MPEG point cloud encoder with the combination of 20 sequences and 17 pairs of geometry and texture quantization parameters.
	SIAT Synthesized Video Quality Database Project Page
	We develop a synthesized video quality database which includes ten different MVD sequences and 140 synthesized videos with resolutions of 1024×768 and 1920×1088. For each sequence, 14 different texture/depth quantization parameter combinations were used to generate the texture/depth view pairs with compression distortion. A total of 56 subjects participated in the experiment. Each synthesized sequence was rated by 40 subjects using a single-stimulus paradigm with continuous scoring. The Difference Mean Opinion Scores (DMOS) are provided.
	SIAT Depth Quality Database Project Page
	We develop a stereoscopic video depth quality database which includes ten different stereoscopic sequences and 160 distorted stereo videos with a resolution of 1920×1080. The ten sequences are from the Nantes-Madrid-3D-Stereoscopic-V1 (NAMA3DS1) database. There are four categories of impairments in the NAMA3DS1 database: H.264 coding, JPEG2000 coding, down-sampling, and sharpening. However, only symmetric distortions are considered in the NAMA3DS1 database. Since both symmetric and asymmetric distortions need to be studied, we generate additional stereoscopic videos with asymmetric distortion. There are 90 symmetrically distorted video pairs and 70 asymmetrically distorted video pairs. Thirty subjects (24 male, 6 female) participated in the symmetric distortion experiment and 24 subjects (19 male, 5 female) participated in the asymmetric distortion experiment.
	Picture-level JND Database (Symmetric & Asymmetric) Project Page
	We study the Picture-level Just Noticeable Difference (PJND) of symmetrically and asymmetrically compressed stereoscopic images, where the impairments are JPEG2000 and H.265 intra coding. We conduct interactive subjective quality assessment tests to determine the PJND point using both a pristine image and a distorted image as the reference. We generate two PJND-based stereo image datasets: the Shenzhen Institute of Advanced Technology-picture-level Just noticeable difference-based Symmetric Stereo Image dataset (SIAT-JSSI) and the Shenzhen Institute of Advanced Technology-picture-level Just noticeable difference-based Asymmetric Stereo Image dataset (SIAT-JASI). Each dataset includes ten source images. The PJNDPRI and PJNDDRI are provided. PJNDPRI reveals the minimum distortion against a pristine image. PJNDDRI reveals the minimum distortion against a distorted image.
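	For the VCCD-CS dataset described above, the distorted “rec.yuv” outputs need to be split into frames before analysis. The following is a minimal Python sketch of that step, not the team's “yuvtobmp.exe” tool. It assumes the files are raw 8-bit YUV 4:2:0 planar data with even width and height, which the description does not state; the function names and the width/height parameters are hypothetical and must be supplied by the user. It also enumerates the QP values listed in the description.

```python
from typing import Iterator, List, Tuple

# QP values listed in the VCCD-CS description: 0, 2, ..., 50.
QP_VALUES: List[int] = list(range(0, 51, 2))
# Number of FDST testing-set videos, per the description.
NUM_TEST_VIDEOS = 40


def frame_size_yuv420(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV 4:2:0 planar frame (assumed format)."""
    chroma = (width // 2) * (height // 2)
    return width * height + 2 * chroma


def split_yuv420(data: bytes, width: int, height: int) -> Iterator[Tuple[bytes, bytes, bytes]]:
    """Yield the (Y, U, V) planes of every complete frame in a raw buffer.

    A trailing partial frame, if any, is ignored.
    """
    luma = width * height
    chroma = (width // 2) * (height // 2)
    size = luma + 2 * chroma
    for start in range(0, len(data) - size + 1, size):
        y = data[start:start + luma]
        u = data[start + luma:start + luma + chroma]
        v = data[start + luma + chroma:start + size]
        yield y, u, v
```

	A caller would read a file with `open(path, "rb").read()` and pass the actual frame width and height of the sequence. If each of the 40 testing videos is encoded once per QP value, `NUM_TEST_VIDEOS * len(QP_VALUES)` gives the number of distorted sequences.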

VOD SYSTEM

	Ultra-High Definition 3D Video Live System Project Page
	The 3D video live and on-demand system aims to solve issues in processing, storage, transmission, and quality evaluation, providing a realistic and immersive viewing experience. This system can be widely applied in film and television production, video games, remote control, cultural relic protection, military simulation, and other fields.
	VR/360° Video Projection Conversion Software Project Page
	Projection is one of the essential procedures in virtual reality/panoramic video technology. The projection format affects the compression efficiency of panoramic video. Selecting a suitable projection format for each application scenario can effectively reduce the video transmission bandwidth and provide customers with a high-quality virtual reality experience.
	VR Video Live System Project Page
	The immersive virtual reality video live system allows customers to watch 4K Ultra High Definition videos on demand or via live broadcast, and provides a high-quality, realistic, and interactive visual experience from a 360-degree viewpoint.
	JND Prediction Software Project Page
	The distortion perception software was developed with the PW-JND (Picture-Wise Just Noticeable Difference) prediction model, which is based on deep learning. This software can be applied in VR image/video compression to maximize compression efficiency without perceptible quality degradation.

Note: These resources shall not be used for commercial purposes. If you have any questions, please contact us (yun.zhang@siat.ac.cn).

NEWS

Apr 2022	Prof. Yun Zhang will host a webinar on advances in Deep-Learning-Based Sensing, Imaging, and Video Processing on April 28. More
Feb 2022	Congratulations to Dr. Huan Zhang, whose paper “Deep Learning-based Perceptual Video Quality Enhancement for 3D Synthesized View” was accepted by IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT). More
Dec 2021	Prof. Yun Zhang organized the Special Issue "Advances in Deep-Learning-Based Sensing, Imaging, and Video Processing". More
Dec 2021	Congratulations to Dr. Na Li, whose paper “High Efficiency Intra Video Coding Based on Data-driven Transform” was accepted by IEEE Transactions on Broadcasting (IEEE T-BC). More
Dec 2021	Congratulations to Dr. Jinzhi Lin, whose paper “Joint Source-Channel Decoding of Polar Codes for HEVC based Video Streaming” was accepted by ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM). More
July 2021	Congratulations to Ms. Xinju Wu, whose paper “Subjective Quality Database and Objective Study of Compressed Point Clouds with 6DoF Head-mounted Display” was accepted by IEEE Transactions on Circuits and Systems for Video Technology. More
May 2021	Congratulations to Mr. Jinyong Pi, Ms. Xinju Wu, and Ms. Xuemei Zhou on successfully defending their Master's theses. More
Apr 2021	Prof. Yun Zhang’s paper “Deep Learning Based Just Noticeable Difference and Perceptual Quality Prediction Models for Compressed Video” was accepted by IEEE Transactions on Circuits and Systems for Video Technology. More
Apr 2021	The book "Three Dimensional Video Processing", authored by Prof. Gangyi Jiang, Prof. Yun Zhang, Prof. Mei Yu, and Prof. Zongju Peng, has been published by Science Press, 2020. More
Mar 2021	Congratulations: our project titled "High-efficiency Computational Theory and Method of Video Coding" has been awarded the Natural Science Award by the Ministry of Education (MOE), China. This project was completed by Tongji University, CityU, NUIST, and SIAT, CAS. More
Jan 2021	Prof. Yun Zhang organized the Special Issue "Advances in Deep-Learning-Based Sensing, Imaging, and Video Processing". More
Dec 2020	Congratulations to Ms. Xuemei Zhou, whose paper “Projection Invariant Feature and Visual Saliency Based Stereoscopic Omnidirectional Image Quality Assessment” was accepted by IEEE Transactions on Broadcasting. More
Dec 2020	Congratulations to Ms. Xinju Wu, whose Audio Video coding Standard (AVS) input documents M6064 "Response to N2591" and M6065 "Subjective quality evaluation dataset of point clouds for human vision tasks" were accepted by the Quality Assessment Group of AVS. More
Dec 2020	Congratulations to Dr. Linwei Zhu, whose paper “Circular Intra Prediction for 360 Degree Video Coding” was accepted by Journal of Visual Communication and Image Representation. More
Dec 2020	Dr. Linwei Zhu and Mr. Jinyong Pi gave oral presentations at the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). More
Dec 2020	Congratulations to Dr. Na Li on her promotion to Associate Professor at SIAT CAS, Shenzhen, China. More
Nov 2020	Congratulations to Ms. Huan Zhang on successfully defending her Ph.D. thesis. More
Nov 2020	Congratulations to Prof. Yun Zhang for being invited to give a talk entitled "Machine Learning based Video Coding Optimizations". More
Oct 2020	Congratulations to Dr. Linwei Zhu, whose paper “Deep Learning-Based Chroma Prediction for Intra Versatile Video Coding” was accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT). More
Oct 2020	Congratulations to Prof. Yun Zhang, whose paper “Highly Efficient Multiview Depth Coding Based on Histogram Projection and Allowable Depth Distortion” was accepted by IEEE Transactions on Image Processing (TIP). More
Aug 2020	Congratulations to Mr. Jinyong Pi, whose paper “Content-aware Hybrid Equi-angular Cubemap Projection for Omnidirectional Video Coding” was accepted by the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). More
Aug 2020	Congratulations to Dr. Linwei Zhu, whose paper “Sparse Representation-Based Intra Prediction for Lossless/Near Lossless Video Coding” was accepted by the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). More
May 2020	Congratulations to Ms. Xiaoyan Liu on successfully defending her Master's thesis. More