Three Dimensional Video Technology

High Performance Computing Research Center, Institute of Advanced Computing and Digital Engineering

High Performance Computing Center

(HPC)

Topic: High Performance Multimedia Computing (Video)

3D Video System and Applications

Multiview/3D video, i.e. video sequences recorded with multiple cameras, has attracted much attention recently because it can represent high-quality 3D scenes and provides visual experiences beyond 2D, such as the impression of 3D depth and interactive selection of an arbitrary viewpoint/direction within a certain range. Together with advances in display technology, these features enable many new visual media applications, such as photorealistic rendering of 3D scenes, free-viewpoint television (FTV), 3D television (3DTV) broadcasting, and 3D games. Our research group mainly investigates the following three topics in 3D video technology:

Advanced 3D Video Coding Theory and Optimizations
Interactive View Generation and 3D Rendering
3D Visual Perception and 3D Image Quality Assessment

Research Projects

1. Advanced 3D Video Coding Theory and Optimizations

Representing genuine 3D video content requires multiview videos captured simultaneously by multiple cameras at slightly different positions or angles. Compared with traditional mono-view video, this demands far more storage space, transmission bandwidth and computational resources for compression. To compress the large volume of multiview video data efficiently, Multiview Video Coding (MVC) has been developed based on the state-of-the-art H.264/AVC (Advanced Video Coding) or High Efficiency Video Coding (HEVC) standard. In this research, our team mainly investigates the following three topics: 1) exploiting visual perception redundancies as well as view-spatial-temporal correlations/redundancies in multiview video to improve coding efficiency, network adaptation and interactive functionality; 2) developing highly efficient optimization techniques that lower the computational complexity of the codec for real-time 3D broadcasting and 3D video communication; 3) investigating highly efficient multiview depth video coding, with the aim of improving its coding efficiency and reducing its coding complexity. For details, see [1]-[21], [24]-[29] and patents [41]-[43].
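Fast mode decision, one of the complexity-reduction topics above, is usually framed in terms of the Lagrangian rate-distortion cost J = D + λR. The following is a minimal sketch of that general idea with an optional early-termination threshold; it is not the group's published algorithm, and the names ModeCandidate, choose_mode and early_stop_cost, as well as the numbers in the usage lines, are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class ModeCandidate(NamedTuple):
    """One coding mode with its measured distortion and rate (hypothetical type)."""
    name: str
    distortion: float  # e.g. sum of squared differences
    rate_bits: float   # bits needed to code the block in this mode


def rd_cost(candidate: ModeCandidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def choose_mode(
    candidates: Iterable[ModeCandidate],
    lam: float,
    early_stop_cost: Optional[float] = None,
) -> Tuple[Optional[ModeCandidate], float, int]:
    """Return (best mode, its cost, number of modes evaluated).

    Candidates are evaluated in the given order. If early_stop_cost is set,
    the search stops as soon as the best cost so far is at or below it,
    which skips the remaining (usually more expensive) modes.
    """
    best: Optional[ModeCandidate] = None
    best_cost = float("inf")
    checked = 0
    for candidate in candidates:
        checked += 1
        cost = rd_cost(candidate, lam)
        if cost < best_cost:
            best, best_cost = candidate, cost
        if early_stop_cost is not None and best_cost <= early_stop_cost:
            break
    return best, best_cost, checked


# Illustrative numbers only; they are not measurements from any codec.
modes = [
    ModeCandidate("SKIP", distortion=120.0, rate_bits=2.0),
    ModeCandidate("INTER_16x16", distortion=80.0, rate_bits=30.0),
    ModeCandidate("INTRA", distortion=60.0, rate_bits=90.0),
]
full_search = choose_mode(modes, lam=1.0)
fast_search = choose_mode(modes, lam=1.0, early_stop_cost=125.0)
```

In this sketch the full search evaluates all three modes, while the thresholded search stops after the first one and accepts a slightly higher cost. Choosing that threshold well is what an early-termination model has to solve.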

2. Depth Image Based 3D View Rendering and Processing

In a 3D video system, Depth Image Based Rendering (DIBR) is adopted to generate an image of a scene at an arbitrary viewpoint from multiview color video and the corresponding depth video. However, artifacts in the rendered images, including holes caused by occlusion/disocclusion and boundary artifacts, may degrade the subjective and objective image quality. In addition, real-time virtual view rendering becomes extremely challenging as the image resolution increases. In this project, we investigate advanced view rendering algorithms and pre/post-processing algorithms that handle rendering artifacts and improve the rendering quality. Fast arbitrary view rendering methods are also investigated at the algorithm level and at the hardware architecture level, e.g. FPGA and GPU+CPU. For details, see [18]-[19], [31] and [44]-[45].
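To make the hole problem concrete, here is a minimal DIBR sketch for a single image row. It assumes rectified, parallel cameras, so a pixel at depth Z shifts horizontally by the disparity f·B/Z, and it fills holes from the neighbor that lies farther away (the background). This is a textbook-style illustration, not the rendering method of this project; warp_row, fill_holes and the sample values are hypothetical.

```python
from typing import List, Optional, Tuple


def warp_row(
    colors: List[int],
    depths: List[float],
    focal_px: float,
    baseline: float,
) -> Tuple[List[Optional[int]], List[float]]:
    """Forward-warp one row to a virtual view shifted by `baseline`.

    Returns the warped row (None marks a hole) and a z-buffer holding the
    depth of the pixel that landed at each target position.
    """
    width = len(colors)
    warped: List[Optional[int]] = [None] * width
    zbuf = [float("inf")] * width
    for x in range(width):
        z = depths[x]
        disparity = focal_px * baseline / z  # nearer pixels shift more
        xt = int(round(x - disparity))
        # Z-buffer test: when two pixels land on one target, keep the nearer.
        if 0 <= xt < width and z < zbuf[xt]:
            warped[xt] = colors[x]
            zbuf[xt] = z
    return warped, zbuf


def fill_holes(row: List[Optional[int]], zbuf: List[float]) -> List[Optional[int]]:
    """Fill each hole from its nearest valid neighbor with the larger depth."""
    filled = list(row)
    n = len(row)
    for x in range(n):
        if row[x] is not None:
            continue
        left = next((i for i in range(x - 1, -1, -1) if row[i] is not None), None)
        right = next((i for i in range(x + 1, n) if row[i] is not None), None)
        neighbors = [i for i in (left, right) if i is not None]
        if neighbors:
            # Disoccluded areas belong to the background, so prefer far pixels.
            source = max(neighbors, key=lambda i: zbuf[i])
            filled[x] = row[source]
    return filled


# A near object (depth 10) in front of a far background (depth 100).
colors = [10, 20, 30, 40, 50, 60]
depths = [100.0, 100.0, 10.0, 10.0, 100.0, 100.0]
warped, zbuf = warp_row(colors, depths, focal_px=10.0, baseline=2.0)
filled = fill_holes(warped, zbuf)
```

The foreground pixels move by two positions and uncover two target pixels that no source pixel maps to. These are the disocclusion holes, and the simple background-preferring fill is the kind of step that post-processing algorithms aim to improve.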

3. 3D Visual Perception and 3D Image Quality Assessment

3D video systems provide real 3D depth perception and new types of visual enjoyment. However, although 3D video consists of multiview video, the perceived quality of 3D video is not a simple sum of the image qualities of the two or more views. Thus, in this project, we investigate visual perception factors in 3D video, including visual attention, depth perception and masking effects, and develop objective/subjective 3D image/video quality assessment metrics. The findings on 3D perceptual redundancies are then fed back into the design of high-efficiency 3D video coding and processing algorithms. For details, see [20]-[23], [35]-[40] and [46]-[47].
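For reference, the naive baseline that this project argues is insufficient can be written in a few lines: compute PSNR for each view and average the results. This sketch assumes 8-bit samples stored as flat lists; psnr and naive_stereo_psnr are hypothetical helper names, and the function is a baseline to compare against, not a proposed metric.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], distorted: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


def naive_stereo_psnr(
    left_ref: Sequence[float],
    left_dist: Sequence[float],
    right_ref: Sequence[float],
    right_dist: Sequence[float],
) -> float:
    """Average of the two per-view PSNR values.

    This ignores binocular effects such as masking and asymmetric coding,
    so it cannot distinguish a stereo pair with one badly degraded view
    from a pair with two moderately degraded views of equal average PSNR.
    """
    return 0.5 * (psnr(left_ref, left_dist) + psnr(right_ref, right_dist))
```

A perceptual 3D metric would replace the plain average with a model of how the two views are fused by the viewer.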

Research Group Members (Join us)

Team Members in Our Video Group

Dr. Yun Zhang, Ph.D., ICT/CAS, MIEEE, MACM, Associate Professor, Master Supervisor.

Linwei Zhu, M.Phil., Research Assistant

Xiangkai Liu, Ph.D. Student, Visiting Student

Xiaoxiang Yang, M.Phil., Visiting Student

Collaborators

Prof. Sam Kwong, FIEEE, Head of Department of Computer Science, City University of Hong Kong

Prof. Gangyi Jiang, Vice president, Faculty of Information Science and Engineering, Ningbo University

Selected Publications

Journal Paper

[1] Y. Zhang, S. Kwong, L. Xu, S. Hu, G. Jiang, and C.-C. J. Kuo, Regional Bit Allocation and Rate Distortion Optimization for Multiview Depth Video Coding With View Synthesis Distortion Model, IEEE Transactions on Image Processing (IEEE T-IP), June 2013. (Accepted) (SCI, IF 3.042)

[2] Y. Zhang, S. Kwong, L. Xu, and G. Jiang, DIRECT mode early decision optimization based on rate distortion cost property and inter-view correlation, IEEE Transactions on Broadcasting (IEEE T-BC), vol. 59, no. 2, pp. 390-398, May 2013. (SCI, IF 2.242)

[3] X. Wang, S. Kwong, and Y. Zhang, Applying Game Theory to Rate Control Optimization for Hierarchical B-pictures, IEEE Transactions on Broadcasting (IEEE T-BC), March 2013. (In press) (SCI, IF 2.242)

[4] S. Hu, S. Kwong, Y. Zhang, and C.-C. J. Kuo, Rate-Distortion Optimized Rate Control for Depth Map based 3D Video Coding, IEEE Transactions on Image Processing (IEEE T-IP), vol. 22, no. 2, pp. 585-594, Feb. 2013. (SCI, IF 3.042)

[5] Z. Pan, Y. Zhang, and S. Kwong, Fast mode decision based on texture-depth correlation and motion prediction for multiview depth video coding, Journal of Real-Time Image Processing (JRTIP), March 2013. (In press) (SCI, IF 0.962)

[6] Y. Zhang, S. Kwong, G. Jiang, X. Wang, and M. Yu, Statistical early termination model for fast mode decision and reference frame selection in multiview video coding, IEEE Transactions on Broadcasting (IEEE T-BC), vol. 58, no. 1, pp. 10-23, March 2012. (SCI, IF 2.242)

[7] Z. Pan, S. Kwong, L. Xu, Y. Zhang, and T. Zhao, Predictive and distribution-oriented fast motion estimation for H.264/AVC, Journal of Real-Time Image Processing (JRTIP), Springer, June 2012. (In press) (SCI, EI, IF 0.962)

[8] L. Xu, S. Kwong, Y. Zhang, and D. Zhao, Low-complexity encoder framework for window-level rate control optimization, IEEE Transactions on Industrial Electronics (IEEE T-IE), vol. 60, no. 5, pp. 1850-1858, May 2013. (SCI, IF 5.16)

[9] L. Xu, S. Kwong, H. Wang, Y. Zhang, D. Zhao, and W. Gao, A universal rate control scheme for transcoding, IEEE Transactions on Circuits and Systems for Video Technology (IEEE T-CSVT), vol. 22, no. 4, pp. 489-501, Apr. 2012. (SCI, IF 2.548)

[10] Y. Zhang, S. Kwong, G. Jiang, and H. Wang, Efficient multi-reference frame selection algorithm for hierarchical B frames in multiview video coding, IEEE Transactions on Broadcasting (IEEE T-BC), vol. 57, no. 1, pp. 15-24, March 2011. (SCI, IF 2.242)

[11] Y. Zhang, G. Jiang, M. Yu, Low-complexity quantization for H.264/AVC, Journal of Real-Time Image Processing, Springer, vol. 4, no.1, pp 3-12, March, 2009. (SCI, IF 0.962)

[12] Y. Zhang, G. Jiang, M. Yu, and Y. S. Ho, Adaptive multiview video coding scheme based on spatiotemporal correlation analyses, ETRI Journal, vol. 31, no. 2, pp. 151-161, Apr. 2009. (SCI, IF 0.814)

[13] Y. Zhang, M. Yu, and G. Jiang, New approach to multi-modal multiview video coding, Chinese Journal of Electronics, vol.18, no.2, pp.338-342, 2009. (SCI, IF 0.148)

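[14] G. Jiang, Y. Zhang, and M. Yu, Multi-mode multiview video coding method based on correlation analysis, Chinese Journal of Computers, vol. 30, no. 12, pp. 2205-2211, Dec. 2007. (EI, in Chinese)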

[15] Y. Zhang, M. Yu, and G. Jiang, Evaluation of typical prediction structures for multi-view video coding, ISAST Transactions on Electronics and Signal Processing, vol.2, no.1, 2008, pp.7-15.

[16] M. Yu, P. He, Z. Peng, Y. Zhang, Y. Si, and G. Jiang, Fast macroblock mode selection algorithm for B Frames in multiview video coding, KSII Transactions on Internet and Information Systems, vol. 5, no. 2, pp.408-427, February 2011. (SCI, IF 0.200)

[17] Z. Peng, M. Yu, G. Jiang, F. Shao, Y. Zhang, and Y. Yang, A fast macroblock mode selection algorithm for multiview depth video coding, Chinese Optics Letters, vol.8, no. 2, pp.151-154, Feb. 2010. (SCI, IF 0.804)

[18] F. Shao, G. Jiang, M. Yu, and Y. Zhang, Object-based depth image based rendering for a three-dimensional video system by color-correction optimization, Optical Engineering, vol. 50, no.4, article 047006, Apr. 2011. (SCI, IF 0.553)

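[19] B. Zhu, G. Jiang, Y. Zhang, and M. Yu, Depth map coding algorithm oriented to virtual view image rendering, Journal of Optoelectronics·Laser, 21(5):718-724, 2010. (EI, in Chinese)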

[20] Y. Zhang, G. Jiang, M. Yu, Y. Yang, Z. Peng, and K. Chen, Depth perceptual region-of-interest based multiview video coding, Journal of Visual Communication and Image Representation, Elsevier, vol. 21, no. 5-6, pp. 498-512, Jul.-Aug. 2010. (SCI, IF 1.326)

[21] Y. Zhang, G. Jiang, M. Yu, K. Chen, and Q. Dai, Stereoscopic visual attention based bit allocation optimization for multiview video coding, EURASIP Journal on Advances in Signal Processing, Volume 2010, Article ID 848713, 24 pages, 2010. (SCI, IF 1.012)

[22] X. Wang, G. Jiang, J. Zhou, Y. Zhang, F. Shao, Z. Peng, and M. Yu, Just noticeable difference of compressed stereoscopic image: effects of asymmetric coding, Imaging Science Journal, 2011. (In press) (SCI, IF 0.169)

Conference Paper

[23] Z. Zheng, Y. Zhang, Q. Chen, Salient Object Detection Using Background and Foreground Prior, IEEE International Conference on Image Processing (ICIP2013), Melbourne, Australia, Jan 2013, Accepted (Oral) (EI)

[24] Z. Pan, Y. Zhang, S. Kwong, X. Wang, and L. Xu, Early termination for TZSearch in HEVC Motion Estimation, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP2013), Vancouver, Canada, May 2013. (Accepted, Oral) (EI)

[25] T. Zhao, Y. Zhang, S. Kwong, H. Wang, and Q. Chen, Adaptive Rate-Distortion Prediction for Multiple Reference Selection and Inter-Mode Decision, Pacific-Rim Conference on Multimedia (PCM2012), Lecture Notes in Computer Science, Singapore, Dec. 2012. (EI, Oral)

[26] D. Mao, Y. Zhang, G. Chen, and S. Kwong, Disparity and Motion Activity Based Mode Prediction for Fast Mode Decision in Multiview Video Coding, Pacific-Rim Conference on Multimedia (PCM2011), Lecture Notes in Computer Science, Sydney, Dec. 2011. (EI, Oral)

[27] Z. Pan, S. Kwong, and Y. Zhang, A Multiple Hexagon Search Algorithm for Motion and Disparity Estimation in Multiview Video Coding, Pacific-Rim Conference on Multimedia (PCM2011), Lecture Notes in Computer Science, Sydney, Dec. 2011. (EI, Oral)

[28] Y. Zhang, G. Jiang, W. Ye, M. Yu, F. Li, and T. Choi, An improved design of quantization for H.264 video coding standard, International Conference on Signal Processing (ICSP06), Guilin, China, pp.1401-1404, Nov. 2006. (EI)

[29] Y. Zhang, G. Jiang , W. Ye, M. Yu , Z. Jiang, and Y. D. Kim, An approach to multi-mode multi-view video coding, International Conference on Signal Processing (ICSP06), Guilin, China, pp.1405-1408, Nov. 2006.(EI)

[30] Z. Peng, M. Yu, G. Jiang, F. Shao, Y. Zhang, Z. Jiang, and F. Li, Depth sequence coding based on a joint MVD scheme, ISO/IEC JTC1/SC29 WG11 MPEG, m16852, Xi’an, China, Oct. 2009.

[31] B. Zhu, G. Jiang, Y. Zhang, Z. Peng, and M. Yu, View synthesis oriented depth map coding algorithm, Asia-Pacific Conference on Information Processing (APCIP2009), ShenZhen, China, Jul. 18-19, 2009. (EI)

[32] B. Zhu, G. Jiang, Y. Zhang, F. Shao, and Z. Peng, Fast Block Based Virtual View Synthesis Algorithm, ISO/IEC JTC1/SC29/WG11, MPEG2009/M16850, Xi’an, China, Oct. 2009.

[33] X. Lu, M. Yu, Y. Zhang, and G. Jiang, Motion detection based on temporal-to-spatial conversion of depth maps for multi-view video, International Conference on Wireless Communication and Signal Processing (WCSP2009), Nanjing, China, p1-5, Nov. 2009. (EI)

[34] F. Shao, G. Jiang, M. Yu, Z. Peng, Y. Zhang, Z. Jiang, F. Li, Color Correction Preprocessing and Chrominance Reconstruction Post processing for Multiview video coding, ISO/IEC JTC1/SC29 WG11 MPEG, m16886, Xi’an, China, Oct 2009.

[35] X. Wang, S. Kwong, and Y. Zhang, Considering Binocular Spatial Sensitivity in Stereoscopic Image Quality Assessment, IEEE Conference on Visual Communications and Image Processing (VCIP2011), Lecture Notes in Computer Science, Taiwan, Nov. 2011. (EI, Invited Paper, Oral)

[36] Y. Zhang, G. Jiang, M. Yu, and K. Chen, Stereoscopic visual attention model for 3D video, in Proc. The 16th International Multimedia Modeling Conference (MMM2010), Lecture Notes in Computer Sciences, vol. 5916, Springer Verlag, Chongqing, China, pp. 314-324, Jan. 2010. (EI)

[37] Y. Zhang, G. Jiang, M. Yu, and K. Chen, Depth-spatio-temporal joint region-of-interest extraction and tracking for 3D video, The 2009 International Conference on Future Generation Information Technology (FGIT 2009), Lecture Notes in Computer Sciences, vol. 5899, Springer Verlag, pp. 268-276, Jeju Island, Korea, Dec., 2009. (EI)

[38] Y. Zhang, M. Yu, G. Jiang, Z. Peng, and Y. Yang, Low-complexity region-of-interest extraction for multiview video coding, The 2nd International Congress on Image and Signal Processing (CISP 2009), Tianjin, China, pp. 282-286, Oct. 2009. (EI)

[39] Y. Zhang, M. Yu, and G. Jiang, Depth based region of interest extraction for multi-view video coding, International Conference on Machine Learning and Cybernetics (ICMLC 2009), vol.4, pp. 2221 - 2227, Baoding, China, July, 2009. (EI)

[40] J. Zhou, G. Jiang, X. Mao, M. Yu, F. Shao, Z. Peng, Y. Zhang, Subjective quality analyses of stereoscopic images in 3DTV system. IEEE Conference on Visual Communications and Image Processing (VCIP2011), pp. 1-4, Taiwan, Nov, 2011

CN Patents

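[41] Y. Zhang, T. Zhao, X. Chen, Z. Wang, and Q. Chen, Coding method for multiview video signals, patent application No. 201110459761.X

[42] G. Jiang, Y. Zhang, and M. Yu, A quantization method for the video image coding process, Chinese invention patent No. ZL 200510060793.7, State Patent Office of China

[43] G. Jiang, M. Yu, and Y. Zhang, Multi-mode multiview video signal coding and compression method, Chinese invention patent No. ZL 200610052895.9, State Patent Office of China

[44] Y. Zhang, L. Zhu, G. Jiang, and Q. Chen, Post-processing method for virtual view images, Chinese invention patent, application No. 201210132641.3

[45] Y. Zhang, L. Zhu, and Q. Chen, An image and video reconstruction method for a codec system, Chinese invention patent, application No. 201210460004.9

[46] Y. Zhang, G. Jiang, and M. Yu, A depth-based method for extracting video regions of interest, invention patent No. ZL 200910099706.7, State Patent Office of China

[47] Y. Zhang, G. Jiang, and M. Yu, A visual-attention-based method for extracting video regions of interest, invention patent No. ZL 200910152520.3, State Patent Office of China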

Research Grants

National Natural Science Foundation of China (NSFC), Quality-of-Experience Based High Efficiency 3D Video Coding, Jan. 2015 ~ Dec. 2018, PI
NSFC, Multiview Video Coding Research Exploiting Regional Selective Visual Redundancies, Jan. 2012 ~ Dec. 2014, PI
Shenzhen Match Funding for NSFC, Multiview Video Coding Research Exploiting Regional Selective Visual Redundancies, Jan. 2014 ~ Dec. 2014, PI
Guangdong Provincial Natural Science Foundation, Video Compression and Reconstruction Technology for Mobile 3D Communication System, Oct. 2012 ~ Sept. 2014, PI
Shenzhen Emerging Industries of the Strategic Basic Research Grant, Multiview Video Coding and View Rendering Optimization for 3D Video System, Jan. 2013 ~ Dec. 2014, PI
Shenzhen Nanshan Key Technologies Research Grant, R&D for Key Technologies in Secure Video Streaming, Sept. 2013 ~ Aug. 2015, Co-PI

Experimental Setup/Equipment

3D Video Capture

		Stereoscopic video capturing system, which consists of two or more mono-view cameras (Flea2, 1024x768@30fps) with SW synchronization, or one stereo camera (Bumblebee2, 640x480@25fps with 2 views). For the stereo camera, the depth video can be generated by its HW stereo-matching algorithm.
		Multiview video capturing system with 9 views, 1024x768@25fps for each view, YUV 4:2:0. Both HW and SW synchronization are supported, and the synchronization error is within 50 µs when HW synchronization is used. The frame rate can be set from 1 to 60 fps. Remote control and remote parameter settings are supported, including brightness, frame rate, number of frames to capture, shutter, contrast, saturation and storage directory. This system is installed at the research team of our collaborator Prof. Jiang.
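The figures above give a feel for why multiview compression is needed. The short calculation below estimates the raw, uncompressed data rate of the 9-view system. It assumes 8 bits per sample, which the text does not state, and raw_bitrate_bps is a hypothetical helper name.

```python
def raw_bitrate_bps(width: int, height: int, fps: int, views: int,
                    bits_per_sample: int = 8) -> int:
    """Raw (uncompressed) bitrate in bits per second for YUV 4:2:0 video."""
    luma_samples = width * height
    # 4:2:0 subsampling: each of the two chroma planes has a quarter
    # of the luma samples.
    chroma_samples = 2 * (width // 2) * (height // 2)
    bits_per_frame = (luma_samples + chroma_samples) * bits_per_sample
    return bits_per_frame * fps * views


# 9 views, 1024x768 at 25 fps, assuming 8-bit samples.
rate = raw_bitrate_bps(1024, 768, fps=25, views=9)
print(f"{rate / 1e9:.2f} Gbit/s")  # prints 2.12 Gbit/s
```

Under that assumption the system produces about 2.12 Gbit/s, or roughly 265 MB/s, of raw video.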

3D Display System

Stereoscopic video player, developed in C++ with the DirectShow library. It supports two-view video capturing, encoding and transmission/storage, with MPEG-2, MPEG-4, Xvid and H.264 coding formats. It also supports 3D video playback, decoding and 3D display. Filters have been developed to support different input data formats (such as top-bottom, left-right and interlaced) and output interfaces (such as red-cyan or red-blue anaglyph for a traditional LCD display, or a polarized stereo display system).

Anaglyph (with red-cyan glasses)
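As an illustration of one input/output pairing mentioned above, the sketch below converts a left-right (side-by-side) frame into a red-cyan anaglyph by taking the red channel from the left view and the green and blue channels from the right view. It is a plain-Python sketch on nested lists of RGB tuples, not the DirectShow filter used in the player; side_by_side_to_anaglyph is a hypothetical name.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)
Frame = List[List[Pixel]]     # rows of pixels


def side_by_side_to_anaglyph(frame: Frame) -> Frame:
    """Convert a left-right packed stereo frame to a red-cyan anaglyph.

    The left half of each row is the left view and the right half is the
    right view. The output has half the input width.
    """
    anaglyph: Frame = []
    for row in frame:
        if len(row) % 2 != 0:
            raise ValueError("side-by-side rows must have an even width")
        half = len(row) // 2
        left_view, right_view = row[:half], row[half:]
        anaglyph.append([
            # Red from the left eye, green and blue (cyan) from the right eye.
            (left[0], right[1], right[2])
            for left, right in zip(left_view, right_view)
        ])
    return anaglyph


# One row, two pixels per view.
frame = [[(255, 0, 0), (10, 20, 30), (0, 255, 255), (40, 50, 60)]]
result = side_by_side_to_anaglyph(frame)
```

A top-bottom input would be handled the same way after splitting the frame by rows instead of by columns.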

		Polarized Stereoscopic Display System. This system is equipped with a Barco 3D projector (Galaxy 7 Classic+, DLP) and active shutter glasses. The projector offers 7000 lumens of brightness and a 1600:1 contrast ratio, and supports XGA resolution (1024x768). Its stereo separation technology combines high contrast with excellent stereo separation.

Polarized Stereoscopic Display System (Active)

		Autostereoscopic (glasses-free) multiview 3D display. The screen is 46 inches with 1920x1080 resolution, a lenticular lens and a 1800:1 contrast ratio. Supported input data formats are 8-view or 9-view, 2-view, and 1 view + 1 depth. It provides autostereoscopic real 3D depth perception, and its maximum visual depth is about 1.5 m.


Autostereoscopic Multiview Video Display

Contacts

If you are interested in our work, please contact Yun Zhang by email at yun.zhang@siat.ac.cn, or by post at: RM706, Section B, 1068 Xueyuan Boulevard, University Town of Shenzhen, Xili, Nanshan, Shenzhen 518055, China.

Created on Dec. 1st, 2012, updated on Nov. 1st, 2013