H.264/SVC Mode Decision Based on Mode Correlation and Desired Mode List (pdf)

International Journal of Automation and Computing 11(5), October 2014, 510-516
DOI: 10.1007/s11633-014-0830-5

L. Balaji1, K. K. Thyagharajan2
1 Faculty of Information and Communication, Anna University, Chennai 600025, India
2 RMD Engineering College, Chennai 601206, India

Abstract: The design of video encoders involves implementing a fast mode decision (FMD) algorithm to reduce computational complexity while maintaining coding performance. Although H.264/scalable video coding (SVC) achieves high scalability and coding efficiency, its exhaustive mode search is highly complex. In this paper, a novel algorithm is proposed that reduces the redundant candidate modes by exploiting the correlation among layers. For each block in the base layer, a desired mode list is created from the probability of each mode being the best mode; in the enhancement layer, candidate modes are selected using the correlation of modes between the reference frame and the current frame. Our algorithm is implemented in the joint scalable video model (JSVM) 9.19.15 reference software, and its performance is evaluated in terms of average encoding time, peak signal to noise ratio (PSNR) and bit rate. The experimental results show a 41.89% improvement in encoding time with a minimal loss of 0.02 dB in PSNR and a 0.05% increase in bit rate.

Keywords: H.264, scalable video coding, mode decision, mode correlation, rate distortion cost.

1 Introduction

Multimedia applications delivered through digital broadcasting to various kinds of devices (mobile phones, laptops, personal digital assistants (PDAs), high definition television (HDTV), standard definition television (SDTV), etc.) are increasingly important. Because the available bandwidth varies, they need better scalability in video coding. To meet this need, a scalable extension of H.264/advanced video coding (AVC) was standardized in 2007 as H.264 scalable video coding [1].
Regular paper. Special Issue on Massive Visual Computing. Manuscript received January 13, 2014; accepted June 20, 2014.

Reference software for scalable video coding was developed by the Joint Video Team (JVT), formed jointly by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) [2, 3]. H.264/scalable video coding (SVC) standardizes spatial, temporal and signal to noise ratio (SNR) or quality scalability on top of H.264/AVC [3], and its performance in achieving high coding efficiency has been evaluated [4]. In spatial scalability, the picture with the lowest spatial resolution is the base layer and is encoded as an H.264/AVC compatible bit stream, whereas the higher-resolution picture, coded as the residue between the original signal and the upsampled reconstructed base layer signal, is the enhancement layer. In temporal scalability, a hierarchical B picture approach is used for a particular spatial layer with zero structural delay. H.264/SVC uses I, P and B pictures. An I or P picture serves as the key picture, which is encoded at regular intervals using only the previous key picture as reference, and B pictures encode the pictures between two key pictures. The group of pictures (GOP) size determines the number of temporal layers in a spatial layer, where a GOP is a key picture followed by all the temporally located pictures up to the next key picture. The relation between spatial and temporal scalability is handled by SNR or quality scalability, which is based on different spatio-temporal reconstruction quality levels, namely coarse grain scalability (CGS) and medium grain scalability (MGS). CGS uses a single temporal layer per spatial layer, and MGS uses multiple temporal layers per spatial layer.
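The dependence of the number of temporal layers on the GOP size can be illustrated with a minimal sketch. It assumes a dyadic hierarchical B structure with a power-of-two GOP size, in which case a GOP of size N yields log2(N) + 1 temporal layers. The function names `temporal_layer` and `num_temporal_layers` are hypothetical and do not come from the paper or from JSVM.

```python
def num_temporal_layers(gop_size: int) -> int:
    """Number of temporal layers for a dyadic hierarchy: log2(gop_size) + 1."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two (assumption of this sketch)")
    return gop_size.bit_length()


def temporal_layer(position: int, gop_size: int) -> int:
    """Temporal layer of the picture at display position `position`.

    Key pictures (position a multiple of gop_size) are in layer 0.
    The B pictures in between are assigned to higher layers by
    repeatedly halving the distance between already coded pictures.
    """
    levels = num_temporal_layers(gop_size) - 1  # log2(gop_size)
    offset = position % gop_size
    if offset == 0:
        return 0  # key picture
    # Count trailing zero bits of the offset inside the GOP.
    trailing_zeros = (offset & -offset).bit_length() - 1
    return levels - trailing_zeros


if __name__ == "__main__":
    gop = 8
    print("temporal layers:", num_temporal_layers(gop))
    for pos in range(gop + 1):
        print(pos, temporal_layer(pos, gop))
```

With a GOP size of 8 this gives four temporal layers: the key pictures at positions 0 and 8 are in layer 0, position 4 is in layer 1, positions 2 and 6 are in layer 2, and the odd positions are in layer 3.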
Although H.264/SVC, with a single bit stream that adapts to various bit rates, transmission channel bandwidths and display capabilities, achieves high scalability and high coding efficiency, the computational complexity of its encoder is inherently very high. Because of the hierarchical B picture approach in the temporal layer, the full search algorithm implemented in the joint scalable video model (JSVM) must search all the modes to find the best candidate mode for prediction. This makes the encoder complex and time consuming. To address this issue, many fast mode decision (FMD) algorithms have been proposed that lower the complexity by reducing the redundant candidate modes in H.264/SVC. These works predict the redundant modes using the rate distortion cost (RDC) function and the correlation within the hierarchical B picture structure. They decrease the computational complexity effectively, but at the cost of degraded video quality, and they are not suitable for sequences with large motion. Nowadays, the many handheld devices in use, each with its typical structural implementation, place increasing demands on video quality, and it is in the enhancement layer that the quality has to be increased. Conserving power on handheld devices is also an important issue, particularly for real time video applications. Overall, any algorithm must give priority to both video quality and reduction of computational complexity. In this paper, we focus on reducing the candidate modes by using a probability model and mode correlation. The probability model creates a list of the modes most likely to be the best in the base layer, and mode correlation decides the best mode in the enhancement layer. The rest of this paper is organized as follows.
In Section 2, background and related work on fast mode decision algorithms implemented in SVC, the rate distortion cost procedure, the probabilistic model and mode correlations are discussed. Section 3 presents the proposed algorithm for complexity reduction. The experimental results, with a comparative analysis, are discussed in Section 4. Section 5 concludes this paper.

2 Background and related work

Three new inter-layer prediction modes, based on motion vectors, residuals, and intra information from the base layer, were introduced to select the best coding mode in the enhancement layer. With these inter-layer prediction modes, coding efficiency improves along with scalability. However, they require rate distortion optimization (RDO) to be performed many times, which involves very high computational complexity. In particular, the residual prediction mode requires the RDO process to be performed twice, which doubles the computational complexity of the normal RDO process of H.264/AVC. This complexity is reduced by an efficient architecture proposed in [5] that changes the processing order; here the prediction mode of the reference macroblock (MB) is used to predict the ca (...truncated)