Spatial-Aided Low-Delay Wyner-Ziv Video Coding
Hindawi Publishing Corporation
EURASIP Journal on Image and Video Processing
Volume 2009, Article ID 109057, 11 pages
doi:10.1155/2009/109057
Research Article
Bo Wu,1 Xiangyang Ji,2 Debin Zhao,3 and Wen Gao1, 4
1 Digital Media Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2 Department of Automation, Tsinghua University, Beijing 100084, China
3 Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China
4 Institute of Digital Media, School of Electronic Engineering and Computer Science, Peking University, Beijing 100871, China
Correspondence should be addressed to Debin Zhao,
Received 6 May 2008; Revised 28 October 2008; Accepted 12 January 2009
Recommended by Anthony Vetro
In distributed video coding, the side information (SI) quality plays an important role in Wyner-Ziv (WZ) frame coding. Usually,
SI is generated at the decoder by the motion-compensated interpolation (MCI) from the past and future key frames under
the assumption that the motion trajectory between the adjacent frames is translational with constant velocity. However, this
assumption is not always true and thus, the coding efficiency for WZ coding is often unsatisfactory in video with high and/or
irregular motion. This situation becomes more serious in low-delay applications since only motion-compensated extrapolation
(MCE) can be applied to yield SI. In this paper, a spatial-aided Wyner-Ziv video coding (SA-WZVC) scheme for low-delay applications is
proposed. In SA-WZVC, at the encoder, each WZ frame is coded as in existing common Wyner-Ziv video coding (WZVC)
schemes, and the auxiliary information is additionally coded with low-complexity DPCM. At the decoder, to decode a WZ frame,
the auxiliary information is decoded first, and SI is then generated with the help of this auxiliary information by
spatial-aided motion-compensated extrapolation (SA-MCE). Theoretical analysis shows that when a good tradeoff between
auxiliary information coding and WZ frame coding is achieved, SA-WZVC is able to achieve better rate-distortion performance
than the conventional MCE-based WZVC without auxiliary information. Experimental results also demonstrate that SA-WZVC
can efficiently improve the coding performance of WZVC in low-delay applications.
Copyright © 2009 Bo Wu et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
Recently, new applications such as wireless video
surveillance and wireless sensor networks have been emerging. In
these applications, a lightweight encoder is required because the
computation and memory resources on sensors are scarce.
Furthermore, these systems typically have a large
number of encoders and only one or a few decoders. As a
result, conventional hybrid video coding architectures
such as H.26x and MPEG-x are no longer applicable,
because they are intrinsically built for a one-to-many application model with
one high-complexity encoder and many low-complexity
decoders. In theory, distributed source coding (DSC) can
provide an ideal solution to address this problem. The
Slepian-Wolf theory shows that under certain conditions,
even if the correlated sources are encoded separately and
decoded jointly, the coding performance can be as good
as joint encoding and decoding [1]. Later, Wyner and Ziv
extended this theory to lossy source coding with side
information (SI) at the decoder [2], which is more suitable
for practical video coding. Many researchers have applied
practical WZ coding techniques to video coding [3–5].
One advantage of WZ coding is that the computational
complexity of the encoder is low, as in the schemes
proposed in [4, 5]. In these schemes, the motion correlation
does not need to be exploited at the encoder, and the frames
are compressed only by a low-complexity channel coding
method, such as turbo codes. At the WZ decoder, in contrast,
motion estimation with high computational complexity is
applied to exploit the temporal correlation in SI generation.
Subsequently, the errors between the original information
and the SI are corrected by using the received parity bits
transmitted from the encoder. Another advantage of WZVC
is robustness: the WZVC system is drift-free because
no motion estimation or motion-compensated prediction is
performed at the encoder. A WZVC system can also be regarded as one of
the joint source-channel coding systems [6] since it can be
used as a systematic lossy forward error protection method
for conventional video coding.
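For reference, the "certain conditions" of the Slepian-Wolf result cited above are usually stated as a rate region. The following is a standard textbook statement, not a formula taken from this paper; the symbols X and Y (two correlated sources), R_X and R_Y (their coding rates), H (entropy), D (distortion), R_WZ (the Wyner-Ziv rate-distortion function), and R_{X|Y} (the conditional rate-distortion function when Y is available at both encoder and decoder) are notation assumed here for illustration.

```latex
% Slepian-Wolf: lossless separate encoding, joint decoding
\begin{align}
R_X &\ge H(X \mid Y), \\
R_Y &\ge H(Y \mid X), \\
R_X + R_Y &\ge H(X, Y).
\end{align}
% Wyner-Ziv: lossy coding of X with side information Y at the decoder only
\begin{equation}
R_{\mathrm{WZ}}(D) \ge R_{X \mid Y}(D),
\end{equation}
% with equality for jointly Gaussian sources under mean-squared-error distortion.
```

The sum-rate bound H(X, Y) is the same as for joint encoding, which is the sense in which separate encoding loses nothing in the lossless case.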
In [3], two typical SI generation approaches are introduced: motion-compensated interpolation (MCI)
and motion-compensated extrapolation (MCE). For MCI, SI for the
current frame is generated by performing motion compensation on the adjacent previously and subsequently decoded
pictures. However, in low-delay applications, the temporally
subsequent pictures cannot be used as references to generate
SI. Therefore, MCE is adopted to generate SI in low-delay
applications: the motion between the decoded
frames at time t − 2 and time t − 1 is estimated, and the estimated
motion is used to extrapolate the SI at time t. However,
the performance of MCE-based low-delay WZVC is often
unsatisfactory because the motion field cannot be well estimated
[3]. This situation can be improved by the auxiliary
information-aided method, in which partial information of
the current frame is used as auxiliary information to
help the decoder improve the accuracy of the motion field
for MCE. In [7], one frame is partitioned into intra- and
WZ macroblocks by a pattern similar to the H.264/AVC
FMO grouping method. The subset of intra-macroblocks is
employed as auxiliary information and helps to estimate
the SI with a temporal concealment method. The auxiliary
information-aided method can also be used to improve the
quality of SI in the case of MCI. In [5], quantized
DCT-domain coefficients, named hash bits, serve
as the auxiliary information. In [8], a coarse representation
of the frame is used to assist motion estimation at
the decoder. For the above auxiliary information-aided WZ
coding schemes, significant performance improvements
can be achieved.
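The MCE procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scheme of [3] or of this paper: it uses full-search block matching with a sum-of-absolute-differences (SAD) criterion, assumes frame dimensions are multiples of the block size, and makes the common simplification that each block at time t reuses the motion vector of the co-located block at time t − 1. The function names `estimate_motion` and `extrapolate_si` and the parameters `block` and `search` are hypothetical.

```python
import numpy as np


def estimate_motion(prev2, prev1, block=8, search=4):
    """For each block of prev1 (time t-1), find the displacement (dy, dx)
    into prev2 (time t-2) that minimises the SAD (full search)."""
    h, w = prev1.shape
    mvs = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            cur = prev1[y:y + block, x:x + block].astype(np.int64)
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    # Skip candidates that fall outside the frame.
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    cand = prev2[yy:yy + block, xx:xx + block].astype(np.int64)
                    sad = np.abs(cur - cand).sum()
                    if best is None or sad < best:
                        best = sad
                        mvs[by, bx] = (dy, dx)
    return mvs


def extrapolate_si(prev1, mvs, block=8):
    """Build side information for time t: each block copies the block of
    prev1 displaced by the vector of its co-located block, i.e. the motion
    from t-2 to t-1 is assumed to continue unchanged from t-1 to t."""
    h, w = prev1.shape
    si = prev1.copy()
    for by in range(mvs.shape[0]):
        for bx in range(mvs.shape[1]):
            y, x = by * block, bx * block
            dy, dx = mvs[by, bx]
            # Clamp the source block so it stays inside the frame.
            yy = min(max(y + dy, 0), h - block)
            xx = min(max(x + dx, 0), w - block)
            si[y:y + block, x:x + block] = prev1[yy:yy + block, xx:xx + block]
    return si
```

Calling `extrapolate_si(prev1, estimate_motion(prev2, prev1))` yields an SI frame for time t. The sketch is accurate only when the motion really is translational with constant velocity; when that assumption fails, the extrapolated blocks drift away from the true frame, which is the weakness that auxiliary information is meant to address.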
The discrete wavelet transform (DWT) is highly desirable for video coding due to its intrinsic multiresolution
structure and energy compaction property. For hybrid video
coding, DWT has been applied in many state-of-the-art coding
schemes to obtain spatial scalability, such as
[9, 10]. Moreover, in the DVC paradigm, the DWT has also been
widely used. In [11], the author explored the high-order statistical correlation among the transform coefficients by using
the DWT and SPIHT algorithms. In [12], hyperspectral images
from neighboring frequency bands are closely cor (...truncated)