Facial Image Compression Based on Structured Codebooks in Overcomplete Domain

Mar 2006

We advocate a facial image compression technique within the distributed source coding framework. The novelty of the proposed approach is twofold: image compression is considered from the position of source coding with side information and, contrary to existing scenarios where the side information is given explicitly, the side information is created based on a deterministic approximation of the local image features. We consider an image in the overcomplete transform domain as a realization of a random source with a structured codebook of symbols, where each symbol represents a particular edge shape. Due to the partial availability of the side information at both encoder and decoder, we treat our problem as a modification of the Berger-Flynn-Gray problem and investigate a possible gain over the solutions when side information is either unavailable or available at the decoder. Finally, the paper presents a practical image compression algorithm for facial images based on our concept that demonstrates superior performance in the very-low-bit-rate regime.

Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 69042, Pages 1–11
DOI 10.1155/ASP/2006/69042

J. E. Vila-Forcén, S. Voloshynovskiy, O. Koval, and T. Pun
Stochastic Image Processing Group, CUI, University of Geneva, 24 rue du Général-Dufour, Geneva 1211, Switzerland

Received 31 July 2004; Revised 16 June 2005; Accepted 27 June 2005

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION

The urgent demand for efficient image representation is recognized by industry and the research community. The need has grown considerably with the new requirements of many authentication documents, such as passports, ID cards, and visas, as well as the recently extended functionality of wireless communication devices.
Personalization of documents, tickets, or even entry passes is often required in authentication or identification protocols. In most cases, classical compression techniques developed for generic applications are not suitable for these purposes.

Wavelet-based [1, 2] lossy image compression techniques [3–6] have proved to be the most efficient from the rate-distortion point of view for rates in the range of 0.2–1 bits per pixel (bpp). The superior performance of this class of algorithms is explained by the decorrelation and energy compaction properties of the wavelet transform and by efficient adaptive models, both interband (zero trees [5]) and intraband (estimation quantization (EQ) [7, 8]), that describe the data in the wavelet subbands. Recent results in wavelet-based image compression show that a modest performance improvement (up to 0.3 dB in terms of peak signal-to-noise ratio (PSNR)) can be achieved either by taking into account the nonorthogonality of the transform [9] or by using more complex higher-order context models of wavelet coefficients [10].

For years, a standard benchmark database of images has been used to evaluate wavelet-based compression algorithms. It includes several 512 × 512 grayscale test images (such as Lena, Barbara, and Goldhill), and verification was performed for rates of 0.2–1 bpp. In some applications, which involve person authentication data such as photo images or fingerprint images, the operational conditions might be different. In this case, especially for strong compression (below 0.15 bpp), the image quality produced by state-of-the-art algorithms is not satisfactory (Figure 1). Therefore, applications of this kind need more advanced techniques to satisfy the fidelity constraints.

In this paper, we address the problem of enhancing classical wavelet-based image compression by using side information within a framework of distributed coding of correlated sources.
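To make the rate and quality figures quoted above concrete, the following minimal sketch computes the bit budget implied by a rate in bpp and the PSNR between two images. The function names (`bit_budget`, `psnr`, `mse_ratio_for_gain`) are illustrative and do not come from the paper; the PSNR formula is the standard definition for 8-bit data with a peak value of 255, which is assumed here.

```python
import math


def bit_budget(width, height, bpp):
    """Total number of bits available for an image coded at `bpp` bits per pixel."""
    return width * height * bpp


def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(original) != len(decoded):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE must shrink to gain `gain_db` dB of PSNR."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # Bit budget of a 512 x 512 benchmark image at the low end of the usual range.
    print(bit_budget(512, 512, 0.2) / 8, "bytes at 0.2 bpp")
    # Bit budget of a 256 x 256 image in the strong-compression regime.
    print(bit_budget(256, 256, 0.15) / 8, "bytes at 0.15 bpp")
    # MSE reduction factor that corresponds to a 0.3 dB PSNR improvement.
    print(mse_ratio_for_gain(0.3))
```

Under these definitions, a PSNR gain of a fraction of a decibel corresponds to only a few percent reduction in mean squared error, which is why such improvements are described as modest.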
Recently, it was shown in practice that a significant performance gain can be achieved when the side information is available at the decoder while the encoder has no access to it [11]. Using side information from an auxiliary analog additive white Gaussian noise (AWGN) channel, in the form of a noisy copy of the input image at the decoder, a PSNR enhancement in the range of 1–2 dB was reported, depending on the test image and the compression rate. Note that the performance of this scheme strongly depends on the state of the auxiliary channel, which should be known in advance at the encoding stage. Moreover, it is assumed that the noisy copy of the original image is directly available at the decoder. This situation is typical for distributed coding in remote sensing applications or can be simulated, as in the case of analog and digital television simulcast [11]. In the case of single-source compression, the side information is not directly available at the decoder.

Figure 1: (a) 256 × 256 8-bit test image Slava. Results of compression with rate 0.071 bits per pixel (bpp) using (b) JPEG2000 standard software (PSNR is 25.09 dB) and (c) state-of-the-art EQ coder (PSNR is 26.36 dB).

Figure 2: Slepian-Wolf coding (sources {X, Y} with joint distribution p(x, y), separate encoders for X and Y with N R_X and N R_Y bits, joint decoder).

Figure 3: Lossy source coding system without side information (source X, encoder, index i ∈ {1, 2, ..., 2^{N R_X}}, decoder).

The main goal of this paper is to develop a concept of single-source compression within a distributed coding framework using virtually created side information. This concept is based on an accurate approximation of the source data using a structured codebook, which is shared by the encoder and decoder, and on the communication of the residual approximation term within the classical wavelet-based compression paradigm.

The paper is organized as follows. In Section 2, fundamentals of source coding with side information are presented. In Section 3, an approach for single-source distributed lossy coding is introduced. A practical algorithm for very-low-bit-rate compression of passport photo images is developed in Section 4. Section 5 contains the experimental results and Section 6 concludes the paper.

Notation 1. Scalar random variables are denoted by capital letters X, and bold capital letters X denote vector random variables; lowercase letters x and bold x are reserved to denote the realizations of scalar and vector random variables, respectively. The superscript N is used to denote N-length vectors x^N = x = {x_1, x_2, ..., x_N}, where the ith element is denoted as x_i. X ∼ (...truncated) between the random variables X and Y, and the conditional mutual information between the random variables X and Y given the random variable Z, respectively. R_X denotes the rate of communication for the random variable X. Calligraphic font 𝒳 is used to indicate sets, X ∈ 𝒳, and |𝒳| indicates the cardinality of a set. R+ is used to represent the set of positive real numbers.
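The concept stated in the introduction (approximate the source with a codebook shared by encoder and decoder, then communicate only the residual) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the codebook holds three hypothetical one-dimensional step-edge shapes rather than the paper's overcomplete-domain edge symbols, matching uses squared error, and a uniform scalar quantizer stands in for the wavelet-based residual coder. It is not the algorithm developed in Section 4.

```python
def nearest_codeword(block, codebook):
    """Return the index of the codebook entry closest to `block` in squared error."""
    def sq_err(k):
        return sum((b - c) ** 2 for b, c in zip(block, codebook[k]))
    return min(range(len(codebook)), key=sq_err)


def encode(block, codebook, step):
    """Encoder: pick a shared codeword, then quantize the residual uniformly."""
    idx = nearest_codeword(block, codebook)
    residual = [b - c for b, c in zip(block, codebook[idx])]
    quantized = [round(r / step) for r in residual]  # stand-in for a real residual coder
    return idx, quantized


def decode(idx, quantized, codebook, step):
    """Decoder: look up the same codeword and add the dequantized residual."""
    return [c + step * q for c, q in zip(codebook[idx], quantized)]


# Hypothetical shared codebook of 1-D edge shapes (flat, rising edge, falling edge).
CODEBOOK = [
    [0, 0, 0, 0],
    [0, 0, 100, 100],
    [100, 100, 0, 0],
]

if __name__ == "__main__":
    block = [3, -2, 97, 104]
    idx, q = encode(block, CODEBOOK, step=4)
    print("codeword index:", idx)
    print("quantized residual:", q)
    print("reconstruction:", decode(idx, q, CODEBOOK, step=4))
```

Because the codebook is known to both sides, only the codeword index and the small quantized residual need to be transmitted; the better the codeword approximates the block, the cheaper the residual is to describe.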


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1155%2FASP%2F2006%2F69042.pdf
Article home page: https://link.springer.com/article/10.1155/ASP/2006/69042

J. E. Vila-Forcén, S. Voloshynovskiy. Facial Image Compression Based on Structured Codebooks in Overcomplete Domain, 2006, pp. 069042, Volume 2006, Issue 1, DOI: 10.1155/ASP/2006/69042