Kernel Affine Projection Algorithms

EURASIP Journal on Advances in Signal Processing, Mar 2008

The combination of the famed kernel trick and the affine projection algorithms (APAs) yields powerful nonlinear extensions, collectively named here KAPA. This paper is a follow-up study of the recently introduced kernel least-mean-square (KLMS) algorithm. KAPA inherits the simplicity and online nature of KLMS while reducing its gradient noise, thereby boosting performance. More interestingly, it provides a unifying model for several neural network techniques, including kernel least-mean-square algorithms, the kernel Adaline, sliding-window kernel recursive least squares (SW-KRLS), and regularization networks. Therefore, many insights can be gained into the basic relations among them and the tradeoff between computational complexity and performance. Several simulations illustrate its wide applicability.



Weifeng Liu and José C. Príncipe
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA
Recommended by Aníbal Figueiras-Vidal

1. INTRODUCTION

A solid mathematical foundation and wide, successful applications have made kernel methods very popular. Through the famed kernel trick, many linear methods have been recast in high-dimensional reproducing kernel Hilbert spaces (RKHS) to yield more powerful nonlinear extensions, including support vector machines [1], principal component analysis [2], recursive least squares [3], the Hebbian algorithm [4], the Adaline [5], and so forth. More recently, a kernelized least-mean-square (KLMS) algorithm was proposed in [6], which implicitly creates a growing radial basis function (RBF) network with a learning strategy similar to the resource-allocating networks (RAN) proposed by Platt [7].
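To make the growing-RBF-network interpretation of KLMS concrete, the following is a minimal sketch of the standard KLMS recursion of [6] with a Gaussian kernel: each new input becomes a center and its coefficient is the step size times the a priori error. The kernel width, step size, and toy data are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel kappa(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def klms(U, d, eta=0.2, sigma=1.0):
    """Kernel LMS: the predictor at step i is f_i(u) = sum_j a_j kappa(u_j, u).
    Each step allocates a new center u(i) with coefficient a_i = eta * e(i),
    where e(i) = d(i) - f_{i-1}(u(i)) is the a priori error."""
    centers, coeffs, errors = [], [], []
    for u, di in zip(U, d):
        # Output of the current (growing) network; the empty sum is 0 at step 1.
        y = sum(a * gaussian_kernel(c, u, sigma) for a, c in zip(coeffs, centers))
        e = di - y
        centers.append(u)          # new RBF unit centered at u(i)
        coeffs.append(eta * e)     # its weight is eta * e(i)
        errors.append(e)
    return centers, coeffs, np.array(errors)

# Toy usage: learn a static nonlinearity from noisy samples.
rng = np.random.default_rng(0)
U = rng.uniform(-1, 1, size=(500, 1))
d = np.sin(3 * U[:, 0]) + 0.05 * rng.standard_normal(500)
_, _, err = klms(U, d)
print("MSE over last 100 samples:", np.mean(err[-100:] ** 2))
```

Note that the network, and hence the per-sample cost, grows linearly with the number of processed samples, which is the price paid for the online, memory-based representation.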
As an improvement, kernelized affine projection algorithms (KAPAs) are presented for the first time in this paper by reformulating the conventional affine projection algorithm (APA) [8] in general reproducing kernel Hilbert spaces. The new algorithms are online, simple, and significantly reduce the gradient noise compared with the KLMS, thus improving performance. More interestingly, the KAPA reduces naturally to the kernel least-mean-square (KLMS), sliding-window kernel recursive least squares (SW-KRLS), the kernel Adaline, and regularization networks in special cases. It therefore provides a unifying model for these existing methods and helps clarify the basic relations among them and the tradeoff between complexity and performance. Moreover, it also advances our understanding of resource-allocating networks. Exploiting the underlying linear structure of the RKHS, a brief discussion of its well-posedness is also given.

The organization of the paper is as follows. In Section 2, the affine projection algorithms are briefly reviewed. Next, in Section 3, the kernel trick is applied to formulate the nonlinear affine projection algorithms. Other related algorithms are reviewed as special cases of the KAPA in Section 4. We detail the implementation of the KAPA in Section 5. Three experiments are studied in Section 6 to support our theory. Finally, Section 7 summarizes the conclusions and future lines of research. The notation used throughout the paper is summarized in Table 1.

2. A REVIEW OF THE AFFINE PROJECTION ALGORITHMS

Let d be a zero-mean scalar-valued random variable, and let u be a zero-mean L × 1 random variable with a positive-definite covariance matrix R_u = E[u u^T]. The cross-covariance vector of d and u is denoted by r_{du} = E[d u]. The weight vector w that solves

\min_{w} E\left| d - w^{T} u \right|^{2}    (1)

is given by w_o = R_u^{-1} r_{du} [8]. Several methods that approximate w_o iteratively also exist, for example, the common gradient method

w(0) = \text{initial guess};
w(i) = w(i-1) + \eta \left[ r_{du} - R_u w(i-1) \right],    (2)

or the regularized Newton's recursion

w(0) = \text{initial guess};
w(i) = w(i-1) + \eta \left( R_u + \varepsilon I \right)^{-1} \left[ r_{du} - R_u w(i-1) \right],    (3)

where \varepsilon is a small positive regularization factor and \eta is the step size specified by the designer.

Stochastic-gradient algorithms replace the covariance matrix and the cross-covariance vector by local approximations computed directly from the data at each iteration. There are several ways to obtain such approximations; the tradeoff is among computational complexity, convergence, and steady-state behavior [8]. Assume that we have access to observations of the random variables d and u over time:

d(1), d(2), \ldots, \quad u(1), u(2), \ldots    (4)

The least-mean-square (LMS) algorithm simply uses the instantaneous values as approximations, R_u \approx u(i) u(i)^{T} and r_{du} \approx d(i) u(i). The corresponding steepest-descent recursion (2) and Newton's recursion (3) become

w(i) = w(i-1) + \eta u(i) \left[ d(i) - u(i)^{T} w(i-1) \right];
w(i) = w(i-1) + \eta u(i) \left[ u(i)^{T} u(i) + \varepsilon I \right]^{-1} \left[ d(i) - u(i)^{T} w(i-1) \right].    (5)

The affine projection algorithm, however, employs better approximations. Specifically, R_u and r_{du} are replaced by the instantaneous approximations from the (...truncated)
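To ground the recursions in (5), here is a minimal sketch of the instantaneous steepest-descent (LMS) update and the regularized Newton-type update for a linear filter. The system-identification setup, variable names, and parameter values are illustrative assumptions rather than part of the paper.

```python
import numpy as np

def lms(U, d, eta=0.05):
    """First recursion in (5): w(i) = w(i-1) + eta * u(i) * (d(i) - u(i)^T w(i-1))."""
    w = np.zeros(U.shape[1])
    for u, di in zip(U, d):
        e = di - u @ w          # a priori error
        w = w + eta * u * e     # instantaneous-gradient correction
    return w

def newton_lms(U, d, eta=0.5, eps=1e-3):
    """Second recursion in (5), with the scalar u(i)^T u(i) regularized by eps
    (the epsilon-normalized LMS form of the Newton recursion)."""
    w = np.zeros(U.shape[1])
    for u, di in zip(U, d):
        e = di - u @ w
        w = w + eta * u * e / (u @ u + eps)
    return w

# Toy system identification: d(i) = w_o^T u(i) + noise.
rng = np.random.default_rng(1)
L = 4
w_o = rng.standard_normal(L)
U = rng.standard_normal((2000, L))
d = U @ w_o + 0.01 * rng.standard_normal(2000)
print("LMS weight-error norm:       ", np.linalg.norm(lms(U, d) - w_o))
print("Newton-LMS weight-error norm:", np.linalg.norm(newton_lms(U, d) - w_o))
```

The normalized (Newton-type) update trades a few extra operations per sample for a step size that adapts to the input power, which is the same complexity/performance tradeoff the APA then extends by reusing a window of the most recent regressors.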


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1155%2F2008%2F784292.pdf

Weifeng Liu, José C. Príncipe. Kernel Affine Projection Algorithms. EURASIP Journal on Advances in Signal Processing, vol. 2008, Article ID 784292, 2008. DOI: 10.1155/2008/784292