Classification by Principal Component Regression in the Real and Hypercomplex Domains



Arabian Journal for Science and Engineering
https://doi.org/10.1007/s13369-022-07460-7

RESEARCH ARTICLE - COMPUTER ENGINEERING AND COMPUTER SCIENCE

Moumen T. El-Melegy¹ · Aliaa T. Kamal² · Khaled F. Hussain² · H. M. El-Hawary³

Received: 22 April 2022 / Accepted: 16 October 2022
© The Author(s) 2022

Abstract

Linear regression is a simple and widely used machine learning algorithm. It is a statistical approach for modeling the relationship between a scalar response variable and one or more explanatory variables. In this paper, a classification by principal component regression (CbPCR) strategy is proposed. This strategy performs regression of each data class in terms of its principal components. The CbPCR formulation leads to a new formulation of the Linear Regression Classification (LRC) problem that preserves the key information of the data classes while providing more compact closed-form solutions. For image classification, the strategy is also extended to the 4D hypercomplex domains to take into account the color information of the image. Quaternion and reduced biquaternion CbPCR strategies are proposed by representing each channel of the color image as one of the imaginary parts of a quaternion or reduced biquaternion number. Experiments on two color face recognition benchmark databases show that the proposed methods achieve better accuracies, by a margin of about 3%, than the original LRC and similar methods.

Keywords Linear regression classification · Principal component analysis · Hypercomplex numbers · Face recognition

1 Introduction

Linear regression is a simple and widely used machine learning algorithm that has received a lot of attention in many fields. In the image recognition area, Naseem et al. [1] proposed a Linear Regression Classification (LRC) algorithm that represents each class's training images independently, assuming a linear regression relationship.
The algorithm applies the least squares method to find the regression coefficients of each class and then selects the class label that gives the smallest reconstruction error. To enhance its performance, Huang and Yang [2] and Zhu et al. [3] proposed applying principal component analysis (PCA) [4] to extract the vital information from the images and reduce the dimension of the feature vectors. The original data are then transformed into a low-dimensional subspace. Finally, LRC is performed on the projected data.

Corresponding author: Moumen T. El-Melegy
¹ Electrical Engineering Department, Faculty of Engineering, Assiut University, Assiut, Egypt
² Computer Science Department, Faculty of Computers and Information, Assiut University, Assiut, Egypt
³ Mathematics Department, Faculty of Science, Assiut University, Assiut, Egypt

This paper contributes to this literature by proposing a new strategy for classification that performs regression of the data in terms of its principal components. A novel formulation of the LRC problem, called Classification by Principal Component Regression (CbPCR), is presented. Moreover, a novel closed-form solution based on this formulation is derived. This classification strategy preserves the key information of each data class and removes redundant and correlated details, while yielding a more compact solution. Several experiments on public face recognition benchmark databases are reported to provide evidence that the proposed strategy outperforms the original LRC method [1] and its recent variants [2, 3].

The proposed strategy is also extended to color images. PCA techniques [4–6] and the existing methods [1–3] work in principle on grayscale, single-channel images. They may operate on color images only after converting them to grayscale, thus losing the important color information. Some methods (e.g., [7]) apply LRC to every color channel separately, then select the class having the smallest total prediction error over all color channels.
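The LRC decision rule summarized above (a least-squares fit per class, then the class with the smallest reconstruction error) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of NumPy's `lstsq`, and the choice to sum per-channel residual norms in the channel-wise variant are assumptions made only for illustration.

```python
import numpy as np


def lrc_residuals(class_matrices, probe):
    """Reconstruction error of `probe` against each class.

    class_matrices: list of (d, n_i) arrays; the columns of each array are
                    the vectorized training images of one class.
    probe:          (d,) vectorized test image.
    """
    residuals = []
    for X in class_matrices:
        # Least-squares regression coefficients of the probe on this class.
        beta, *_ = np.linalg.lstsq(X, probe, rcond=None)
        # Distance between the probe and its reconstruction from this class.
        residuals.append(np.linalg.norm(probe - X @ beta))
    return np.array(residuals)


def lrc_predict(class_matrices, probe):
    """Label (index) of the class with the smallest reconstruction error."""
    return int(np.argmin(lrc_residuals(class_matrices, probe)))


def lrc_predict_per_channel(channel_class_matrices, channel_probes):
    """Channel-wise variant: run LRC on each color channel separately and
    pick the class with the smallest total error over all channels.
    Summing residual norms is an assumption; other totals are possible."""
    total = sum(
        lrc_residuals(mats, probe)
        for mats, probe in zip(channel_class_matrices, channel_probes)
    )
    return int(np.argmin(total))


# Synthetic demo (hypothetical data): the probe lies in the span of class 2.
rng = np.random.default_rng(0)
classes = [rng.normal(size=(50, 4)) for _ in range(3)]
probe = classes[2] @ np.array([0.5, -1.0, 2.0, 0.25])
print(lrc_predict(classes, probe))
```

The PCA-based variants [2, 3] would apply the same rule after projecting both the class matrices and the probe onto a low-dimensional subspace.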
Unlike those methods, and inspired by several studies [8–14], the current paper proposes to use 4D hypercomplex numbers to represent color images. This allows the color components of each image pixel to be treated as one entity, thus accounting for the correlation between the color components. Among the studies [8–14], two address the LRC problem. Zou et al. [11] propose a quaternion LRC (QLRC) method that extends the classical LRC algorithm to quaternion space. QLRC converts the quaternion quantities to real ones to circumvent the use of quaternion derivatives. The recent paper [14] develops closed-form solutions for QLRC from the principles of quaternion calculus.

In addition, the current paper proposes novel solutions based on reduced biquaternions (RBs), another hypercomplex space consisting of one real component and three imaginary ones. Besides having a commutative algebra, in contrast to quaternions, RBs may be represented using the so-called e1-e2 form [15], which can lead to more time-efficient computation. The proposed CbPCR formulation is extended to both the quaternion and RB domains to process color images. To that end, the current paper exploits an efficient algorithm derived by the authors in [10] for computing the principal components (eigenvectors) of an RB matrix by casting it as an x + y selection problem [16, 17]. The experimental results on public benchmark databases for color face recognition demonstrate that the new quaternion and RB-based CbPCR algorithms perform better than competing algorithms [7, 11, 14].

The rest of this paper is organized as follows: Sect. 2 gives a brief history of the use of 4D hypercomplex domains, namely quaternions and reduced biquaternions, in color image processing, with a focus on color face recognition. Section 3 gives some notation and formally defines the problem of concern here. Section 4 briefly reviews the quaternion and RB domains.
Section 5 describes the proposed CbPCR method and its extension to the quaternion and RB domains. The classification results on two benchmark color face databases are reported in Sect. 6. Section 7 concludes the paper.

2 Related Work

This section briefly reviews the use of 4D hypercomplex domains, namely quaternions and reduced biquaternions, in representing color images, with a focus on their application to color face recognition. In 1996, Sangwine [18] introduced the idea of using 4D hypercomplex numbers (quaternions) in color image processing by encoding the pixel's color components into the three imaginary parts of a quaternion number. Bihan and Sangwine [8] and Pei et al. [19] proposed a quaternion PCA method, which extracts more informative and robust features from the color image than conventional PCA. In 2011, Sun et al. [20] proposed 2DPCA and bi-dimensional PCA (BDPCA) based on the quaternion representation. Also, Javier et al. [21] proposed an independent component analysis algorithm based on quaternions. Years later, Jia et al. [22] presented a 2DPCA (...truncated)