Classification by Principal Component Regression in the Real and Hypercomplex Domains
Arabian Journal for Science and Engineering
https://doi.org/10.1007/s13369-022-07460-7
RESEARCH ARTICLE-COMPUTER ENGINEERING AND COMPUTER SCIENCE
Moumen T. El-Melegy1 · Aliaa T. Kamal2 · Khaled F. Hussain2 · H. M. El-Hawary3
Received: 22 April 2022 / Accepted: 16 October 2022
© The Author(s) 2022
Abstract
Linear regression is a simple and widely used machine learning algorithm. It is a statistical approach for modeling the
relationship between a scalar response variable and one or more explanatory variables. In this paper, a classification by principal component
regression (CbPCR) strategy is proposed. This strategy depends on performing regression of each data class in terms of its
principal components. This CbPCR formulation leads to a new formulation of the Linear Regression Classification (LRC)
problem that preserves the key information of the data classes while providing more compact closed-form solutions. For
image classification, this strategy is also extended to the 4D hypercomplex domains to take into account the color
information of the image. Quaternion and reduced biquaternion CbPCR strategies are proposed by representing each channel
of the color image as one of the imaginary parts of a quaternion or reduced biquaternion number. Experiments on two color
face recognition benchmark databases show that the proposed methods achieve better accuracy, by a margin of about 3%,
than the original LRC and similar methods.
Keywords Linear regression classification · Principal component analysis · Hypercomplex numbers · Face recognition
1 Introduction
Linear regression is a simple and widely used machine learning algorithm that has received a lot of attention in many
fields. In the image recognition area, Naseem et al. [1] proposed a Linear Regression Classification (LRC) algorithm
that represents each class’s training images independently
assuming a linear regression relationship. The algorithm
applies the least squares method to find the
regression coefficients and then selects the class label that gives
the smallest reconstruction error. To enhance its performance, Huang and Yang [2] and Zhu et al. [3] proposed to
apply principal component analysis (PCA) [4] to extract the
vital information from images and reduce the feature vector dimensions. Then, the original data are transformed into
a low-dimensional subspace. Finally, LRC is performed on
the projected data.
Corresponding author: Moumen T. El-Melegy
1 Electrical Engineering Department, Faculty of Engineering, Assiut University, Assiut, Egypt
2 Computer Science Department, Faculty of Computers and Information, Assiut University, Assiut, Egypt
3 Mathematics Department, Faculty of Science, Assiut University, Assiut, Egypt
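The LRC decision rule and its PCA-preprocessed variant described above can be illustrated with a minimal NumPy sketch. It is not the CbPCR formulation proposed in this paper, and it is not taken from [1–3]. The function names (lrc_predict, fit_pca, project), the centering step, the SVD-based PCA, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def lrc_predict(class_matrices, y):
    """Return (label, residuals) for a probe vector y.

    class_matrices maps a label to an array of shape (d, n_c) whose columns
    are the vectorized training samples of that class.
    """
    residuals = {}
    for label, X in class_matrices.items():
        # Least-squares regression of the probe on this class's samples.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        residuals[label] = float(np.linalg.norm(y - X @ beta))
    # The predicted class is the one with the smallest reconstruction error.
    return min(residuals, key=residuals.get), residuals

def fit_pca(all_train, k):
    """Mean vector and top-k principal axes of the pooled training columns."""
    mean = all_train.mean(axis=1)
    U, _, _ = np.linalg.svd(all_train - mean[:, None], full_matrices=False)
    return mean, U[:, :k]

def project(data, mean, axes):
    """Project a (d,) vector or the columns of a (d, n) matrix onto the axes."""
    centered = data - (mean[:, None] if data.ndim == 2 else mean)
    return axes.T @ centered

# Synthetic stand-in for two classes of 50-dimensional "images" (assumption).
rng = np.random.default_rng(0)
train = {"A": rng.normal(size=(50, 5)), "B": rng.normal(size=(50, 5))}
weights = np.array([0.4, 0.3, 0.1, 0.1, 0.1])   # weights sum to 1
probe = train["A"] @ weights                     # lies in the span of class A

# Plain LRC on the raw features.
label, res = lrc_predict(train, probe)

# PCA first, then LRC on the projected data.
mean, axes = fit_pca(np.hstack(list(train.values())), k=8)
low = {c: project(X, mean, axes) for c, X in train.items()}
label_pca, res_pca = lrc_predict(low, project(probe, mean, axes))
```

In this toy setting the probe is an exact combination of class A samples, so both variants assign it to class A. The weights are chosen to sum to 1 so that the probe stays in the span of class A after centering.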
This paper contributes to this literature by proposing a
new strategy for classification by performing regression of
data in terms of its principal components. A novel formulation of the LRC problem, called Classification by Principal
Component Regression (CbPCR), is presented. Moreover,
a novel closed-form solution based on this formulation is
derived. This classification strategy preserves key data class
information and removes redundant and correlated details,
while yielding a more compact solution. Several experiments
on public face recognition benchmark databases are reported
to provide evidence that the proposed strategy outperforms
the original LRC method [1] and its recent variants [2, 3].
The proposed strategy is also extended to color images.
PCA techniques [4–6] and the existing methods [1–3] work
in principle on grayscale, single-channel images. They may
operate on color images after converting them to grayscale
images, thus losing the important color information. Some
methods (e.g., [7]) apply LRC to every color channel
separately, then select the class having the smallest total prediction error over all color channels. Unlike those methods,
inspired by several studies [8–14], the current paper proposes
to use 4D hypercomplex numbers to represent color images.
This allows treating the color components of each image pixel
as one entity, thus taking into account the correlation between color
components. Among the studies [8–14], two address the LRC
problem. Zou et al. [11] propose a quaternion LRC (QLRC)
method that extends the classical LRC algorithm to quaternion space. QLRC converts the quaternion quantities to real
ones to circumvent using quaternion derivatives. The recent
paper [14] develops closed-form solutions for QLRC from
the principles of quaternion calculus. In addition, the current
paper proposes novel solutions based on reduced biquaternions (RBs), another hypercomplex space consisting of one
real component and three imaginary ones. In addition to having
a commutative algebra (in contrast to quaternions), RBs may
be represented using the so-called e1-e2 form [15], which can
lead to more time-efficient computation.
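The contrast drawn above between quaternions and RBs can be made concrete with a small pure-Python sketch. It assumes the commonly used RB multiplication rules i^2 = k^2 = -1, j^2 = +1, ij = ji = k, and the idempotent basis e1 = (1 + j)/2, e2 = (1 - j)/2; the conventions adopted in Sect. 4 and in [15] may differ in detail. All function names are hypothetical.

```python
def encode_rgb(r, g, b):
    """Pure hypercomplex pixel: zero real part, RGB in the imaginary parts."""
    return (0.0, float(r), float(g), float(b))

def quat_mul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def _split(p):
    """Write an RB number a + bi + cj + dk as (a + bi) + (c + di) j."""
    return complex(p[0], p[1]), complex(p[2], p[3])

def rb_mul(p, q):
    """RB product computed directly from the rule j^2 = +1."""
    pa, pb = _split(p)
    qa, qb = _split(q)
    ra = pa * qa + pb * qb
    rb = pa * qb + pb * qa
    return (ra.real, ra.imag, rb.real, rb.imag)

def rb_mul_e1e2(p, q):
    """Same product via the e1-e2 form: two independent complex products."""
    pa, pb = _split(p)
    qa, qb = _split(q)
    s1 = (pa + pb) * (qa + qb)   # coefficient of e1
    s2 = (pa - pb) * (qa - qb)   # coefficient of e2
    ra, rb = (s1 + s2) / 2, (s1 - s2) / 2
    return (ra.real, ra.imag, rb.real, rb.imag)

p = encode_rgb(1, 2, 3)
q = encode_rgb(4, 5, 6)
quat_pq, quat_qp = quat_mul(p, q), quat_mul(q, p)   # differ: not commutative
rb_pq, rb_qp = rb_mul(p, q), rb_mul(q, p)           # equal: commutative
rb_fast = rb_mul_e1e2(p, q)                         # equal to rb_pq
```

In the e1-e2 form each RB product costs two complex multiplications instead of four, which is one way to read the efficiency remark above.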
The proposed CbPCR formulation is extended to both the
quaternion and RB domains to process color images. To that
end, the current paper exploits an efficient algorithm derived
by the authors in [10] for computing the principal components (eigenvectors) of an RB matrix by casting it as an
x + y selection problem [16, 17]. The experimental results
on public benchmark databases for color face recognition
show that the new quaternion- and RB-based CbPCR algorithms
outperform competing algorithms
[7, 11, 14].
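For context, the channel-wise baseline mentioned earlier in this section (applying LRC to each color channel separately and selecting the class with the smallest total prediction error) might be sketched as follows. This is an illustrative reading of that description, not the implementation of [7]: the function names, the use of squared residuals as the per-channel error, and the synthetic data are assumptions.

```python
import numpy as np

def channel_error(X, y):
    """Squared least-squares residual of probe channel y regressed on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

def channelwise_lrc(train, probe):
    """Classify a color probe by summing per-channel LRC errors.

    train maps a label to an array of shape (3, d, n_c): for each color
    channel, the columns are that class's vectorized training images.
    probe has shape (3, d).
    """
    totals = {}
    for label, X in train.items():
        totals[label] = sum(
            channel_error(X[ch], probe[ch]) for ch in range(probe.shape[0])
        )
    # The class with the smallest total error over all channels wins.
    return min(totals, key=totals.get), totals

# Synthetic three-channel data (assumption): two classes, 4 images each.
rng = np.random.default_rng(1)
train_rgb = {"A": rng.normal(size=(3, 40, 4)), "B": rng.normal(size=(3, 40, 4))}
w = np.array([0.5, 0.2, 0.2, 0.1])
probe_rgb = np.stack([train_rgb["A"][ch] @ w for ch in range(3)])

label_rgb, totals_rgb = channelwise_lrc(train_rgb, probe_rgb)
```

Each channel is regressed independently here, so correlations between the color components play no role in the decision. That is the limitation the hypercomplex representation is meant to address.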
The rest of this paper is organized as follows: Sect. 2 gives
a brief history of using 4D hypercomplex domains, namely
quaternions and reduced biquaternions, in color image processing, with a focus on color face recognition. Section 3 introduces
the notation and formally defines the problem addressed here. Section 4 briefly reviews the quaternion and RB
domains. Section 5 describes the proposed CbPCR method
and its extension to the quaternion and the RB domains. The
classification results on two benchmark color face databases
are reported in Sect. 6. Section 7 concludes the paper.
2 Related Work
This section briefly reviews the use of 4D hypercomplex
domains, namely quaternions and reduced biquaternions, in
representing color images, with a focus on their application to
color face recognition.
In 1996, Sangwine [18] introduced the idea of using 4D
hypercomplex numbers (quaternions) in color image
processing by encoding the pixel’s color components into
the three imaginary parts of a quaternion number. Bihan
and Sangwine [8] and Pei et al. [19] proposed a Quaternion
PCA method, which extracts more informative and robust
features from the color image than conventional PCA. In
2011, Sun et al. [20] proposed 2DPCA and bi-dimensional
PCA (BDPCA) based on quaternion representation. Also,
Javier et al. [21] proposed an independent component analysis algorithm based on quaternions. Years later, Jia et al. [22]
presented a 2DPCA (...truncated)