Intuitionistic Fuzzy Laplacian Twin Support Vector Machine for Semi-supervised Classification
Journal of the Operations Research Society of China
https://doi.org/10.1007/s40305-021-00354-9
Jia-Bin Zhou^1 · Yan-Qin Bai^1 · Yan-Ru Guo^1 · Hai-Xiang Lin^2
Received: 4 November 2020 / Revised: 16 April 2021 / Accepted: 19 May 2021
© The Author(s) 2021
Abstract
In general, data contain noise that comes from faulty instruments, flawed measurements or faulty communication. Learning with data in the context of classification or
regression is inevitably affected by noise in the data. In order to remove or greatly
reduce the impact of noise, we combine the ideas of fuzzy membership functions
and the Laplacian twin support vector machine (Lap-TSVM). A formulation of the
linear intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) is
presented. Moreover, we extend the linear IFLap-TSVM to the nonlinear case by means of a kernel function. The proposed IFLap-TSVM reduces the negative impact of noise and
outliers by using fuzzy membership functions, and it is a more accurate and reasonable classifier because it uses the geometric distribution information of labeled and unlabeled data
based on manifold regularization. Experiments with constructed artificial datasets,
several UCI benchmark datasets and the MNIST dataset show that the IFLap-TSVM has
better classification accuracy than the state-of-the-art twin support vector machine
(TSVM), intuitionistic fuzzy twin support vector machine (IFTSVM) and Lap-TSVM.
This work was supported by the National Natural Science Foundation of China (No. 11771275). The
second author acknowledges the partial support of the Dutch Research Council (No. 040.11.724).
Corresponding author: Yan-Qin Bai
^1 Department of Mathematics, Shanghai University, Shanghai 200444, China
^2 Delft Institute of Applied Mathematics, Delft University of Technology, Delft 2600GA, The Netherlands
Keywords Twin support vector machine · Semi-supervised classification ·
Intuitionistic fuzzy · Manifold regularization · Noisy data
Mathematics Subject Classification 68T99 · 90C20
1 Introduction
Support vector machine (SVM) was proposed in detail by Vapnik et al. [1]. The
goal of SVM is to find an optimal hyperplane that separates the labeled data points
into two classes. Because of its excellent performance in text classification tasks [2],
it soon became a mainstream technique in machine learning. At present, SVM and
its variants have been successfully applied in many fields such as face recognition [3],
financial distress prediction [4], regression [5], traffic flow prediction [6], medicine [7]
and more. Proximal support vector machine (PSVM) [8,9] is derived from SVM;
it aims to find two parallel hyperplanes so that each plane is as close as possible to one of
the two classes and as far as possible from the other. Furthermore, in order to
simplify the constraints, the generalized eigenvalue proximal support vector machine
(GEPSVM) [10] was proposed. The main idea of GEPSVM is to replace the two parallel
hyperplanes with two nonparallel ones. Building on this concept, Jayadeva et al.
[11] proposed the well-known twin support vector machine (TSVM). Unlike the single large
quadratic programming problem (QPP) solved by the traditional SVM, TSVM solves
a pair of smaller QPPs. The constraints of each QPP involve only the
data points of one of the two classes. Therefore, TSVM not only keeps the advantages
of SVM, but also trains approximately four times faster than SVM. Based on TSVM, Shao et al.
[12] proposed an imbalanced weighted Lagrangian twin support vector machine
(WLTSVM) for imbalanced data classification. Other extensions and applications
of TSVM can be found in [13,14].
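
For reference, the pair of QPPs solved by the linear TSVM is commonly written as below. The notation is an assumption introduced here for illustration and is not taken from the text above: the rows of A and B hold the points of the positive and negative class, e_1 and e_2 are vectors of ones of matching length, c_1, c_2 > 0 are penalty parameters, and \xi, \eta are slack vectors.

```latex
\begin{aligned}
\min_{w_1,\, b_1,\, \xi}\quad
  & \tfrac{1}{2}\,\lVert A w_1 + e_1 b_1 \rVert^2 + c_1\, e_2^{\top} \xi \\
\text{s.t.}\quad
  & -(B w_1 + e_2 b_1) + \xi \ge e_2, \qquad \xi \ge 0, \\[6pt]
\min_{w_2,\, b_2,\, \eta}\quad
  & \tfrac{1}{2}\,\lVert B w_2 + e_2 b_2 \rVert^2 + c_2\, e_1^{\top} \eta \\
\text{s.t.}\quad
  & (A w_2 + e_1 b_2) + \eta \ge e_1, \qquad \eta \ge 0.
\end{aligned}
```

Each problem carries constraints from only one class. As a rough estimate, if solving a QPP with m constraints costs on the order of m^3 and the two classes are of about equal size, the two problems together cost about 2(m/2)^3 = m^3/4, which is consistent with the factor of four quoted above.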
Recently, research on semi-supervised learning (SSL) [15–17] has become a
new hotspot in the field of machine learning. The main reason is that in many practical problems, labeled data are scarce, but there is a large amount of unlabeled
data. SSL uses these unlabeled data to assist a small number of labeled data
in learning, so as to improve the performance of the classifier. Manifold regularization
(MR) [18,19] is one of the frameworks of SSL. In the MR framework, there are two
regularization terms. One controls the complexity of the classifier in the reproducing kernel Hilbert space (RKHS), and the other controls the complexity as measured by the
geometry of the data distribution. Following the MR framework, Qi et al. [20] proposed
a Laplacian twin support vector machine (Lap-TSVM), which was the first twin support vector machine applied to the SSL problem. Extensive experimental results show
that Lap-TSVM performs very well in semi-supervised classification. Other
extensions and applications of the semi-supervised twin support vector machine can be
found in [21,22].
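
The second MR regularization term, which measures complexity along the geometry of the data, is often realized as a graph Laplacian penalty f^T L f over all labeled and unlabeled points. The following is a minimal sketch under stated assumptions: a symmetric k-nearest-neighbor graph with binary weights, and hypothetical helper names `knn_graph_laplacian` and `manifold_penalty`. The graph construction used by Lap-TSVM itself is not given in this section.

```python
import math

def knn_graph_laplacian(points, k=2):
    """Return the unnormalized Laplacian L = D - W of a symmetric kNN graph.

    Assumption: binary edge weights; other weightings (e.g. heat kernel) are possible.
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        for j in order[:k]:
            W[i][j] = W[j][i] = 1.0  # connect i and j in both directions
    L = [[-W[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(W[i])  # node degree on the diagonal
    return L

def manifold_penalty(f, L):
    """Compute f^T L f, which equals the sum over edges of W_ij * (f_i - f_j)^2."""
    n = len(f)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))

# Two well-separated clusters on a line; labels are not needed to build the graph.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
L = knn_graph_laplacian(points, k=2)

smooth = [1, 1, 1, -1, -1, -1]  # constant within each cluster
rough = [1, -1, 1, -1, 1, -1]   # changes sign inside each cluster

print(manifold_penalty(smooth, L))
print(manifold_penalty(rough, L))
```

A decision function that is constant on each cluster incurs no penalty, while one that oscillates inside a cluster is penalized. This is how unlabeled points can steer the classifier toward boundaries that pass through low-density regions.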
In general, data contain noise that comes from faulty instruments, flawed measurements or faulty communication. Learning with data in the context of classification
or regression is inevitably affected by noise in the data. If the training samples are
contaminated by noise, both SVM and its variants are often unable to find an optimal
hyperplane and subsequently have difficulty obtaining satisfactory results. To
address this problem, the fuzzy support vector machine (FSVM) [23] was proposed.
The idea of FSVM is to assign a membership value to each training sample through a membership function. The
membership function can effectively reduce the effects of noise and
outliers and thus produce a robust classifier. Moreover, combining the TSVM
with a membership function can both improve computational efficiency and
yield robust performance. More recently, the intuitionistic fuzzy twin support vector
machine (IFTSVM) [24] has been proposed, which assigns a pair of membership
and nonmembership functions to every training sample. These two functions help
the IFTSVM to reduce the influence of noise and to distinguish support vectors from
noise.
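
To make the idea of a membership and nonmembership pair concrete, here is a minimal sketch of one common construction. It is an assumption for illustration, not necessarily the functions used in [23,24] or in this paper: membership decreases with the distance to the sample's own class center, and nonmembership grows with the share of opposite-class points in a neighborhood of radius `alpha`. The names `memberships`, `alpha` and `delta` are hypothetical.

```python
import math

def class_center(points):
    """Coordinate-wise mean of a list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def memberships(X, y, alpha=1.0, delta=1e-4):
    """Return a list of (mu, nu) pairs, one per sample.

    mu = 1 - (distance to own class center) / (class radius + delta)
    nu = (1 - mu) * (fraction of opposite-class points within distance alpha)
    """
    centers, radii = {}, {}
    for label in set(y):
        pts = [x for x, t in zip(X, y) if t == label]
        centers[label] = class_center(pts)
        radii[label] = max(math.dist(p, centers[label]) for p in pts)
    result = []
    for xi, yi in zip(X, y):
        mu = 1.0 - math.dist(xi, centers[yi]) / (radii[yi] + delta)
        # The neighborhood always contains xi itself, so it is never empty.
        near = [t for x, t in zip(X, y) if math.dist(xi, x) <= alpha]
        rho = sum(1 for t in near if t != yi) / len(near)
        result.append((mu, (1.0 - mu) * rho))
    return result

# Three clean points per class, plus one +1 point placed inside the -1 cluster.
X = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
     (4.0, 4.0), (4.2, 4.0), (4.0, 4.2),
     (4.1, 4.1)]
y = [1, 1, 1, -1, -1, -1, 1]

result = memberships(X, y)
for (mu, nu), label in zip(result, y):
    print(label, round(mu, 3), round(nu, 3))
```

In this toy example the mislabeled point receives a membership close to zero and a large nonmembership, so a classifier that weights samples by such scores would largely ignore it, while the clean points of the same class keep a nonmembership of zero.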
The same difficulty is also encountered by the current semi-supervised twin
support vector machine and its variants. When there is much noise in the data,
the classification results are poor and unsatisfactory. Ideally, we would like to
determine which points are noisy, and then either remove them or greatly lower their
weight. Therefore, inspired by the ideas of IFTSVM, we assign a pair of membership
and nonmembership functions to each labeled point, which reduces the influence of noise on the classifier. In this way, we combine the ideas of fuzzy membership functions and the Lap-TSVM.
In this paper, we propose a novel intuitionistic fuzzy Laplacian twin support
vector machine (IFLap-TSVM) for the semi-supervised classification problem. We
use constructed artificial tests and several real datasets to evaluate the effectiveness of the
IFLap-TSVM. The main advantages of our IFLap-TSVM are:
(1) Membership an (...truncated)