Intuitionistic Fuzzy Laplacian Twin Support Vector Machine for Semi-supervised Classification



Journal of the Operations Research Society of China
https://doi.org/10.1007/s40305-021-00354-9

Jia-Bin Zhou1 · Yan-Qin Bai1 · Yan-Ru Guo1 · Hai-Xiang Lin2

Received: 4 November 2020 / Revised: 16 April 2021 / Accepted: 19 May 2021
© The Author(s) 2021

Abstract

In general, data contain noise arising from faulty instruments, flawed measurements or faulty communication. Learning with data in the context of classification or regression is inevitably affected by this noise. In order to remove or greatly reduce its impact, we combine the ideas of fuzzy membership functions and the Laplacian twin support vector machine (Lap-TSVM). A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) is presented. Moreover, we extend the linear IFLap-TSVM to the nonlinear case by means of kernel functions. The proposed IFLap-TSVM reduces the negative impact of noise and outliers by using fuzzy membership functions, and it yields a more accurate and reasonable classifier by exploiting the geometric distribution information of labeled and unlabeled data through manifold regularization. Experiments on constructed artificial datasets, several UCI benchmark datasets and the MNIST dataset show that the IFLap-TSVM achieves better classification accuracy than the state-of-the-art twin support vector machine (TSVM), intuitionistic fuzzy twin support vector machine (IFTSVM) and Lap-TSVM.

This work was supported by the National Natural Science Foundation of China (No.11771275). The second author thanks the Dutch Research Council for partial support (No.040.11.724).

Corresponding author: Yan-Qin Bai

1 Department of Mathematics, Shanghai University, Shanghai 200444, China
2 Delft Institute of Applied Mathematics, Delft University of Technology, Delft 2600GA, The Netherlands
Keywords Twin support vector machine · Semi-supervised classification · Intuitionistic fuzzy · Manifold regularization · Noisy data

Mathematics Subject Classification 68T99 · 90C20

1 Introduction

Support vector machine (SVM) was proposed in detail by Vapnik et al. [1]. The goal of SVM is to find an optimal hyperplane that separates the labeled data points into two classes. Because of its excellent performance in text classification tasks [2], it soon became a mainstream technology of machine learning. At present, SVM and its variants have been successfully applied in many fields such as face recognition [3], financial distress prediction [4], regression [5], traffic flow prediction [6], medicine [7] and more.

Proximal support vector machine (PSVM) [8,9] was derived from SVM; it aims to find two parallel hyperplanes such that each plane is as close as possible to one of the two classes and as far as possible from the other. Furthermore, in order to simplify the constraints, the generalized eigenvalue proximal support vector machine (GEPSVM) [10] was proposed. The main idea of GEPSVM is to replace the two parallel hyperplanes with two nonparallel ones. Following this concept, Jayadeva et al. [11] proposed the well-known twin support vector machine (TSVM). Unlike the single large quadratic programming problem (QPP) solved by traditional SVM, TSVM solves a pair of smaller QPPs. The constraints of each QPP involve only the data points of one of the two classes. Therefore, TSVM not only keeps the advantages of SVM, but also trains approximately four times faster than SVM. Based on TSVM, Shao et al. [12] proposed an imbalanced weighted Lagrangian twin support vector machine (WLTSVM) for imbalanced data classification. Other extensions and applications of TSVM can be found in [13,14].

Recently, research on semi-supervised learning (SSL) [15–17] has become a new hotspot in the field of machine learning.
The main reason is that in many practical problems, labeled data are scarce, while unlabeled data are available in large amounts. SSL uses these unlabeled data to assist a small number of labeled data in learning, so as to improve the performance of the classifier. Manifold regularization (MR) [18,19] is one of the frameworks of SSL. In the MR framework, there are two regularization terms. One controls the complexity of the classifier in the reproducing kernel Hilbert space (RKHS), and the other controls the complexity as measured by the geometry of the data distribution. Following the MR framework, Qi et al. [20] proposed a Laplacian twin support vector machine (Lap-TSVM), which was the first twin support vector machine applied to the SSL problem. Extensive experimental results show that Lap-TSVM performs very well in semi-supervised classification. Other extensions and applications of semi-supervised twin support vector machines can be found in [21,22].

In general, data contain noise arising from faulty instruments, flawed measurements or faulty communication. Learning with data in the context of classification or regression is inevitably affected by this noise. If the training samples are contaminated by noise, both SVM and its variants are often unable to find an optimal hyperplane and consequently have difficulty obtaining satisfactory results. In order to solve this problem, the fuzzy support vector machine (FSVM) [23] was proposed. The idea of FSVM is to apply a membership function to each training sample. The introduction of a membership function can effectively reduce the effects of noise and outliers and thus produce a robust classifier. Moreover, combining TSVM with a membership function can not only improve computational efficiency but also achieve robust performance.
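The idea of weighting each training sample by a membership value can be illustrated with a minimal sketch. The distance-to-class-center rule below is a common heuristic in the fuzzy SVM literature and is assumed here purely for illustration; the function name `distance_membership` and the parameter `delta` are hypothetical, and this is not the membership function defined in this paper.

```python
import numpy as np

def distance_membership(X, delta=1e-6):
    """Illustrative distance-to-center membership (an assumption, not the paper's function).

    Samples near the class center get a membership close to 1,
    samples far from it get a membership close to 0.
    """
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)                    # class center
    dist = np.linalg.norm(X - center, axis=1)  # distance of each sample to the center
    radius = dist.max()                        # class radius
    return 1.0 - dist / (radius + delta)       # delta > 0 keeps every membership positive

# Four clustered samples of one class plus one far-away sample (index 4)
X_pos = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [5.0, 5.0]])
memberships = distance_membership(X_pos)
print(memberships)
```

In a fuzzy SVM, values like these would typically scale the slack penalty of each sample, so the far-away point contributes very little to the objective while the clustered points keep most of their weight.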
In recent years, the intuitionistic fuzzy twin support vector machine (IFTSVM) [24] has been proposed, which assigns a pair of membership and nonmembership functions to every training sample. These two functions help the IFTSVM to reduce the influence of noise and to distinguish support vectors from noise. The same difficulty is also encountered by the current semi-supervised twin support vector machine and its variants. When the data contain a lot of noise, the classification results are very poor and unsatisfactory. Ideally, we would like to determine which points are noisy, and then either remove them or greatly lower their weight. Therefore, inspired by the ideas of IFTSVM, we assign a pair of membership functions to each labeled point, which reduces the influence of noise on the classifier. We thus combine the ideas of fuzzy membership functions and the Lap-TSVM. In this paper, we propose a novel intuitionistic fuzzy Laplacian twin support vector machine (IFLap-TSVM) for semi-supervised classification problems. We use some constructed tests and several real datasets to evaluate the effectiveness of the IFLap-TSVM. The main advantages of our IFLap-TSVM are: (1) Membership an (...truncated)