A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method

Computational and Mathematical Methods in Medicine, Jan 2017

Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for experiments. A maximum classification accuracy of 92.59% was achieved according to a jackknife cross-validation scheme. The results demonstrate that the performance of the proposed system is superior to the performances of previously reported classification techniques.

A PDF file should load here. If you do not see its contents the file may be temporarily unavailable at the journal website or you do not have a PDF plug-in installed and enabled in your browser.

Alternatively, you can download the file locally and open with any standalone PDF reader:

http://downloads.hindawi.com/journals/cmmm/2017/8272091.pdf

A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method

A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method Xiao Liu,1 Xiaoli Wang,1 Qiang Su,1 Mo Zhang,2 Yanhong Zhu,3 Qiugen Wang,4 and Qian Wang4 1School of Economics and Management, Tongji University, Shanghai, China 2School of Economics and Management, Shanghai Maritime University, Shanghai, China 3Department of Scientific Research, Shanghai General Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China 4Trauma Center, Shanghai General Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China Correspondence should be addressed to Xiaoli Wang; nc.ude.ijgnot@gnaw-iloaix Received 19 March 2016; Revised 12 July 2016; Accepted 1 August 2016; Published 3 January 2017 Academic Editor: Xiaopeng Zhao Copyright © 2017 Xiao Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Heart disease is one of the most common diseases in the world. The objective of this study is to aid the diagnosis of heart disease using a hybrid classification system based on the ReliefF and Rough Set (RFRS) method. The proposed system contains two subsystems: the RFRS feature selection system and a classification system with an ensemble classifier. The first system includes three stages: (i) data discretization, (ii) feature extraction using the ReliefF algorithm, and (iii) feature reduction using the heuristic Rough Set reduction algorithm that we developed. In the second system, an ensemble classifier is proposed based on the C4.5 classifier. The Statlog (Heart) dataset, obtained from the UCI database, was used for experiments. A maximum classification accuracy of 92.59% was achieved according to a jackknife cross-validation scheme. The results demonstrate that the performance of the proposed system is superior to the performances of previously reported classification techniques. 1. Introduction Cardiovascular disease (CVD) is a primary cause of death. An estimated 17.5 million people died from CVD in 2012, representing 31% of all global deaths (http://www.who.int/mediacentre/factsheets/fs317/en/). In the United States, heart disease kills one person every 34 seconds [1]. Numerous factors are involved in the diagnosis of heart disease, which complicates a physician’s task. To help physicians make quick decisions and minimize errors in diagnosis, classification systems enable physicians to rapidly examine medical data in considerable detail [2]. These systems are implemented by developing a model that can classify existing records using sample data. Various classification algorithms have been developed and used as classifiers to assist doctors in diagnosing heart disease patients. The performances obtained using the Statlog (Heart) dataset [3] from the UCI machine learning database are compared in this context. Lee [4] proposed a novel supervised feature selection method based on the bounded sum of weighted fuzzy membership functions (BSWFM) and Euclidean distances and obtained an accuracy of 87.4%. Tomar and Agarwal [5] used the F-score feature selection method and the Least Square Twin Support Vector Machine (LSTSVM) to diagnose heart diseases, obtaining an average classification accuracy of 85.59%. Buscema et al. [6] used the Training with Input Selection and Testing (TWIST) algorithm to classify patterns, obtaining an accuracy of 84.14%. The Extreme Learning Machine (ELM) has also been used as a classifier, obtaining a reported classification accuracy of 87.5% [7]. The genetic algorithm with the Naïve Bayes classifier has been shown to have a classification accuracy of 85.87% [8]. Srinivas et al. [9] obtained an 83.70% classification accuracy using Naïve Bayes. Polat and Güneş [10] used the RBF kernel F-score feature selection method to detect heart disease. The LS-SVM classifier was used, obtaining a classification accuracy of 83.70%. In [11], the GA-AWAIS method was used for heart disease detection, with a classification accuracy of 87.43%. The Algebraic Sigmoid Method has also been proposed to classify heart disease, with a reported accuracy of 85.24% [12]. Wang et al. [13] used linear kernel SVM classifiers for heart disease detection and obtained an accuracy of 83.37%. In [14], three distance criteria were applied in simple AIS, and the accuracy obtained on the Statlog (Heart) dataset was 83.95%. In [15], a hybrid neural network method was proposed, and the reported accuracy was 86.8%. Yan et al. [16] achieved an 83.75% classification accuracy using ICA and SVM classifiers. Şahan et al. [17] proposed a new artificial immune system named the Attribute Weighted Artificial Immune System (AWAIS) and obtained an accuracy of 82.59% using the k-fold cross-validation method. In [18], the k-NN, k-NN with Manhattan, feature space mapping (FSM), and separability split value (SSV) algorit (...truncated)


This is a preview of a remote PDF: http://downloads.hindawi.com/journals/cmmm/2017/8272091.pdf

Xiao Liu, Xiaoli Wang, Qiang Su, Mo Zhang, Yanhong Zhu, Qiugen Wang, Qian Wang. A Hybrid Classification System for Heart Disease Diagnosis Based on the RFRS Method, Computational and Mathematical Methods in Medicine, 2017, 2017, DOI: 10.1155/2017/8272091