Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of Chronic Fatigue in Traditional Chinese Medicine
Hong Y (2014) Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of
Chronic Fatigue in Traditional Chinese Medicine. PLoS ONE 9(6): e99565. doi:10.1371/journal.pone.0099565
Huazhen Wang, Xin Liu, Bing Lv, Fan Yang, Yanzhu Hong
Yuan-Soon Ho, Taipei Medical University, Taiwan
1 College of Computer Science and Technology, Huaqiao University, Xiamen, China; 2 School of Information Science and Engineering, Xiamen University, Xiamen, China; 3 Department of Traditional Chinese Medicine, Xiamen University, Xiamen, China
Objective: The etiology, pathophysiology, nomenclature and diagnostic criteria of Chronic Fatigue (CF) remain unclear in the medical community. Traditional Chinese medicine (TCM) adopts a unique diagnostic method, namely 'bian zheng lun zhi' or syndrome differentiation, to diagnose CF with a set of syndrome factors, which can be regarded as a Multi-Label Learning (MLL) problem in the machine learning literature. To obtain an effective and reliable diagnostic tool, we use Conformal Predictor (CP), Random Forest (RF) and the Problem Transformation (PT) method for the syndrome differentiation of CF.
Methods and Materials: In this work, CP-RF is extended to handle the MLL problem by means of the PT method. CP-RF applies RF to measure the confidence level (p-value) of each label being the true label, and then selects the labels whose p-values are larger than the pre-defined significance level as the region prediction. We compare the proposed CP-RF with the typical CP-NBC (Naive Bayes Classifier), CP-KNN (K-Nearest Neighbors) and ML-KNN on a CF dataset consisting of 736 cases. Specifically, 95 symptoms are used to identify CF, and four syndrome factors are employed in the syndrome differentiation: 'spleen deficiency', 'heart deficiency', 'liver stagnation' and 'qi deficiency'.
Results: CP-RF outperforms CP-NBC, CP-KNN and ML-KNN under the general metrics of subset accuracy, hamming loss, one-error, coverage, ranking loss and average precision. Furthermore, the performance of CP-RF remains steady across a wide range of confidence levels, from 80% to 100%, which indicates its robustness to the choice of threshold. In addition, the confidence evaluation provided by CP is valid and well-calibrated.
Conclusion: CP-RF not only offers outstanding performance but also provides a valid confidence evaluation for CF syndrome differentiation. It is well suited to TCM practitioners and could facilitate the use of objective, effective and reliable computer-based diagnostic tools.
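The region-prediction step summarized above (compute a p-value per label, keep the labels whose p-value exceeds the significance level) can be illustrated with a minimal sketch. This is not the authors' implementation: the nonconformity scores below are hypothetical numbers rather than outputs of a Random Forest, a single shared calibration set is assumed for all four syndrome factors, and the function names conformal_p_value and region_prediction are introduced here only for illustration.

```python
from typing import Dict, List


def conformal_p_value(calibration_scores: List[float], test_score: float) -> float:
    """Fraction of scores (calibration scores plus the test score itself)
    that are at least as nonconforming as the test score."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def region_prediction(p_values: Dict[str, float], significance: float) -> List[str]:
    """Keep every label whose p-value is larger than the significance level."""
    return [label for label, p in p_values.items() if p > significance]


# Hypothetical nonconformity scores of calibration cases (assumed, not from the paper).
calibration = [0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.90, 0.95, 0.99]

# Hypothetical nonconformity score of one new case under each candidate label.
test_scores = {
    "spleen deficiency": 0.30,
    "heart deficiency": 0.97,
    "liver stagnation": 0.60,
    "qi deficiency": 0.999,
}

p_values = {
    label: conformal_p_value(calibration, score)
    for label, score in test_scores.items()
}

# A significance level of 0.2 corresponds to a confidence level of 80%.
predicted = region_prediction(p_values, significance=0.2)
print(p_values)
print(predicted)
```

In this toy setting a lower significance level (a higher confidence level) can only enlarge the predicted label set, which is the trade-off behind the range of confidence levels examined in the paper.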
Competing Interests: The authors have declared that no competing interests exist.
Chronic Fatigue (CF) is a sub-health status, pathologically
characterized by nonspecific extreme fatigue (both physical
and mental) lasting more than six months [1]. At present, CF is a
widespread illness that prevails among people who live
fast-paced and stressful lives. Thus far, the etiology,
pathophysiology, nomenclature and diagnostic criteria of CF remain
underexplored in Western medicine [2,3]. Alternatively,
Traditional Chinese Medicine (TCM) has provided an effective
approach for personalized diagnosis and treatment of CF, and has
received increasing attention from medical researchers as a
complementary medicine [4,5]. Unfortunately, TCM diagnosis still
attracts skepticism and criticism because TCM practitioners
diagnose patients based solely on their subjective observation,
knowledge, and clinical experience, an approach that lacks objective
tests and cannot be scientifically verified by clinical trials [6].
Under these circumstances, an objective and standardized diagnosis
system for CF in TCM is needed. Recently,
researchers have found that machine learning technologies are
able to uncover the inherent mechanism of TCM diagnosis and
provide accurate predictions for patients [7,8]. Therefore, a
computer-aided system that provides objective and reliable
diagnosis is highly desirable for a better understanding of the
TCM diagnosis of chronic fatigue.
Unlike Western medicine, TCM adopts a unique
diagnostic method, namely 'bian zheng lun zhi' or syndrome
differentiation [9-11], to diagnose CF in practice. According to
the theory of TCM, the syndrome, or 'zheng', is a comprehensive
description of the pathology of a disease in the body. The
syndrome consists of a set of syndrome factors. Each factor is
defined in terms of a location and a condition of the body. The term
location in TCM is similar to that in Western medicine, such as
heart, liver, spleen, lung, kidney and stomach. However, the term
condition in TCM is totally different from that in Western medicine,
which reflects the disharmony in the body, su (...truncated)