Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of Chronic Fatigue in Traditional Chinese Medicine

PLOS ONE, Dec 2019

Objective: The etiology, pathophysiology, nomenclature and diagnostic criteria of Chronic Fatigue (CF) remain unclear in the medical community. Traditional Chinese medicine (TCM) adopts a unique diagnostic method, namely ‘bian zheng lun zhi’ or syndrome differentiation, to diagnose CF with a set of syndrome factors, which can be regarded as a Multi-Label Learning (MLL) problem in the machine learning literature. To obtain an effective and reliable diagnostic tool, we use Conformal Predictor (CP), Random Forest (RF) and the Problem Transformation (PT) method for the syndrome differentiation of CF.

Methods and Materials: In this work, CP-RF is extended to handle the MLL problem using the PT method. CP-RF applies RF to measure the confidence level (p-value) of each label being the true label, and then selects the labels whose p-values are larger than the pre-defined significance level as the region prediction. We compare the proposed CP-RF with CP-NBC (Naïve Bayes Classifier), CP-KNN (K-Nearest Neighbors) and ML-KNN on a CF dataset of 736 cases. Specifically, 95 symptoms are used to identify CF, and four syndrome factors are employed in the syndrome differentiation: ‘spleen deficiency’, ‘heart deficiency’, ‘liver stagnation’ and ‘qi deficiency’.

Results: CP-RF outperforms CP-NBC, CP-KNN and ML-KNN under the general metrics of subset accuracy, hamming loss, one-error, coverage, ranking loss and average precision. Furthermore, the performance of CP-RF remains steady across a wide range of confidence levels from 80% to 100%, which indicates its robustness to the choice of threshold. In addition, the confidence evaluation provided by CP is valid and well-calibrated.

Conclusion: CP-RF not only offers outstanding performance but also provides valid confidence evaluation for CF syndrome differentiation. It would be well suited to TCM practitioners and would facilitate the use of objective, effective and reliable computer-based diagnosis tools.
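
As an illustration of the region-prediction step described in the abstract, the following minimal Python sketch computes a conformal p-value for each syndrome factor and keeps the factors whose p-value exceeds the significance level. It is not the authors' implementation: it assumes a split (inductive) conformal setup with a held-out calibration set, treats each syndrome factor independently (one possible problem transformation), and uses made-up nonconformity scores in place of scores derived from a random forest. The function names, the scores and the significance level of 0.25 are hypothetical.

```python
from typing import Dict, List, Sequence


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Return the conformal p-value of a test example for one candidate label.

    A nonconformity score measures how unusual an example looks for a label;
    the p-value is the share of scores (calibration scores plus the test score
    itself) that are at least as large as the test score.
    """
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def region_prediction(p_values: Dict[str, float], significance: float) -> List[str]:
    """Keep every label whose p-value is larger than the significance level."""
    return [label for label, p in p_values.items() if p > significance]


# Hypothetical calibration scores (assumption: in CP-RF these would be derived
# from a random forest, e.g. from the share of trees voting against a label).
calibration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Hypothetical nonconformity scores of one patient for the four syndrome factors.
test_scores = {
    "spleen deficiency": 0.25,
    "heart deficiency": 0.85,
    "liver stagnation": 0.55,
    "qi deficiency": 0.95,
}

p_values = {label: conformal_p_value(calibration, s) for label, s in test_scores.items()}
predicted = region_prediction(p_values, significance=0.25)
print(p_values)
print(predicted)
```

With these made-up scores and a significance level of 0.25, the sketch keeps ‘spleen deficiency’ and ‘liver stagnation’. Lowering the significance level (that is, raising the confidence level) can only enlarge the predicted set.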

Citation: Wang H, Liu X, Lv B, Yang F, Hong Y (2014) Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of Chronic Fatigue in Traditional Chinese Medicine. PLoS ONE 9(6): e99565. doi:10.1371/journal.pone.0099565

Huazhen Wang, Xin Liu, Bing Lv, Fan Yang, Yanzhu Hong

1 College of Computer Science and Technology, Huaqiao University, Xiamen, China
2 School of Information Science and Engineering, Xiamen University, Xiamen, China
3 Department of Traditional Chinese Medicine, Xiamen University, Xiamen, China

Editor: Yuan-Soon Ho, Taipei Medical University, Taiwan

Competing Interests: The authors have declared that no competing interests exist.

Chronic Fatigue (CF) is a sub-health status, pathologically characterized by nonspecific extreme fatigue (including physical fatigue and mental fatigue) lasting over six months [1]. CF is a widespread illness that prevails among people who live fast-paced and stressful lives. Thus far, the etiology, pathophysiology, nomenclature and diagnostic criteria of CF remain underexplored in Western medicine [2,3]. Alternatively, Traditional Chinese Medicine (TCM) provides an effective approach for personalized diagnosis and treatment of CF, and has received increasing attention from medical researchers as a complementary medicine [4,5]. Unfortunately, TCM diagnosis still draws skepticism and criticism because TCM practitioners diagnose patients based only on their subjective observation, knowledge and clinical experience, an approach that lacks objective tests and cannot be scientifically proven by clinical trials [6].

Under these circumstances, it is desirable to establish an objective and standardized diagnosis system for CF in TCM. Recently, researchers have found that machine learning technologies are able to capture the inherent mechanism of TCM diagnosis and provide correct predictions for patients [7,8]. Therefore, a computer-aided system that provides objective and reliable diagnosis is highly desirable for a better understanding of the TCM diagnosis of chronic fatigue.

Differing from Western medicine, TCM adopts a unique diagnostic method, namely 'bian zheng lun zhi' or syndrome differentiation [9-11], to diagnose CF in practice. According to TCM theory, the syndrome or 'zheng' is a comprehensive description of the pathology of a disease in the body. The syndrome consists of a set of syndrome factors, each defined in terms of a location and a condition of the body. The term 'location' in TCM is similar to that of Western medicine, such as heart, liver, spleen, lung, kidney and stomach. However, the term 'condition' in TCM is totally different from that of Western medicine: it reflects the disharmony in the body, su (...truncated)


This is a preview of a remote PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0099565&type=printable

Huazhen Wang, Xin Liu, Bing Lv, Fan Yang, Yanzhu Hong. Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of Chronic Fatigue in Traditional Chinese Medicine, PLOS ONE, 2014, Volume 9, Issue 6, DOI: 10.1371/journal.pone.0099565