ElGamal Homomorphic Encryption-Based Privacy Preserving Association Rule Mining on Horizontally Partitioned Healthcare Data

Journal of The Institution of Engineers (India): Series B, Jan 2022

Life-threatening diseases have become a pre-eminent issue in healthcare because of their high mortality rate. This mortality rate can be lowered by using healthcare intelligence to detect diseases early. Patients’ medical data is stored in electronic healthcare record (EHR) systems, which healthcare providers keep up to date. Data mining techniques such as association rule mining can detect a patient’s disease from their symptoms using the digital healthcare data stored in an EHR system. The efficacy of association rule mining can be improved by using global data from multiple EHR systems, but this requires every EHR system to send its healthcare records to a central server. Making personal health information available on an untrusted server may violate several privacy laws. As a result, privacy-preserving distributed healthcare data mining has become a well-known field of study in the healthcare industry. This research uses an efficient ElGamal homomorphic encryption technique to protect privacy in distributed association rule mining. The proposed approach is used to discover the risk factors of life-threatening diseases such as breast cancer and heart disease from their symptoms, and its scope for combating COVID-19 is discussed. Theoretical analysis shows that the proposed approach is efficient and maintains privacy in an insecure communication environment. An experimental study on a real dataset shows the benefit of the proposed approach compared with the results from a single local EHR system.

J. Inst. Eng. India Ser. B
https://doi.org/10.1007/s40031-021-00696-1

ORIGINAL CONTRIBUTION

Nikunj Domadiya (1) • Udai Pratap Rao (2)

Received: 26 August 2020 / Accepted: 20 October 2021
© The Institution of Engineers (India) 2021

Corresponding author: Nikunj Domadiya
1 Computer Engineering Department, L. D. College of Engineering, Ahmedabad, India
2 Computer Engineering Department, National Institute of Technology, Surat, India

Keywords: Association Rule Mining · Breast Cancer Disease · Coronavirus (COVID-19) · Data Mining Privacy · Distributed Healthcare Data Mining

Introduction

Life-threatening human diseases are a primary focus of medical research around the world [1]. Health researchers have recently devoted a great deal of attention to COVID-19, as well as to cancer and other life-threatening diseases. According to the 2015 National Vital Statistics Report (NVS) [2], cancer and heart disease are the two most common causes of death. According to the Department of Health and Human Services, cancer and heart disease together accounted for 45.3% of all U.S. deaths in 2010 (Fig. 1). Breast cancer, among the deadliest diseases for women, claims a large number of lives each year in the USA. Figure 1 displays the number of cancer cases in the USA in 2018 for each of the major kinds of cancer [3]. As of May 2020, there had been 4,527,815 cases of coronavirus disease (COVID-19), a novel disease that emerged in 2019, and 303,438 of those patients had died [4].

Fig. 1 Health Statistics report of USA [3]

Given the high mortality rate of these life-threatening diseases, detecting them early by examining the patient’s symptoms is crucial to saving more lives. Appropriate treatment of these illnesses, and recovery from them, also depends on early identification. Diagnostic methods for cancer and heart disease are expensive, error-prone, and time-consuming [5–7]. Traditionally, disease prediction has relied on physician expertise rather than on the symptom patterns hidden in healthcare data [8–15]. This can lead to an inaccurate diagnosis and inappropriate medical treatment, which raises healthcare costs and lowers the quality of the healthcare services provided to patients [16].
Electronic healthcare record (EHR) systems are used in large hospitals to keep digital records, and they maintain a massive amount of patient information [17]. Data mining can put the data acquired in hospitals to use for healthcare research and for improving healthcare services. Association rule mining is a well-known data mining approach for determining correlations between diseases and symptoms [18–24]. Its applications in healthcare include predicting a disease from a patient’s symptoms, determining an adequate treatment for a disease, detecting medication response, and improving medical fraud detection [19, 25–29]. Association rule mining generates IF-THEN rules that medical professionals can understand quickly. As a result, the approach is well known among medical researchers and doctors for identifying the state of a disease or the appropriate treatment from the patient’s symptoms, and it makes the healthcare system more efficient in terms of cost and treatment [30].

Earlier, association rule mining on healthcare data could only be done on the EHR system of a single hospital [19, 31]. Only a limited number of patient records can be stored in a single electronic medical record (EMR) system, so association rule mining on the data of a single EHR system is less accurate. Dangerous diseases (e.g. cancer and heart disease) demand more precise association rules [19, 25]. The accuracy (confidence) of association rule mining can be increased by combining the data of all EHR systems at a central server. However, patients’ data must be kept private in the local EHR system, because sharing it poses a threat to privacy in healthcare [32, 33]. For accurate data mining, the EHR systems must therefore share their data while protecting privacy. As a result, medical researchers have concentrated on privacy-preserving association rule mining on distributed healthcare data.

As shown in Fig. 2, distributed data is either vertically or horizontally partitioned. Most large hospitals use the same EHR system schema because they follow the same standards for storing patient information. Our study therefore considers data that is horizontally partitioned among the collaborating EHR systems [34–37]. With this in mind, we aim to obtain global association rules while safeguarding the privacy of the individual EHR systems. The proposed approach is evaluated on the UCI repository datasets for breast cancer and heart disease to analyse the symptoms associated with these two life-threatening diseases [38, 39].

Fig. 2 Distributed Data Partition Model [49]

Background and Related Concepts

Distributed Healthcare Data

Healthcare data can be distributed among EHR systems in two ways: horizontally partitioned or vertically partitioned. In horizontally partitioned healthcare data, all EHR systems have an equivalent schema but store the records of different patients. Figure 2 shows the ho (...truncated)
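
To make the idea in the introduction concrete, the following is a minimal sketch of how the confidence of one IF-THEN rule could be computed from horizontally partitioned sites by adding encrypted local counts. It is not the authors' protocol, which this preview does not show. It assumes the textbook "exponential" variant of ElGamal (the count is placed in the exponent so that multiplying ciphertexts adds counts), a toy group (P = 2039, Q = 1019, G = 4), a single key holder who decrypts only the totals, and made-up patient records and item names.

```python
import random  # illustration only; random is not a cryptographically secure source

# Toy group (assumption): P = 2*Q + 1 is a small safe prime and G = 4 generates
# the subgroup of prime order Q. These sizes are for illustration, not security.
P, Q, G = 2039, 1019, 4


def keygen():
    """Return (private key x, public key h = G^x mod P)."""
    x = random.randrange(1, Q)
    return x, pow(G, x, P)


def encrypt(h, m):
    """Exponential ElGamal: encrypt G^m so that counts add when ciphertexts multiply."""
    r = random.randrange(1, Q)
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P


def combine(c, d):
    """Multiply two ciphertexts component-wise; the result encrypts the sum."""
    return (c[0] * d[0]) % P, (c[1] * d[1]) % P


def decrypt(x, c, max_m):
    """Strip the mask, then search the small range 0..max_m for the exponent."""
    g_m = (c[1] * pow(c[0], Q - x, P)) % P  # c[0]^(Q-x) is the inverse of c[0]^x
    for m in range(max_m + 1):
        if pow(G, m, P) == g_m:
            return m
    raise ValueError("sum is outside the searched range")


def local_count(records, items):
    """Number of local patient records that contain every item in `items`."""
    return sum(1 for record in records if items <= record)


def encrypted_total(h, local_counts):
    """Simulates each site encrypting its own count and multiplying it in."""
    total = encrypt(h, 0)
    for count in local_counts:
        total = combine(total, encrypt(h, count))
    return total


# Hypothetical records: same schema at both sites, different patients.
site_a = [{"chest_pain", "high_bp", "heart_disease"},
          {"chest_pain", "heart_disease"},
          {"high_bp"}]
site_b = [{"chest_pain", "heart_disease"},
          {"chest_pain"},
          {"high_bp", "heart_disease"},
          {"chest_pain", "high_bp"}]
sites = [site_a, site_b]

antecedent = {"chest_pain"}                  # IF chest_pain
rule_items = antecedent | {"heart_disease"}  # THEN heart_disease

x, h = keygen()
n_records = sum(len(s) for s in sites)
n_x = decrypt(x, encrypted_total(h, [local_count(s, antecedent) for s in sites]), n_records)
n_xy = decrypt(x, encrypted_total(h, [local_count(s, rule_items) for s in sites]), n_records)

global_support = n_xy / n_records
global_confidence = n_xy / n_x
print(n_x, n_xy, global_support, global_confidence)
```

In this made-up data the rule has confidence 2/2 at the first site and 1/3 at the second, while the pooled counts give 3/5, which illustrates why rules mined from a single EHR system can differ from the global ones. The brute-force search in `decrypt` works only because the totals are small, and a real deployment would need a standard large group and a vetted cryptographic library.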


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007/s40031-021-00696-1.pdf
Article home page: https://link.springer.com/article/10.1007/s40031-021-00696-1

Domadiya, Nikunj, Rao, Udai Pratap. ElGamal Homomorphic Encryption-Based Privacy Preserving Association Rule Mining on Horizontally Partitioned Healthcare Data, Journal of The Institution of Engineers (India): Series B, 2022, pp. 1-14, DOI: 10.1007/s40031-021-00696-1