Data prediction for cases of incorrect data in multi-node electrocardiogram monitoring
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 12, No. 2, April 2022, pp. 1540~1547
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i2.pp1540-1547
Sugondo Hadiyoso1,2, Heru Nugroho1,2, Tati Latifah Erawati Rajab1, Kridanto Surendro1
1 School of Electrical and Information Engineering, Bandung Institute of Technology, Bandung, Indonesia
2 School of Applied Science, Telkom University, Bandung, Indonesia
Article Info
Article history:
Received Apr 29, 2021
Revised Sep 11, 2021
Accepted Oct 10, 2021
Keywords:
Expectation-maximization
Incorrect data
Predict
Regression imputation
Root mean square error
ABSTRACT
The development of a mesh topology in multi-node electrocardiogram
(ECG) monitoring based on the ZigBee protocol still has limitations. When
more than one active ECG node sends a data stream, data become incorrect
or damaged due to a failure of synchronization. The incorrect data
affect signal interpretation. Therefore, a mechanism is needed to correct or
predict the damaged data. In this study, the expectation-maximization (EM)
and regression imputation (RI) methods are proposed to
overcome these problems. Real data from previous studies are the main
material used in this study. The predicted ECG signal data are
then compared with the actual ECG data stored in the main controller
memory. Root mean square error (RMSE) is calculated to measure system
performance. The simulation was performed on 13 ECG waves, each
with 1000 samples. The simulation results show that the EM method has
a lower prediction error than the RI method. The average RMSE for
the EM and RI methods is 4.77 and 6.63, respectively. The proposed method
is expected to be used in multi-node ECG monitoring, especially
in ZigBee applications, to minimize errors.
This is an open access article under the CC BY-SA license.
Corresponding Author:
Sugondo Hadiyoso
School of Electrical and Information Engineering, Bandung Institute of Technology
School of Applied Science, Telkom University
Bandung, Indonesia
Email:
1. INTRODUCTION
Computing technologies have advanced considerably in recent years [1]–[8], including
significant progress in artificial intelligence. The development of wireless communication media for
internet of things applications is always followed by the development of protocols that support multiple or
multiuser access. Multiuser monitoring and control applications have been applied in, among other fields, the health
area. Such applications allow centralized, fast, easy, remote, and multiuser health monitoring. One health
parameter that receives serious attention is the condition of the heart, given the risks posed if it is not maintained optimally.
Heart conditions can be observed by studying the electrical activity of the heart through an
electrocardiogram (ECG) [9]–[11]. Previous research by Hadiyoso and Aulia [12] succeeded in
designing and implementing an ECG monitoring system for several ZigBee-based user nodes. In its
application, however, a crucial problem arises, namely damage to or loss of data when more than one active node is
sending data streams [13]. This problem is likely caused by a failure of synchronization between
the user/end node and the coordinator.
Estimating missing data is an important step in the data cleaning stage.
Numerous studies have demonstrated that improper data management results in inaccurate analysis [14].
Journal homepage: http://ijece.iaescore.com
Missing data, as indicated by the absence of data items for a subject, can obscure potentially significant
information. In practice, missing data has emerged as a significant determinant of data quality. Thus,
imputation of the missing values is needed [15]. Missing data is a common weakness in classification
problems, and it can cause the prediction system’s output to be ineffective [16], [17]. Ignoring missing data
affects the results of the analysis [18], [19], the outcomes of learning, and the outcomes of
predictions in the collaborative prediction problem [20]. In quantitative studies, missing data leads to biased
parameter estimates [21]–[24]. In predictive models, the choice of method for handling missing
data can affect model performance [22], [25]. Missing data are common in medical research, and if not
handled properly, they can result in a loss of statistical power and potentially biased results [26]–[28].
Standard data collection problems may involve noisy data. In addition to noisy data,
organizations face challenges with missing data. Missing data affects large-scale data
collection, so investigating different filtering techniques for large data environments is worthwhile
[29]. This study does not discuss or investigate the cause of the problem; rather, it addresses how to
correct or predict the incorrect data with techniques commonly used for missing data.
This is the urgency of the proposed research: to provide a reliable telemonitoring system with the smallest
possible error rate to avoid misinterpretation.
Several methods have been applied to predict missing data in various applications. In general,
missing value imputation techniques fall into two categories: statistical and machine learning-based
techniques [30]. Expectation-maximization (EM), linear regression (LR), least squares (LS), and mean/mode
are the four most frequently used statistical techniques [31]. Using EM to impute
missing data has several advantages: missing data need not be ignored, which preserves
information for diagnostic accuracy [32], and many patterns of missing data can be handled [33]. Imputation
using linear regression results in a small standard deviation [34]; regression imputation is better than
mean imputation, but it still results in biased parameter estimates [35]. Expert methods such as support vector
machines (SVM) and artificial neural networks (ANN) have also been reported for data prediction
[36], [37]. However, these methods have high computational costs and are complex to implement on
computers with low memory resources.
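The two statistical approaches named above can be illustrated with a minimal sketch. It is not the implementation used in this study: the synthetic two-column data (a fully observed reference signal and a correlated signal with damaged samples), the function names `rmse`, `regression_impute`, and `em_impute`, and the multivariate-normal model behind the EM variant are all assumptions made for illustration.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between two equally sized sequences."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def regression_impute(X, target):
    """Fill NaNs in column `target` by least-squares regression on the
    other columns (assumed complete in this sketch)."""
    X = np.array(X, dtype=float)  # copy, so the caller's data is untouched
    miss = np.isnan(X[:, target])
    A = np.column_stack([np.ones(len(X)), np.delete(X, target, axis=1)])
    coef, *_ = np.linalg.lstsq(A[~miss], X[~miss, target], rcond=None)
    X[miss, target] = A[miss] @ coef
    return X

def em_impute(X, n_iter=50, tol=1e-6):
    """EM imputation under an assumed multivariate normal model."""
    X = np.array(X, dtype=float)
    n, d = X.shape
    mask = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    filled = np.where(mask, mu, X)          # start from mean imputation
    sigma = np.cov(filled, rowvar=False, bias=True)
    for _ in range(n_iter):
        prev = filled.copy()
        C = np.zeros((d, d))                # summed conditional covariances
        for i in range(n):
            m = mask[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():                 # row with nothing observed
                filled[i] = mu
                C += sigma
                continue
            S_oo = sigma[np.ix_(o, o)]
            S_mo = sigma[np.ix_(m, o)]
            K = S_mo @ np.linalg.pinv(S_oo)
            # E-step: conditional mean of the missing entries
            filled[i, m] = mu[m] + K @ (filled[i, o] - mu[o])
            C[np.ix_(m, m)] += sigma[np.ix_(m, m)] - K @ S_mo.T
        # M-step: update mean and covariance from the completed data
        mu = filled.mean(axis=0)
        diff = filled - mu
        sigma = (diff.T @ diff + C) / n
        if np.max(np.abs(filled - prev)) < tol:
            break
    return filled

# Synthetic stand-in for a damaged signal (not ECG data from the study)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
ref = np.sin(t)
sig = 2.0 * ref + 1.0 + 0.05 * rng.standard_normal(t.size)
X_true = np.column_stack([ref, sig])
X_bad = X_true.copy()
bad = rng.choice(t.size, size=40, replace=False)
X_bad[bad, 1] = np.nan                      # mark damaged samples as missing

rmse_ri = rmse(X_true[bad, 1], regression_impute(X_bad, 1)[bad, 1])
rmse_em = rmse(X_true[bad, 1], em_impute(X_bad)[bad, 1])
print(f"RMSE RI: {rmse_ri:.4f}  RMSE EM: {rmse_em:.4f}")
```

In this toy setting the reference column is fully observed, so both methods converge to nearly the same fitted line and give similar errors. The sketch only shows the mechanics of each method and says nothing about their relative performance on real ECG streams.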
The literature study above provides sufficient knowledge as a basis for the proposed study. Research
on predicting missing data using a mathematical approach provides enough evidence that it can be applied to
solve problems in ZigBee-based multiuser monitoring implementations. In this study, we applied two methods
to overcome the incorrect data, namely EM and linear regression imputation. This study aims to predict the
incorrect data and to determine the better of the two proposed methods. Performance analysis is
done by calculating the root mean square error (RMSE) (...truncated)