Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks
PLOS ONE
RESEARCH ARTICLE
Erick Giovani Sperandio Nascimento ID1,2*, Júnia Ortiz1, Adhvan Novais Furtado1,
Diego Frias ID3
1 Manufacturing and Technology Integrated Campus–SENAI CIMATEC, Salvador, Bahia, Brazil,
2 Department of Electrical & Electronic Engineering, Surrey Institute for People-Centred AI, Faculty of
Engineering and Physical Sciences, University of Surrey, Guildford, United Kingdom, 3 State University of
Bahia–UNEB, Salvador, Bahia, Brazil
OPEN ACCESS
Citation: Sperandio Nascimento EG, Ortiz J, Furtado AN, Frias D (2023) Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks. PLoS ONE 18(4): e0282621. https://doi.org/10.1371/journal.pone.0282621
Editor: Muhammad Fazal Ijaz, Sejong University, REPUBLIC OF KOREA
Received: November 9, 2022
Accepted: February 20, 2023
Published: April 6, 2023
Copyright: © 2023 Sperandio Nascimento et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: Time series of daily new cases and deaths were retrieved from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University: https://coronavirus.jhu.edu/map.html. The model developed in this work is available on GitHub: https://github.com/CRIACIMATEC/covid-19/tree/master/ts_models_covid19. The corresponding test dataset is also available at the same address.
Abstract
This work compares deep learning models designed to predict the daily number of cases and deaths caused by COVID-19 in 183 countries from daily time series, combined with a feature augmentation strategy based on the Discrete Wavelet Transform (DWT). The following deep learning architectures were compared using two different feature sets, with and without DWT: (1) a homogeneous architecture containing multiple LSTM (Long Short-Term Memory) layers and (2) a hybrid architecture combining multiple CNN (Convolutional Neural Network) layers and multiple LSTM layers. Four deep learning models were therefore evaluated: (1) LSTM, (2) CNN + LSTM, (3) DWT + LSTM and (4) DWT + CNN + LSTM. Their performance was quantitatively assessed using the following metrics: Mean Absolute Error (MAE), Normalized Mean Squared Error (NMSE), Pearson R, and Factor of 2. The models were designed to predict the daily evolution of the two main epidemic variables up to 30 days ahead. After fine-tuning the hyperparameters of each model, the results show a statistically significant difference between the models’ performances for the prediction of both deaths and confirmed cases (p-value < 0.001). Based on NMSE values, significant differences were observed between LSTM and CNN+LSTM, indicating that adding convolutional layers to LSTM networks made the model more accurate. Using wavelet coefficients as additional features (DWT+CNN+LSTM) achieved results equivalent to those of the CNN+LSTM model, which demonstrates the potential of wavelets for optimizing models, since this allows training with less time series data.
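To make the idea of DWT-based feature augmentation concrete, the following minimal sketch appends single-level Haar wavelet coefficients to a window of daily counts. It is an illustration only and not the authors' implementation: the Haar wavelet, the single decomposition level, the window length, the sample values, and the function names `haar_dwt` and `dwt_features` are all assumptions made here.

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar DWT: return (approximation, detail) coefficients."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:                       # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)   # low-pass output: local trend
    detail = (even - odd) / np.sqrt(2)   # high-pass output: local fluctuation
    return approx, detail

def dwt_features(window):
    """Concatenate a raw input window with its Haar coefficients (hypothetical helper)."""
    approx, detail = haar_dwt(window)
    return np.concatenate([np.asarray(window, dtype=float), approx, detail])

# Hypothetical window of daily new-case counts (not real data).
window = [10, 12, 15, 11, 9, 14, 20, 18]
features = dwt_features(window)
```

Here an 8-day window yields 16 features: the 8 raw values plus 4 approximation and 4 detail coefficients. A deeper decomposition or a different wavelet family would change the number and meaning of the added features.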
Funding: This research was partially funded by ABDI, SENAI, EMBRAPII and REPSOL SINOPEC BRASIL under grant “Missão contra a COVID-19 do Edital de Inovação para a Indústria”, and by Bahia State Research Support Foundation (FAPESB) under grant CNV0002/2015. In addition, we gratefully acknowledge the scientific and technical support of the SENAI CIMATEC Reference Center on Artificial Intelligence and the SENAI CIMATEC/NVIDIA AI Joint Lab, and the SENAI CIMATEC Supercomputing Center for Industry Innovation for granting access to the necessary hardware and technical support. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Facing crises such as the COVID-19 pandemic requires constant monitoring of the social and epidemiological variables involved. The COVID-19 pandemic has changed the daily lives of the entire world population, which has been threatened by an easily transmitted virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Several measures must be taken to hinder the spread of the virus, and these need to be based on correct and scientifically proven information, which can benefit greatly from prior knowledge of possible future scenarios. Projecting the daily, weekly, or monthly contagion rate into the future supports decision-making by managers who need to formulate mitigation policies, which is important in the face of changes in the spreading behavior of the virus, as seen with the Delta and Omicron variants.
Since the pandemic began, several predictive modeling studies have been carried out to investigate and build the best tools to support the development of strategies to combat the crisis. The proposed approaches include epidemiological mathematical modeling, such as the SIRD model (Susceptible-Infected-Recovered-Deceased) [1]; autoregressive models, such as ARIMA (Autoregressive Integrated Moving Average) [2]; and deep learning models, such as recurrent neural networks of the LSTM (Long Short-Term Memory) type [3], GRUs (Gated Recurrent Units) [4], and bidirectional and convolutional LSTMs [5], among others.
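As a point of reference for the compartmental approach mentioned above, the sketch below integrates the standard SIRD equations with a one-day forward Euler step. It is a textbook formulation, not the model of reference [1]; the function name `simulate_sird`, the parameter values, and the population size are hypothetical choices for illustration.

```python
def simulate_sird(s0, i0, beta, gamma, mu, days):
    """Forward-Euler SIRD simulation with a one-day time step.

    beta: transmission rate, gamma: recovery rate, mu: death rate (all per day).
    Returns a list of (S, I, R, D) tuples, one per day, including day 0.
    """
    n = s0 + i0                              # total population, held constant
    s, i, r, d = float(s0), float(i0), 0.0, 0.0
    history = [(s, i, r, d)]
    for _ in range(days):
        new_infections = beta * s * i / n    # S -> I
        new_recoveries = gamma * i           # I -> R
        new_deaths = mu * i                  # I -> D
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        r += new_recoveries
        d += new_deaths
        history.append((s, i, r, d))
    return history

# Hypothetical parameters, chosen only to produce a visible epidemic curve.
trajectory = simulate_sird(s0=999_000, i0=1_000, beta=0.3, gamma=0.1, mu=0.01, days=60)
```

Because every flow leaves one compartment and enters another, the four compartments always sum to the initial population. Unlike the data-driven models discussed next, this approach requires the rates to be estimated or assumed beforehand.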
The literature review indicates that deep learning models based on LSTM and CNN networks are the most effective and robust, making more accurate predictions for a greater number of countries and over a longer predictive horizon than autoregressive and statistical methods [6, 7]. Arun Kumar et al. [6] carried out a comparative analysis of autoregressive and neural models for predicting trends in COVID-19. Citing several works with similar results, they concluded that the LSTM and GRU deep learning-based models outperformed the ARIMA and SARIMA statistical models for most countries.
However, both the accuracy of these models and the length of the forecast horizon need to improve before they can be adopted by public health agencies as a tool to support the management not only of the COVID-19 pandemic, but also of future ones. In this sense, Jamishid et al.
2022 [7], after reviewing 80 papers on methods for estimating and predicting the spread of
COVID-19 concluded not to recomm (...truncated)