Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks

PLOS ONE, Apr 2023

This work compares deep learning models designed to predict the daily number of cases and deaths caused by COVID-19 in 183 countries, using daily time series together with a feature augmentation strategy based on the Discrete Wavelet Transform (DWT). Two deep learning architectures were compared on two feature sets, with and without DWT: (1) a homogeneous architecture containing multiple LSTM (Long Short-Term Memory) layers and (2) a hybrid architecture combining multiple CNN (Convolutional Neural Network) layers and multiple LSTM layers. Four deep learning models were therefore evaluated: (1) LSTM, (2) CNN + LSTM, (3) DWT + LSTM and (4) DWT + CNN + LSTM. Their performance was quantitatively assessed using Mean Absolute Error (MAE), Normalized Mean Squared Error (NMSE), Pearson R, and Factor of 2. The models were designed to predict the daily evolution of the two main epidemic variables up to 30 days ahead. After hyperparameter fine-tuning of each model, the results show a statistically significant difference between the models' performances for the prediction of both deaths and confirmed cases (p-value<0.001). Based on NMSE values, significant differences were observed between LSTM and CNN+LSTM, indicating that adding convolutional layers to LSTM networks made the model more accurate. Using wavelet coefficients as additional features (DWT+CNN+LSTM) achieved results equivalent to those of the CNN+LSTM model, which demonstrates the potential of wavelets for optimizing models, since this allows training with a smaller amount of time series data.
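To make the feature augmentation and the four evaluation metrics named above more concrete, the following is a minimal Python sketch, not the authors' implementation. Several choices are assumptions that this excerpt does not confirm: a single-level Haar wavelet is used for the DWT, NMSE is normalized by the product of the observed and predicted means, and Factor of 2 is taken as the fraction of predictions lying between half and twice the observed value. All function names and the sample numbers are hypothetical.

```python
import math


def haar_dwt(series):
    """One level of the Haar DWT; returns (approximation, detail) coefficient lists."""
    values = list(series)
    if len(values) % 2:
        values.append(values[-1])  # assumption: pad odd-length input by repeating the last value
    root2 = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / root2 for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / root2 for i in range(0, len(values), 2)]
    return approx, detail


def mae(obs, pred):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nmse(obs, pred):
    """Normalized Mean Squared Error (assumed convention: divide by mean(obs) * mean(pred))."""
    mean_sq = sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    return mean_sq / (mean_obs * mean_pred)


def pearson_r(obs, pred):
    """Pearson correlation coefficient; undefined if either series is constant."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov / math.sqrt(var_obs * var_pred)


def fac2(obs, pred):
    """Factor of 2 (assumed convention): share of points with 0.5 * obs <= pred <= 2 * obs."""
    hits = sum(1 for o, p in zip(obs, pred) if 0.5 * o <= p <= 2.0 * o)
    return hits / len(obs)


if __name__ == "__main__":
    # Hypothetical daily counts, for illustration only.
    observed = [10, 20, 30, 40]
    predicted = [12, 18, 33, 39]
    approx, detail = haar_dwt(observed)
    print("approximation:", approx)
    print("detail:", detail)
    print("MAE:", mae(observed, predicted))
    print("NMSE:", nmse(observed, predicted))
    print("Pearson R:", pearson_r(observed, predicted))
    print("FAC2:", fac2(observed, predicted))
```

In a pipeline of the kind described, the approximation and detail coefficients would be supplied to the network as extra input features alongside the raw series. How the coefficients are aligned with the daily samples, and which wavelet family and decomposition level are used, are design decisions not specified in this excerpt.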


PLOS ONE, RESEARCH ARTICLE, OPEN ACCESS

Authors: Erick Giovani Sperandio Nascimento 1,2*, Júnia Ortiz 1, Adhvan Novais Furtado 1, Diego Frias 3

Affiliations: 1 Manufacturing and Technology Integrated Campus–SENAI CIMATEC, Salvador, Bahia, Brazil; 2 Department of Electrical & Electronic Engineering, Surrey Institute for People-Centred AI, Faculty of Engineering and Physical Sciences, University of Surrey, Guildford, United Kingdom; 3 State University of Bahia–UNEB, Salvador, Bahia, Brazil

Citation: Sperandio Nascimento EG, Ortiz J, Furtado AN, Frias D (2023) Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks. PLoS ONE 18(4): e0282621. https://doi.org/10.1371/journal.pone.0282621

Editor: Muhammad Fazal Ijaz, Sejong University, REPUBLIC OF KOREA

Received: November 9, 2022. Accepted: February 20, 2023. Published: April 6, 2023

Copyright: © 2023 Sperandio Nascimento et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: Time series of daily new cases and deaths were retrieved from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University: https://coronavirus.jhu.edu/map.html. The model developed in this work is available on GitHub: https://github.com/CRIACIMATEC/covid-19/tree/master/ts_models_covid19. The corresponding test dataset is also available at the same address.
Funding: This research was partially funded by ABDI, SENAI, EMBRAPII and REPSOL SINOPEC BRASIL under grant “Missão contra a COVID-19 do Edital de Inovação para a Indústria”, and by Bahia State Research Support Foundation (FAPESB) under grant CNV0002/2015. In addition, we gratefully acknowledge the support of SENAI CIMATEC Reference Center on Artificial Intelligence and SENAI CIMATEC/NVIDIA AI Joint Lab for the scientific and technical support, and the SENAI CIMATEC Supercomputing Center for Industry Innovation for granting access to the necessary hardware and technical support. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Facing crises such as the COVID-19 pandemic requires constant monitoring of the social and epidemiological variables involved. The COVID-19 pandemic has changed the daily lives of the entire world population, which has been threatened by an easily transmitted virus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Several strategies must be adopted to hinder the proliferation of the virus, and these need to be based on correct and scientifically proven information, which benefits greatly from prior knowledge of possible future scenarios. Projecting the daily, weekly, or monthly contagion rate into the future supports decision-making by managers who need to formulate mitigation policies, which is important in the face of changes in the spreading behavior of the virus, as seen with the Delta and Omicron variants. Since the pandemic began, several predictive modeling studies have been carried out to investigate and build the best tools to support the development of strategies to combat the crisis.
The proposed approaches include epidemiological mathematical modeling, such as the SIRD (Susceptible-Infected-Recovered-Deceased) model [1]; autoregressive models, such as ARIMA (Autoregressive Integrated Moving Average) [2]; and deep learning models, such as recurrent neural networks of the LSTM (Long Short-Term Memory) type [3], GRUs (Gated Recurrent Units) [4], and bidirectional and convolutional LSTMs [5], among others. The literature review indicates that deep learning models based on LSTM and CNN networks are the most effective and robust, giving more accurate predictions for a greater number of countries and over a longer forecast horizon than autoregressive and statistical methods [6, 7]. Arun Kumar et al. [6] carried out a comparative analysis of autoregressive and neural models for predicting trends in COVID-19. Citing several works with similar results, they concluded that the LSTM and GRU deep learning models outperformed the ARIMA and SARIMA statistical models for most countries. However, the accuracy of these models, as well as the length of the forecast horizon, needs to improve so that they can be adopted by public health agencies as a tool to support the management not only of the COVID-19 pandemic but also of future ones. In this sense, Jamishid et al. 2022 [7], after reviewing 80 papers on methods for estimating and predicting the spread of COVID-19, concluded not to recomm (...truncated)


Article home page: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0282621

Erick Giovani Sperandio Nascimento, Júnia Ortiz, Adhvan Novais Furtado, Diego Frias. Using discrete wavelet transform for optimizing COVID-19 new cases and deaths prediction worldwide with deep neural networks, PLOS ONE, 2023, Volume 18, Issue 4, DOI: 10.1371/journal.pone.0282621