Streamflow modelling and forecasting for Canadian watersheds using LSTM networks with attention mechanism

Neural Computing and Applications
https://doi.org/10.1007/s00521-022-07523-8

ORIGINAL ARTICLE

Lakshika Girihagama1 • Muhammad Naveed Khaliq1 • René Roy3 • Laxmi Sushama4 • Amin Elshorbagy5 • Philippe Lamontagne1 • John Perdikaris2

Received: 16 December 2021 / Accepted: 7 June 2022
© Crown 2022

Abstract

This study investigates the capability of sequence-to-sequence machine learning (ML) architectures in an effort to develop streamflow forecasting tools for Canadian watersheds. Such tools are useful to inform local and region-specific water management and flood forecasting related activities. Two powerful deep-learning variants of the Recurrent Neural Network were investigated, namely the standard and attention-based encoder-decoder long short-term memory (LSTM) models. Both models were forced with past hydro-meteorological states and daily meteorological data with a look-back time window of several days. These models were tested for 10 different watersheds from the Ottawa River watershed, located within the Great Lakes Saint-Lawrence region of Canada, an economic powerhouse of the country. The results of training and testing phases suggest that both models are able to simulate overall hydrograph patterns well when compared to observational records. Between the two models, the attention model significantly outperforms the standard model in all watersheds, suggesting the importance and usefulness of the attention mechanism in ML architectures, not well explored for hydrological applications. The mean performance accuracy of the attention model on unseen data, when assessed in terms of mean Nash–Sutcliffe Efficiency and Kling–Gupta Efficiency, is, respectively, found to be 0.985 and 0.954 for these watersheds.
Streamflow forecasts with lead times of up to 5 days with the attention model demonstrate overall skillful performance, well above the benchmark accuracy of 70%. The results of the study suggest that the encoder–decoder LSTM, with attention mechanism, is a powerful modelling choice for developing streamflow forecasting systems for Canadian watersheds.

Keywords: Streamflow forecasting • LSTM • Encoder-decoder architecture • Attention-based models • Deep learning

Correspondence: Muhammad Naveed Khaliq
1 National Research Council Canada, Ottawa, ON, Canada
2 Ontario Power Generation, Niagara Falls, ON, Canada
3 Hydro Météo, Notre-Dame-des-Prairies, QC, Canada
4 McGill University, Montreal, QC, Canada
5 University of Saskatchewan, Saskatoon, SK, Canada

1 Introduction

Improved streamflow forecasting capability is important for water management related activities, informing hydropower generation operations, flood risk management and operational decision-making at local and regional scales. Streamflow is the integrated result of highly nonlinear physical processes that operate at multiple temporal and spatial scales within a watershed. Traditionally, streamflow forecasting is accomplished using process-based hydrological models. These models can range from simple conceptual lumped models to complex physically based distributed models. Conceptual lumped models are based on mathematical formulations of the physical processes involved in runoff generation at the watershed scale (e.g., Streamflow Synthesis and Reservoir Regulation model [1] and Soil and Water Assessment Tool [2]). These models are considerably simplified based on reasonable assumptions, and they do not capture the spatial variability of physical processes occurring within a watershed.
On the other hand, physically based distributed models can capture to some extent the spatial variability of the nonlinear physical processes occurring within a watershed (e.g., MIKE SHE model [3], WATFLOOD model [4–6], Variable Infiltration Capacity model [7], and MESH model [8]). The precise way the process variabilities are handled in mathematical formulations can vary significantly from one model to another. Although process-based models produce deterministic and plausible results in many instances, uncertainty in parameterization and process scaling deficiencies are some of the issues that degrade their performance [9]. Undoubtedly, these models have shown great value in forecasting streamflow in many watersheds in different parts of the world, including Canada [10–14]. Though calibration and testing of a process-based model for a given watershed can be achieved in great detail and depth, transferring the same model to other watersheds can compromise its performance. This is due to the difficulty of formulating scale-dependent parameterizations of watershed-relevant physical processes [15, 16], which in turn impacts the model's generalization ability.

With the growing availability of large amounts of spatial and temporal data from remote sensing and numerical weather prediction models (e.g., remotely sensed land use data, reanalysis products and real-time meteorological forecasts) and recent advances in computational power, Machine Learning (ML) methods can also offer powerful modelling options for developing data-driven streamflow forecasting systems with generalization abilities. This is due to their ability to extract complex dynamical nonlinearities without explicitly defining the scale-relevant physical processes, as in the case of the hydrological models discussed above. Hence, explicit definitions of governing equations are not needed for these models.
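To make the data-driven framing concrete: a model of this kind is fitted on input-output pairs rather than on governing equations. The sketch below builds such pairs from daily series using a look-back window and a multi-day forecast horizon, in the spirit of the setup described in the abstract. The function name `make_windows`, the array shapes, the 7-day look-back, the inclusion of past flow as an input channel, and the synthetic data are all illustrative assumptions, not the authors' preprocessing.

```python
import numpy as np

def make_windows(features, flow, look_back=7, horizon=5):
    """Build (X, y) pairs for sequence-to-sequence training (hypothetical helper).

    features: array of shape (n_days, n_features), daily meteorological forcings
    flow:     array of shape (n_days,), daily streamflow
    Returns X of shape (n_samples, look_back, n_features + 1)
    and y of shape (n_samples, horizon).
    """
    features = np.asarray(features, dtype=float)
    flow = np.asarray(flow, dtype=float)
    if features.shape[0] != flow.shape[0]:
        raise ValueError("features and flow must cover the same days")
    # Assumption: past flow is one more input channel alongside the forcings.
    inputs = np.column_stack([features, flow])
    n_samples = len(flow) - look_back - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for this look_back and horizon")
    # Each sample pairs `look_back` past days with the next `horizon` days of flow.
    X = np.stack([inputs[i:i + look_back] for i in range(n_samples)])
    y = np.stack([flow[i + look_back:i + look_back + horizon]
                  for i in range(n_samples)])
    return X, y

# Synthetic example: 30 days, two forcing variables, flow equal to the day index.
rng = np.random.default_rng(0)
n_days = 30
forcings = rng.normal(size=(n_days, 2))
flow = np.arange(n_days, dtype=float)

X, y = make_windows(forcings, flow, look_back=7, horizon=5)
print(X.shape, y.shape)  # prints (19, 7, 3) (19, 5)
```

Each row of `X` would feed an encoder and the matching row of `y` would be the decoder target; the horizon of 5 mirrors the lead times of up to 5 days mentioned in the abstract, while the look-back length is a free choice here.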
Instead, these models map a multivariate input space to an output space. Data-driven methods can be categorized as time series (statistical) methods and ML approaches. The statistical models simply derive the relationship between variables to formalize understanding and evaluation of a hypothesis about the system's behaviour [17]. The common statistical methods under this category include autoregressive moving average models [18], autoregressive integrated moving average models [19–21], and many other variants of these time series models. In these deductive methods, the streamflow observations are assumed to be stochastic sequences and hence, future streamflow can be predicted by learning from past observations [22]. Although the availability of very long records of observations is crucial for accurate prediction of future streamflow, the applicability of these models to real-time forecasting situations remains limited due to a lack of generalization ability and cascading uncertainty in parameterizations [22]. ML models, on the other hand, have been shown to overcome some of the drawbacks associated with process-based and statistical modelling approaches. These inductive models are developed based on data, are able to extract nonlinear structures from data, and can readily learn from inter-variable interactions. Some perspectives (...truncated)