Research on drought prediction model of LSTM with elevation of water

Drought is one of the most widespread and damaging natural phenomena in the world, and droughts have been increasing worldwide in recent years. A drought is a persistent shortage of water caused by an imbalance between water supply and demand. The shortage can manifest as insufficient precipitation, a lack of soil moisture, or low water elevation in rivers and lakes. In this paper, based on the recent drought period and the water elevation data of Lake Mead, a drought prediction model built on a long short-term memory (LSTM) neural network is established to predict the water elevation of Lake Mead in 2025, 2030, and 2050 and, from it, the corresponding drought conditions. The results show that the LSTM-based drought prediction model achieves high accuracy on the data set.


Introduction
According to statistics, the average global temperature has increased by 0.85°C over the past century, and global warming has led to a gradual increase in the area, intensity and frequency of droughts. Drought has become a major natural disaster in China [1]. However, it is difficult to directly quantify the conditions, intensity and degree of drought occurrence. Researchers have therefore constructed drought indices that describe and evaluate drought conditions based on the characteristics of drought occurrence and measurable empirical data.
Drought is a complex systemic problem, and many factors influence its occurrence. It is very important to accurately identify the driving factors that influence drought. Currently, the methods for analysing the relationships between factors and variables include empirical methods, linear correlation methods, principal component analysis and grey correlation analysis [2][3][4]. However, empirical methods are highly subjective and lack an objective basis; linear correlation tests cannot measure non-linear relationships; principal component analysis leads to information loss, so the extracted principal components cannot be clearly interpreted; and grey correlation analysis lacks clear evaluation indicators.
In theory, a recurrent neural network (RNN) can handle sequences of any length, but in practice, because of vanishing and exploding gradients, an RNN can only recall the most recent steps. The LSTM [5] modifies the hidden structure by adding a "conveyor belt" cell state together with "gates" that select information: they filter out past states, retain the information that has a large impact on the current state, and control how it is passed on. Controlling the flow of remembered information in this way offers a good solution to the long-term information storage and memory problem of RNNs [6]. Therefore, LSTM is more suitable for the predictive analysis of long-term sequences.
The definition of a drought period varies slightly from country to country and region to region in the literature. A common approach is to classify drought periods as mild, moderate or severe based on the number of consecutive days. Linsley et al. (1975) define a hydrological drought as "a period of time during which river runoff is unable to meet water supply needs under a given water management system". A hydrological drought is considered to have occurred if flows remain below a certain threshold over a period of time. The choice of threshold can be based on recent high or low water levels [7].
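The threshold-based definition above can be illustrated with a short sketch that finds every run of consecutive observations below a given threshold. The function name `drought_periods` and the `min_length` parameter are hypothetical and introduced only for illustration; they do not come from the cited references.

```python
def drought_periods(levels, threshold, min_length=1):
    """Return (start_index, end_index) pairs for runs of values below threshold.

    `min_length` (an assumed parameter) discards runs shorter than the
    given number of consecutive observations.
    """
    periods = []
    start = None  # index where the current below-threshold run began
    for i, value in enumerate(levels):
        if value < threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at the previous index.
            if i - start >= min_length:
                periods.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the series.
    if start is not None and len(levels) - start >= min_length:
        periods.append((start, len(levels) - 1))
    return periods
```

Applied to a yearly series, each returned pair marks the first and last index of one candidate drought period.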

Method
In the literature, the water level that defines a drought period ranges from 60% to 95% of the highest water level, so the elevation of water can be used to predict dry periods. First, the LSTM method is used to predict the change in elevation; second, the dry-period pattern is defined and predictions are made for future dry periods. The framework of this paper is shown in Figure 1.
Within the LSTM cell, the forget gate first applies a sigmoid function to the previous output combined with the current input to control how much of the previous cell state is retained. The input gate then determines which new information is added to the cell, and a new candidate cell state vector is obtained; the cell state at the current time t is obtained by combining the old state with the new candidate. Finally, the output gate produces the output at the current moment from the previous output, the current input and the current cell state.

Defining the drought period
As described in the model above, the water-elevation criterion for the dry period is determined over a fixed time span. Taking the 10 years from 2011 to 2020 as the reference, the iterative threshold segmentation method is used to adaptively determine the water elevation threshold for the drought period. The analysis of the past decade gives a threshold of 1099 feet: the water elevation during the dry period is below this level. The schematic diagram of the drought period in the past ten years is shown in the figure below. The most recent drought has lasted from the end of 2013 to the present. Because the problem is treated in units of whole years, the most recent dry period is taken as 2014-2020.
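The paper does not spell out the steps of the iterative threshold segmentation method. A minimal sketch of one common variant is given below, under the assumption that the threshold starts at the overall mean and is repeatedly replaced by the midpoint of the two group means until it stops changing. The function name `iterative_threshold` and the parameters `tol` and `max_iter` are hypothetical.

```python
def iterative_threshold(values, tol=1e-6, max_iter=100):
    """Iteratively split `values` into a low and a high group.

    Assumed procedure: start from the overall mean, then move the
    threshold to the midpoint of the two group means until the change
    is smaller than `tol`.
    """
    values = [float(v) for v in values]
    threshold = sum(values) / len(values)
    for _ in range(max_iter):
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not low or not high:
            break  # all values fall on one side; nothing left to split
        new_threshold = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        converged = abs(new_threshold - threshold) < tol
        threshold = new_threshold
        if converged:
            break
    return threshold
```

Under this assumption, elevations below the returned threshold would be labelled as dry-period values and the rest as normal values.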

Results and discussion
To evaluate the effectiveness of this method, the elevation of water at Lake Mead (in feet above sea level) at the end of each month from 1971 to 2021 is used; the data are available at https://www.usbr.gov/lc/riverops.html. Because the LSTM takes the annual average elevation of water as input, the monthly data were averaged by year. The annual average data are shown in Table 1.
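The monthly-to-annual preprocessing can be sketched as follows. The record layout `(year, month, elevation_ft)` and the function name `annual_means` are assumptions made for illustration; the format of the source file is not specified in the paper.

```python
from collections import defaultdict

def annual_means(monthly_records):
    """Average monthly elevations by year.

    `monthly_records` is an iterable of (year, month, elevation_ft)
    tuples (an assumed layout). Returns {year: mean elevation},
    ordered by year.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, _month, elevation in monthly_records:
        sums[year] += elevation
        counts[year] += 1
    return {year: sums[year] / counts[year] for year in sorted(sums)}
```

Years with missing months are averaged over the months that are present, which may or may not match the treatment used for Table 1.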
Table 1. The elevation of water at Lake Mead from 1971 to 2021.
The fitting result of the elevation of water for Lake Mead is shown in Fig. 4. The performance of prediction models often differs from one data set to another. Therefore, this article selects the most suitable model based on historical data and then uses that model to make predictions. For Lake Mead, LSTM has the best performance in predicting the elevation of water. The model is then trained on the elevation of water data from 2005 to 2020, and the predictions from 2021 onward are as follows.
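The paper does not state how the annual series is fed to the LSTM or how multi-year forecasts are produced. A minimal sketch of one common setup follows, assuming a fixed-length sliding window for the training pairs and a recursive forecast in which each prediction is fed back as input. The names `make_windows`, `recursive_forecast` and `predict_next` are hypothetical; `predict_next` stands in for a trained model.

```python
def make_windows(series, window):
    """Turn a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(list(series[i:i + window]))
        targets.append(series[i + window])
    return inputs, targets

def recursive_forecast(predict_next, history, window, steps):
    """Forecast `steps` values ahead, feeding each prediction back in.

    `predict_next` is any callable mapping a window (list of floats)
    to the next value; here it is a placeholder for a trained LSTM.
    """
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(steps):
        next_value = predict_next(buffer)
        forecasts.append(next_value)
        buffer = buffer[1:] + [next_value]  # slide the window forward
    return forecasts
```

With recursive forecasting, errors accumulate as the horizon grows, so long-range values such as those for 2050 should be read with more caution than near-term ones.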

2.1 LSTM
The LSTM model is a variant of the RNN proposed by Hochreiter and Schmidhuber; it solves the problem of forgetting long-term sequence information in RNNs by constructing a memory storage unit. The LSTM relies mainly on three gates, the forget gate, the input gate and the output gate, each of which plays a specific role. With fixed model parameters, the outputs of the neural units can change dynamically from one time step to the next, which alleviates the problem of vanishing or exploding gradients. The schematic diagram of the neural structure of the LSTM model is shown in Figure 2.
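The gate computations of an LSTM cell are commonly written as follows. This is the standard textbook formulation, given here as an assumption about the notation; $\sigma$ denotes the sigmoid function, $\odot$ element-wise multiplication, and $[h_{t-1}, x_t]$ the concatenation of the previous output and the current input.

```latex
\begin{align}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(C_t)
\end{align}
```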
Here, $x_t$ represents the input of the cell at time step $t$; $f_t$ represents the activation value of the forget gate at time step $t$; $i_t$ represents the activation value of the input gate at time step $t$; $\tilde{C}_t$ represents the candidate cell state at time step $t$; $o_t$ represents the activation value of the output gate at time step $t$; $C_{t-1}$ and $C_t$ represent the cell state at time steps $t-1$ and $t$; $h_{t-1}$ and $h_t$ represent the output of the cell at time steps $t-1$ and $t$; $W_f$, $W_i$, $W_C$ and $W_o$ respectively represent the weight matrices of the forget gate, input gate, cell state and output gate; and $b_f$, $b_i$, $b_C$ and $b_o$ respectively represent the bias terms of the forget gate, input gate, cell state and output gate.
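To make the gate structure concrete, the following is a minimal sketch of one forward step of a standard LSTM cell in plain Python. It assumes the usual formulation in which each gate acts on the concatenation of the previous output and the current input; the function names and the `params` dictionary layout are hypothetical, and the sketch is not the implementation used in the paper.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def affine(weights, bias, vector):
    """Compute W @ vector + b for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; returns the new output h_t and cell state c_t.

    `params` (assumed layout) maps "W_f", "W_i", "W_C", "W_o" to weight
    matrices and "b_f", "b_i", "b_C", "b_o" to bias vectors.
    """
    z = list(h_prev) + list(x_t)  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]
    i_t = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]
    c_hat = [math.tanh(a) for a in affine(params["W_C"], params["b_C"], z)]
    o_t = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]
    # Keep part of the old state and add part of the candidate state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

In practice a deep learning library would supply this cell together with training; the sketch only shows how the three gates combine the previous state with the new input.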

Fig. 4. Drought period in the past 10 years.

The mean-square error obtained is shown in the following table.
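For reference, the mean-square error between observed and predicted elevations can be computed as in the sketch below. The function name `mean_squared_error` is chosen for illustration, and no values from the paper are reproduced.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

Because the error is squared, the result is in squared feet when the elevations are given in feet.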

Fig. 5. Prediction of the annual average elevation of water for the next thirty years.