Generation of prognostic interval estimates of water inflows to hydroelectric reservoirs using a multiparametric neural network

The article discusses the practical application of neural networks to hydropower and water management systems. Several neural network models are reviewed, together with their advantages and disadvantages for this subject area. The method and operation of a multiparametric neural network are described using practical examples, in particular the formation of interval estimates for the reservoir of a hydroelectric power station.

Earlier methods relied on identifying internal patterns by probabilistic and approximation methods, which gave fairly reliable predictive estimates before significant global climate change occurred. The developed multiparametric neural network (MNN), with many settings (the number of hidden layers and of neurons per layer, the types of sigmoid functions, etc.), makes it possible to take climate change into account by finding the predictors that correlate most significantly with the target series.

Classification of neural network models
There are many types of neural networks. They are divided by the type of training (supervised, unsupervised), the type of input information (analog or binary), the method of adjusting weights (fixed, dynamic) and the network model used. The implementation of the MNN was considered against several network models that differ in signal direction and mode of operation. Figure 1 shows neural networks grouped by the model used.
The single-layer perceptron is the simplest feedforward network. It is used for classification and approximation tasks and was the first neural network model, proposed by F. Rosenblatt [12]. Its advantages are the simplicity of the model and a fast learning algorithm. Its disadvantage is that it can solve only the simplest problems. The multilayer perceptron extends the single-layer perceptron and makes it possible to build more complex networks. Radial basis function networks (RBF networks) are a subtype of feedforward networks that use radial basis functions as activation functions [13].

Fig. 1 Neural network models
Competitive (generative adversarial) neural networks are a specific subtype of recurrent neural networks consisting of two NN models: a generative one and a discriminative one. The goal of the generative model is to produce a result as similar as possible to the original; the goal of the discriminative model is to distinguish the generative model's result from the original as effectively as possible [14]. This subtype is effective in the field of cybersecurity. One disadvantage is the need to set up two models and keep them balanced for effective training.
The Kohonen network uses unsupervised learning, and its training set consists only of input variable values. Because it has only two layers (input and output), this type of network is called a self-organizing map. The advantage of these networks is that they can operate in the presence of noise. The disadvantage is the limited application area: cluster analysis, and only when the number of clusters is known in advance [15].
The Hopfield network is an associative network based on analogies with the physics of dynamical systems. Each node serves as an input when data is received, becomes hidden during learning, and then becomes an output [16]. The advantage of these networks is a processing algorithm that makes it possible to escape local minima of the adaptive landscape of the state space.
Adaptive resonance theory (ART) models use unsupervised learning: they analyze significant input data, identify possible features and classify patterns in the input vector [17]. The advantage of these networks is that they can be trained without supervision. A disadvantage is the unbounded growth in the number of neurons during operation.

Methods and Methodology
The methodology for long-term forecasting of natural processes was developed by I.P. Druzhinin and A.P. Reznikov [18][19][20]. It identifies hidden patterns in accumulated statistics, which makes it possible to predict the long-term dynamics of natural processes. To account for global climate change, the MNN was developed; it forms prognostic estimates from different sets of predictors.
To synthesize an MNN for inflow prediction, a structure has been developed that includes many predictors (time series) for various indicators and different lead times, from one month to several years. Predictors are chosen by the significant correlations between the time series under study and the dynamics of other processes in different regions of the world, at varying lead times. One important type of predictor is the mean curl (swirl) index of the vector field of atmospheric circulation velocities, calculated for a given period and atmospheric layer over a rectangular region specified by its coordinates.
The ability to adjust the thresholds (positive and negative) above which an association counts as a potential predictor allows indicators to be selected more precisely for further use. The correlation result is used to judge whether the indicator in a given geographical square affects a particular time series (Fig. 2). A set of predictors is obtained individually for each time series. Using a single set of predictors for several time series causes prediction errors due to various kinds of noise in the operation of the MNN.
A feature of the MNN is that it can be tuned not only to specific values of the process under study but also to interval estimates, which reduces the effect of noise or errors in past records (for example, the error in determining the useful inflow into Lake Baikal can reach several percent even in modern conditions).
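The curl index described above can be illustrated with a short sketch. It assumes wind components `u` and `v` given on a regular latitude/longitude grid for a single atmospheric layer and period; the function name `mean_curl_index`, its arguments and the grid layout are hypothetical and are not taken from the MNN software. The vertical component of the curl is computed with the standard spherical formula and averaged over the rectangular region.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # mean Earth radius in metres


def mean_curl_index(u, v, lats, lons, lat_range, lon_range):
    """Mean vertical curl of a horizontal wind field over a lat/lon rectangle.

    u, v      : 2-D arrays shaped (len(lats), len(lons)), wind components in m/s
    lats, lons: 1-D coordinate arrays in degrees, ascending
    lat_range, lon_range: (min, max) bounds of the rectangle in degrees
    """
    lat_rad = np.deg2rad(lats)
    lon_rad = np.deg2rad(lons)
    cos_lat = np.cos(lat_rad)[:, None]

    # Spherical form: curl = (dv/dlon - d(u*cos(lat))/dlat) / (R*cos(lat))
    dv_dlon = np.gradient(v, lon_rad, axis=1)
    du_cos_dlat = np.gradient(u * cos_lat, lat_rad, axis=0)
    curl = (dv_dlon - du_cos_dlat) / (EARTH_RADIUS_M * cos_lat)

    # Average only over grid points inside the rectangle
    in_lat = (lats >= lat_range[0]) & (lats <= lat_range[1])
    in_lon = (lons >= lon_range[0]) & (lons <= lon_range[1])
    return float(curl[np.ix_(in_lat, in_lon)].mean())
```

Computing this index for each period and each candidate rectangle would give one time series per region, which can then be correlated with the target series.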
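The threshold-based selection can be sketched as follows. This is a minimal illustration, not the authors' LuaESI scripts: the function `select_predictors`, the default threshold values and the use of the Pearson correlation coefficient are assumptions made for the example. The lead time is expressed as a number of steps by which the candidate series precedes the target.

```python
import numpy as np


def select_predictors(target, candidates, lead=0,
                      pos_threshold=0.5, neg_threshold=-0.5):
    """Pick candidate series whose lagged correlation with the target is significant.

    target    : 1-D sequence, one value per time step (e.g. July inflow per year)
    candidates: dict mapping a name to a 1-D sequence aligned with target
    lead      : steps by which the candidate precedes the target (0 = same step)
    Returns {name: r}, ordered by decreasing absolute correlation.
    """
    target = np.asarray(target, dtype=float)
    selected = {}
    for name, series in candidates.items():
        series = np.asarray(series, dtype=float)
        if lead > 0:
            # Pair candidate value at step t with target value at step t + lead
            x, y = series[:-lead], target[lead:]
        else:
            x, y = series, target
        r = np.corrcoef(x, y)[0, 1]
        if r >= pos_threshold or r <= neg_threshold:
            selected[name] = float(r)
    return dict(sorted(selected.items(), key=lambda kv: -abs(kv[1])))
```

Running the selection separately for each target series matches the point made above that every time series gets its own predictor set.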

Fig. 2 Example of formation of regions with significant correlation coefficients for useful inflow to Lake Baikal and swirl indices for July with lead time equal to zero on the 500 hPa atmospheric layer
Among the many known models of neural networks (Fig. 1), the MNN was developed on the basis of the multilayer perceptron. This type of network has performed well in forecasting interval estimates for the hydroelectric reservoir. Combined with the portable technologies implemented at ESI SB RAS, it produces fairly detailed and easily interpreted predictive estimates.
The neural network is trained with supervision using error backpropagation and can include up to 10 hidden layers with many different types of neurons per layer. The main idea of the mechanism is to propagate error signals from the outputs of the network to its inputs, in the direction opposite to the forward propagation of signals in normal operation.
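The backpropagation mechanism can be illustrated with a small multilayer perceptron. The sketch below is not the MNN core, which is written in C: the sigmoid hidden layers, the softmax output over intervals, the cross-entropy loss, plain gradient descent and all function names are assumptions chosen to keep the example short.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_mlp(sizes, rng):
    """Create (weights, bias) pairs for consecutive layer sizes."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, X):
    """Return activations of every layer; the last entry holds interval probabilities."""
    acts = [X]
    for W, b in params[:-1]:
        acts.append(sigmoid(acts[-1] @ W + b))
    W, b = params[-1]
    logits = acts[-1] @ W + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    acts.append(p / p.sum(axis=1, keepdims=True))
    return acts


def cross_entropy(p, Y):
    return float(-np.mean(np.sum(Y * np.log(p + 1e-12), axis=1)))


def train_step(params, X, Y, lr):
    """One full-batch gradient step; the error signal flows from outputs to inputs."""
    acts = forward(params, X)
    delta = (acts[-1] - Y) / len(X)  # gradient of mean cross-entropy w.r.t. logits
    updated = []
    for i in range(len(params) - 1, -1, -1):
        W, b = params[i]
        grad_W = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Pass the error back through the weights and the sigmoid derivative
            delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
        updated.append((W - lr * grad_W, b - lr * grad_b))
    return updated[::-1]
```

Here `Y` is a one-hot matrix of interval labels, so each output row can be read as the probability of the sample falling into each interval.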

Software Description
The operation of the MNN (Fig. 3) begins with the formation of a set of potential predictors from a set of geoclimatic indicators, using an input correlation threshold. The set is formed by scripts written in LuaESI (the general-purpose Lua language with powerful libraries developed at ESI). Once the set of predictors has been processed, the MNN model is generated with the specified input parameters. The core of the network is implemented in C using various auxiliary libraries. The initialization, training, verification and prediction processes are exposed as API functions. The MNN outputs probability matrices for the training, verification and prognostic samples. The matrices are accumulated over various MNN settings and sets of predictors, and the final prognostic decision is then made on the basis of the accumulated variants of prognostic intervals.
An example of the interval representation of the investigated process is given in Figure 4. The parameters of the interval partition are the range of values close to the norm, expressed as a fraction of the standard deviation, and the number of intervals above and below the norm.
The multiparametric neural network is run from the Far Manager environment. To configure the MNN for forecasting a new process, a single configuration file (SCF) with the internal MNN settings must be created in the SciTE text editor. The MNN settings include the predicted time series, the number of MNN layers and of neurons per layer, the number of output prognostic intervals, the range of years for the training and verification samples, the coefficients of the activation function and of the interval partition function, and the set of input predictors. The MNN core parses the SCF into the variables it needs for operation. A prognostic estimate divided into intervals is converted into an output graph using the gnuplot plotting utility.
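The text does not specify how the accumulated probability matrices are turned into a final decision. One simple possibility, shown here purely as an assumption, is to average the matrices over the variants (optionally with weights) and take the most probable interval for each sample. The function name `combine_variants` is hypothetical.

```python
import numpy as np


def combine_variants(prob_matrices, weights=None):
    """Average per-variant probability matrices and pick the most probable interval.

    prob_matrices: list of arrays shaped (samples, intervals), one per MNN variant
    weights      : optional per-variant weights (assumed, e.g. verification quality)
    Returns (mean_probabilities, chosen_interval_index_per_sample).
    """
    stacked = np.stack(prob_matrices)  # (variants, samples, intervals)
    mean = np.average(stacked, axis=0, weights=weights)
    return mean, mean.argmax(axis=1)
```

Other aggregation rules, such as voting or keeping only the variants with the smallest verification error, would fit the same interface.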

Application Examples
As an example of applying the MNN, predictive estimates of the water content of the hydroelectric reservoir for July, August and September 2021 are presented. For each calculation period, a set of predictors is compiled by selecting the swirl indices with the largest absolute correlation with the target time series. The set can contain from several tens to several hundred elements. A very large set of predictors does not guarantee an accurate prediction, but it can reduce the error on the verification and prognostic samples. For each calculation period, the MNN is fine-tuned empirically (setting the number of layers, neurons per layer and output intervals, determining the optimal training and verification samples, determining the value of the activation function, the interval displacement coefficient, etc.). The aim is to obtain the best results on the verification sample.
When long-term estimates of water inflows are generated using the MNN, a minimal error is easily achieved on the training sample. On verification samples, errors are usually much larger. For practical purposes, the possible range of the parameter can be divided into as few as three intervals. With five intervals the usual labels are: 2, extremely high; 1, increased; 0, medium; -1, reduced; -2, extremely low. As a rule, MNN training results with errors of no more than one interval on the verification samples are considered acceptable. Splitting into more intervals gives a more precise forecast estimate, but also requires more careful adjustment of the interval boundaries. In the examples given, the average monthly inflow time series is divided into 5, 7 and 9 intervals (Fig. 4, a, b, c). Green indicates the zone of inflow values close to normal. The blue line shows the actual values, and "+" marks the average values obtained by the MNN as a result of training (from 1980 to 2011). Diamonds and rectangles mark the values produced by the trained MNN on the verification sample; diamonds mark values with a probability below 0.8. The results in this example show an error of no more than one interval on the verification sample, which indicates good training. For July 2021, the most likely outcome is a reservoir water content close to normal. More precise estimates are obtained by increasing the number of output intervals.
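The interval labelling and the one-interval acceptance rule can be sketched as follows. The exact partition used by the MNN is not given in the text, so the scheme below is an assumption: the normal band is the mean plus or minus `norm_frac` standard deviations, every further interval is `step_frac` standard deviations wide, and the outermost intervals are open-ended. All names are hypothetical.

```python
import numpy as np


def interval_labels(values, mean, std, norm_frac=0.5, n_side=2, step_frac=1.0):
    """Map values to integer interval labels from -n_side to +n_side (0 = near normal)."""
    z = (np.asarray(values, dtype=float) - mean) / std
    # Distance beyond the normal band, in standard deviations
    excess = np.maximum(np.abs(z) - norm_frac, 0.0)
    # Count how many interval widths lie beyond the band, capped at the outer interval
    k = np.minimum(np.ceil(excess / step_frac), n_side)
    return (np.sign(z) * k).astype(int)


def max_interval_error(predicted, actual):
    """Largest absolute difference between predicted and actual interval labels."""
    diff = np.abs(np.asarray(predicted) - np.asarray(actual))
    return int(diff.max())
```

With `n_side=2` this gives the five labels listed above; `n_side=3` and `n_side=4` give 7 and 9 intervals. Under the rule of thumb stated in the text, a verification result would be considered acceptable when `max_interval_error` returns at most 1.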

Conclusion
The presented MNN methodology makes it possible to generate prognostic intervals of water inflows to hydroelectric reservoirs at lead times from one month to several years, provided that a representative set of predictors is found in the database of geoclimatic indicators of the state of the atmosphere.