Creating a neural network system for forecasting and managing agricultural production using autocorrelation functions of time series

The article considers the features of creating an artificial neural network (ANN) for modelling and forecasting the dynamics of long-term time series (TS) of grain yield levels in arid conditions, using the Lower Volga region of the Russian Federation as an example. To better justify the choice of architecture and macroparameters of the ANN being developed, the statistical characteristics of the simulated TS were analysed. The autocorrelation function of the long-term series of grain yield levels was constructed. It is proposed to take into account the time lags of the autocorrelation function when selecting the ANN macroparameters for predicting yield TS. On the basis of the preliminary statistical analysis, "peaks" of the autocorrelation function are identified at time lags whose values are determined for different groups of grain crops. These values are recommended for use when selecting the time window parameter in neural network models of yield. This is the basis of the proposed information technology for building ANNs for predicting crop yields. The results of neural network modelling and yield forecasting can be used for managing agricultural production, including in the arid conditions of the Lower Volga region of the Russian Federation.


Introduction
Agricultural production depends strongly on hydrothermal conditions, especially where moisture is insufficient, as is typical of the Lower Volga region of Russia. Accurate crop yield forecasts are important for managing agricultural production. There are various methods for predicting yield TS [1, 5, 6, 7, 10], but their accuracy is insufficient.
Cultivation of agricultural crops under insufficient water supply is characterized by a high coefficient of variation of the time series (TS) of yield levels, reaching 0.3 or more for the main grain crops, which limits the use of factor trend-seasonal models. This feature of the yield TS causes a methodological error in assessing the risks associated with planning and forecasting agricultural production based on various methods of nonlinear dynamics [1]. The use of modern methods of nonlinear dynamics, in particular neural networks, makes it possible to increase the adequacy of the resulting models.
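The coefficient of variation mentioned above is the ratio of the standard deviation of the yield levels to their mean. A minimal sketch follows; the function name, the use of the population standard deviation, and the sample values are assumptions made for illustration and are not taken from the Volgogradstat data.

```python
from statistics import mean, pstdev

def coefficient_of_variation(levels):
    """Return the standard deviation of the levels divided by their mean."""
    m = mean(levels)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    # Population standard deviation is an assumption; a sample estimate
    # (statistics.stdev) would give a slightly larger value.
    return pstdev(levels) / m

# Hypothetical yield levels (t/ha), for illustration only.
yields = [0.8, 1.6, 1.1, 2.0, 0.9, 1.5, 0.7, 1.8]
cv = coefficient_of_variation(yields)
print(f"CV = {cv:.2f}")
```

A value above 0.3, as in this illustrative series, corresponds to the high variability described in the text.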
Methods for modeling complex socio-economic systems based on neural network technologies are becoming more common owing to their versatility and the availability of specialized software, among which one of the most convenient packages is STATISTICA Neural Networks (SNN). An important advantage of SNN is its close integration with the statistical analysis program STATISTICA, which allows developing hybrid technologies that combine preliminary statistical analysis and neural network data modeling, using a wide range of built-in tools [3].
Neural network modeling technologies are based on the construction of artificial neural networks that describe the dynamics of the simulated nonlinear systems and model the processes of interannual yield fluctuations. Modern approaches include artificial intelligence (AI) methods such as deep machine learning, cognitive modelling (construction and parameterization of fuzzy cognitive maps), and modelling based on artificial neural networks (ANN) [3, 8, 13].
The main objective of the study is to increase the reliability of neural network modelling and, accordingly, of forecasting yield TS levels by taking into account the internal patterns of yield dynamics in previous years, including through autocorrelation analysis of long-term yield TS.

Research methods
The study used yield TS for grain crops as a whole and for individual cereal crops, taken from the data of the Federal State Statistics Service of Russia for the Volgograd region (Volgogradstat) for the period 1950-2018; a fragment of the "grain crops" TS is presented in Table 1. First, the null hypothesis that the empirical distribution of yield corresponds to the normal law was tested using two different statistical criteria (Pearson's chi-square and Kolmogorov-Smirnov). Artificial neural networks were built using the STATISTICA Neural Networks (SNN) application package, the advantage of which is integration with the statistical analysis program STATISTICA. This approach simplifies the development of hybrid technologies that combine preliminary statistical analysis and neural network modeling of the studied time series.

Pre-forecast statistical analysis of time series
For the yield sample of "grain crops as a whole", the calculated value of the Pearson criterion exceeded 51, which is significantly higher than the critical value of 9.5 taken from the table for a significance level α = 0.05 and 4 degrees of freedom. Analysis of the data presented in Table 1 showed that the empirical distribution of long-term yield levels for grain crops differs statistically significantly from the normal distribution under both criteria used. During the research, the distribution patterns of long-term yield levels were also determined for other crops of the grain group (Fig. 1).
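The comparison of a calculated Pearson statistic with the tabulated critical value can be sketched as follows. This is an illustration under stated assumptions, not the procedure built into STATISTICA: the choice of seven equiprobable bins (which gives 7 - 1 - 2 = 4 degrees of freedom once the mean and standard deviation are estimated), the function name, and the deliberately non-normal sample are all hypothetical.

```python
from statistics import NormalDist, mean, stdev

CRITICAL_005_DF4 = 9.488  # tabulated chi-square value for alpha = 0.05, 4 d.o.f.

def chi_square_normality(sample, n_bins=7):
    """Pearson chi-square statistic against a normal law fitted to the sample.

    Bins are equiprobable under the fitted normal distribution, so every
    bin has the same expected count len(sample) / n_bins.
    """
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    # Interior bin edges at the quantiles 1/n_bins, 2/n_bins, ...
    edges = [fitted.inv_cdf(i / n_bins) for i in range(1, n_bins)]
    observed = [0] * n_bins
    for x in sample:
        idx = sum(1 for e in edges if x > e)  # number of edges below x = bin index
        observed[idx] += 1
    expected = n / n_bins
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    dof = n_bins - 1 - 2  # two parameters (mean, std) estimated from the data
    return statistic, dof

# Hypothetical, deliberately bimodal sample; not real yield data.
sample = [1.0] * 30 + [3.0] * 30
statistic, dof = chi_square_normality(sample)
reject_normality = statistic > CRITICAL_005_DF4
print(f"chi2 = {statistic:.1f}, d.o.f. = {dof}, reject H0: {reject_normality}")
```

When the statistic exceeds the critical value, as reported in the text for grain crops as a whole, the hypothesis of normality is rejected at the 5% level.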
To justify the architecture and macroparameters of the mathematical models of grain yield in arid conditions, which must be taken into account when constructing ANNs, and to increase the reliability of the resulting forecasts, an autocorrelation analysis of the studied TS was performed.
The results of the autocorrelation analysis for the group "grain crops as a whole" are shown in Fig. 2.
The autocorrelation functions of the yield TS obtained for different groups of grain crops show both the presence of significant cyclic components and significant differences in their statistical characteristics. Accordingly, the subsequent ANN simulation of the yield TS of each of these crops must take the obtained statistical characteristics into account.
The autocorrelation function diagrams showed statistically significant peaks at lags of one, two, three, four, and twelve years (Fig. 2). The most pronounced "peaks" are observed at lags of three and twelve years. A possible explanation of the 12-year cycle is a superposition of two- and three-year cycles. Thus, the cyclicity of the grain yield TS, confirmed for the years studied, is an attribute of the endogenous dynamics of the economic TS considered and should be used in neural network forecasting. This result should also be taken into account when planning agricultural production.
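The identification of significant lags on a correlogram can be sketched as follows. The sketch assumes the usual sample autocorrelation estimator and the approximate 95% white-noise bound of 1.96/sqrt(n); the function names and the synthetic series with a three-year cycle are hypothetical, and STATISTICA may compute its confidence limits differently.

```python
from math import sqrt

def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients r_1 .. r_max_lag of a series."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]

def significant_lags(series, max_lag):
    """Return (lag, r) pairs whose |r| exceeds the bound 1.96 / sqrt(n)."""
    bound = 1.96 / sqrt(len(series))
    acf = autocorrelation(series, max_lag)
    return [(k, r) for k, r in enumerate(acf, start=1) if abs(r) > bound]

# Hypothetical series with an exact three-year cycle; not real yield data.
series = [1.0, 1.0, 2.0] * 8
peaks = significant_lags(series, max_lag=6)
# The most pronounced positive peak is the candidate for the time window.
window = max(peaks, key=lambda p: p[1])[0]
print(peaks, window)
```

For this synthetic series the most pronounced positive peak falls at lag 3, which is the kind of value the text proposes to carry over into the choice of the ANN time window.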
Thus, for predicting yield TS using ANNs, we can recommend pre-forecast methods of statistical analysis, whose algorithms are well developed and implemented as built-in functions in many computer programs. The main task of the preliminary analysis is to justify the numerical values of the ANN macroparameters when forming its architecture, before the training procedure starts.

Building and training artificial neural networks
The developed ANN models were implemented in the SNN v4.0 environment. The interface of the neural network system for predicting the yield of grain crops based on SNN v4.0 is shown in Fig. 3. When selecting the neural network parameters, the "Time window" parameter of the Create Network dialog box was set equal to the number of the most pronounced lag on the correlogram (Fig. 2). For the "grain crops as a whole" TS, the "Time window" parameter was set to 3. The ANN architecture was a three-layer perceptron, in accordance with the recommended methods [5] (Fig. 3).
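The role of the time window can be illustrated by the way a series is cut into training cases: each case consists of the preceding `window` levels as inputs and the next level as the target. The sketch below is a plain-Python illustration of this idea, not the SNN implementation; the function name and the yield values are hypothetical.

```python
def make_windows(series, window):
    """Split a series into (inputs, target) pairs using a fixed time window."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]

# Hypothetical yield levels (t/ha), for illustration only.
yields = [1.2, 0.9, 1.7, 1.1, 1.0, 1.9, 1.3]
pairs = make_windows(yields, window=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

With a window of 3, a series of seven levels yields four training cases, and the input layer of the network receives three values per case.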
The selection of the initial TS of yield levels, the formation of training and verification samples, and the training of the created ANN were carried out in a partially automated mode. As a result of training the ANN with the built-in SNN tools, a family of preferred ANN architectures and macroparameters was obtained.
To analyze the predictive characteristics of the trained network, various time series projections were used in the "Time Series Projection" window of the SNN program. These projections characterize the quantitative and qualitative possibilities of obtaining forecasts with different initial values of the TS levels and different forecast horizons. The research conducted allows us to recommend the resulting family of neural network models for short-term forecasting with a horizon of 1-2 years, which can be performed directly in the "Run Single Case" window.
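A projection over a horizon of several years can be sketched as an iterated one-step forecast in which each predicted level is fed back into the input window. Whether the SNN "Time Series Projection" window works exactly this way is an assumption here; the `project` function is hypothetical, and `stand_in_model` is only a placeholder for a trained network.

```python
def project(history, model, window, horizon):
    """Iterated multi-step forecast: each prediction is fed back as an input."""
    levels = list(history)
    forecasts = []
    for _ in range(horizon):
        next_level = model(levels[-window:])
        forecasts.append(next_level)
        levels.append(next_level)
    return forecasts

def stand_in_model(window_values):
    # Placeholder for a trained ANN: simply averages the window.
    return sum(window_values) / len(window_values)

# Hypothetical initial levels, for illustration only.
history = [1.0, 2.0, 3.0]
forecasts = project(history, stand_in_model, window=3, horizon=2)
print(forecasts)
```

In such a scheme every step beyond the first relies on earlier predictions rather than observed levels, which is one common reason for restricting the horizon to a few steps.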

Conclusion
Thus, the proposed information technology for short-term modelling of crop yield TS levels using ANN is based on a pre-forecast autocorrelation analysis of the long-term yield TS. This approach gives a sounder basis for choosing the value of the time window parameter "Steps" when selecting the ANN macroparameters, which reduces the error of short- and medium-term forecasting of grain yield in the arid conditions of the Lower Volga region. The achieved error of 6-13% is quite acceptable for forecasting such TS and for planning agricultural production for the next 1-2 economic years.
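The text does not state which error measure underlies the reported 6-13%. As an illustration only, the sketch below assumes the mean absolute percentage error (MAPE); the function name and all values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual levels must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical observed and forecast yield levels (t/ha).
actual = [2.0, 1.0, 4.0]
predicted = [1.8, 1.1, 4.4]
error = mape(actual, predicted)
print(f"MAPE = {error:.1f}%")
```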