Prediction of time series of overhead lines failure rate with chaotic indicators

The paper presents the results of forecasting the failure rate (failure frequency) of 500 kV overhead lines (OHL), given as a time series with signs of chaos. Predictive estimates are obtained using singular spectrum analysis, neural networks, and fuzzy neural networks. The object of singular spectrum analysis is a delay matrix formed from the failure rate time series; prediction is carried out by one-step transformations of the initial data. For neural network prediction, a feedforward network trained by backpropagation is used. To achieve the minimum mean squared error, the training sample contained the maximum possible history. For prediction with fuzzy neural networks, the Wang-Mendel network was chosen. In all cases, 10 thousand "training-prediction" cycles were performed for each prediction year, which ensured the stationarity of the histograms of the failure rate distributions.


Introduction
In [1], the cyclicity of the accident rate of 500 kV OHL in a large region was studied over an extended time interval. Significant fluctuations in the failure rate (failure frequency) were revealed under the influence of natural and socio-economic factors (Fig. 1); it was proposed to treat this parameter as the output signal of a dynamic system with many inputs that are difficult to formalize.
In [2], the time series in Fig. 1 was analyzed using the theory of deterministic (dynamic) chaos. The chaotic behavior of the dynamical system under consideration is revealed by the fractality of the series and by the positive maximal Lyapunov exponent λ. It is shown that the prediction horizon in such problems is T ≈ 1/λ; here, T ≈ 1/0.2183 ≈ 4.6, i.e. about 5 years. The theory of dynamical systems offers a variety of methods for analyzing and predicting time series, including chaotic ones. In [2], only one of them was applied: singular spectrum analysis (SSA).
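The horizon estimate T ≈ 1/λ quoted above can be checked with a few lines of Python. This is only an arithmetic sketch; the function name `prediction_horizon` is introduced here for illustration and does not come from [2].

```python
def prediction_horizon(lyapunov_exponent: float) -> float:
    """Return the approximate prediction horizon T = 1 / lambda."""
    if lyapunov_exponent <= 0:
        # A non-positive exponent does not indicate chaos, so no finite horizon follows.
        raise ValueError("the maximal Lyapunov exponent must be positive")
    return 1.0 / lyapunov_exponent


# Value of the maximal Lyapunov exponent reported in the text.
horizon_years = prediction_horizon(0.2183)
print(round(horizon_years, 1))  # 4.6
```

Rounded up, this gives the five-year prediction depth used in the rest of the paper.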
SSA belongs to the global prediction methods and is used to extract periodic and quasiperiodic components from a time series. It can be classified as a traditional, well-established regression method. Among the alternative approaches, we single out methods related to artificial intelligence, in particular neural and fuzzy neural networks. Applying them to predict the accident rate of 500 kV OHL and comparing them with SSA revealed some regularities that are useful to take into account when substantiating the credibility of reliability estimates for the main electrical grid of a power system. The relevant arguments are presented below.

Singular spectrum analysis
Most methods for analyzing and predicting a time series are based on its multidimensional representation as a delay matrix: a set of copies of the series taken with a lag, i.e. shifted by a certain period of time.
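As a minimal sketch of such a representation, the following Python fragment builds a delay matrix from a series, one lagged copy per row. The series used here is a synthetic placeholder (the actual failure rate data of Fig. 1 are not reproduced), and the names `delay_matrix` and `window` are assumptions made for illustration. The sizes, a series of 45 points and a window of 22, are the ones used in this paper.

```python
def delay_matrix(series, window):
    """Build a delay matrix with `window` rows and len(series) - window + 1 columns.

    Row i is the series shifted by i time steps, so element (i, j) equals series[i + j].
    """
    n = len(series)
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    columns = n - window + 1
    return [[series[i + j] for j in range(columns)] for i in range(window)]


# Synthetic placeholder series of 45 points (NOT the data from Fig. 1).
series = [0.10 + 0.01 * (t % 7) for t in range(45)]
X = delay_matrix(series, window=22)
print(len(X), len(X[0]))  # 22 24
```

Each anti-diagonal of the resulting matrix holds a single value of the series, which is why every row is a lagged copy of the row above it.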
At the first step of SSA, the original time series of length N (N = 45 in Fig. 1) is converted into a sequence of multidimensional embedding vectors of dimension L, 1 < L < N. Let us take L ≈ N/2 = 45/2 ≈ 22 [3]; the number of columns of the delay matrix is then K = N - L + 1 = 45 - 22 + 1 = 24. Thus, the delay matrix X for the time series in Fig. 1 has dimension 22×24, and its singular value decomposition is $X = U \Sigma V^{T}$, where $\Sigma$ is a diagonal matrix of singular values ordered in descending order, which define the principal components, and U and V are orthogonal matrices of singular vectors. The matrix X is rectangular: the number of its columns (24) is greater than the number of rows (22). Because of this, some of its columns are linear combinations of the remaining columns, which leads to the degeneracy of the matrix X and, accordingly, to a zero determinant.
Degenerate matrices must have zero eigenvalues and singular values [4]. In the case under consideration, zero singular values are present in the diagonal matrix of singular values. As a result, it is not possible to construct the polynomial of the so-called linear recurrence formula (LRF), which consists of linear combinations of products of exponentials, polynomials and harmonics and governs the behavior of the time series.
The reason is that obtaining the coefficients of the LRF polynomial requires division by the singular values, which in our case means division by zero [5].
Note that SSA is similar to the Fourier transform: the original series is also represented as a set of components, but in SSA they are generally not harmonic.
Thus, when processing the time series shown in Fig. 1, we had to use one-step (rather than multi-step) predictive formulas and update the matrices after each prediction step. In effect, this applies the idea of the local approximation method [6]. Its advantage lies in piecewise linear approximation, expressed here as one-step (recurrent) prediction, instead of the global linear approximation given by the LRF.
In accordance with the technique described in [3], an ordered set of singular values is determined from the singular value decomposition (SVD) of the matrix X. The six dominant singular values determined the number of principal components used in the singular analysis. As a result, the matrix V in the SVD of X was reduced column-wise to a size of 24×6, giving the matrix V*, whose last row enters the one-step prediction formulas. Predictive estimates of the failure rate were calculated to a depth of five years (see above) using these formulas, with the matrices V and V* updated after each step. The resulting failure rates of 500 kV OHL for the period 2019-2023 are given in Table 1.

Neural network method
One possible application of neural networks is predicting the behavior of a dynamic system whose structure and parameters are unknown, based on the signal it has previously generated (in our case, the time series in Fig. 1, which has signs of chaos). Without going into the methodology of neural networks, which is widely covered in the specialized literature, and taking into account the features described in [7], this paper uses feedforward neural networks with two or more layers, i.e. networks without feedback.
The neural network is implemented in the Matlab environment with the newff function, which creates, in the general case, multilayer neural networks with given training and tuning functions that use the backpropagation method. The network parameters were optimized by the built-in Levenberg-Marquardt method, which combines the advantages of the steepest descent method and the Gauss-Newton method [8,9].
A delay matrix is also used as the training data set for the neural network. In SSA, its dimensions were chosen to achieve good conditioning of the rectangular matrix [4], i.e. to keep the ratio of the maximum and minimum singular values in its SVD as close to unity as possible. For a neural network, as a computational experiment showed, the training set must include the maximum possible history to achieve the minimum mean squared error. The number of columns of the delay matrix was therefore set a priori equal to the prediction horizon (five years). Thus, for the 2019 prediction the matrix had a dimension of 40×5: its first 39 rows were used to form the training inputs of the neural network, and the last row was used to form the "reference" outputs. After training, the output signal of the neural network should therefore coincide with the reference with a vanishingly small error. The network was trained with the train function in the Matlab environment. For each subsequent prediction year, a row obtained from the predicted OHL failure rate for the current year was added to the delay matrix, giving 41×5 for 2020, 42×5 for 2021, and so on. As in the case of SSA, the failure rate of 500 kV OHL was predicted one step (year) ahead. The input signal of the trained network for 2019, for example, was the failure rate in 2018, and the output signal, calculated by the sim function in the Matlab environment, was the forecast for 2019; for 2020, the input was the predicted failure rate for 2019, and so on.
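The growth of the training matrix from year to year can be illustrated with a short Python sketch of the one-step-ahead loop. Several things here are assumptions rather than the authors' implementation: the series is a synthetic placeholder, the row layout in `lagged_rows` is only one way to obtain 40 rows from 45 points (the text does not say which window is omitted), and `placeholder_model` is a hypothetical stand-in that averages the latest row instead of training a Matlab neural network.

```python
def lagged_rows(series, width):
    """Return len(series) - width rows of `width` consecutive values.

    Assumption: the oldest window is skipped so that the latest value is included;
    this reproduces the 40x5 size quoted for a 45-point series.
    """
    n = len(series)
    return [series[i:i + width] for i in range(1, n - width + 1)]


def placeholder_model(rows):
    """Hypothetical stand-in for network training: predict the mean of the last row."""
    last = rows[-1]
    return lambda: sum(last) / len(last)


def recursive_forecast(series, width, steps):
    """Predict one year at a time, appending each forecast to the history."""
    history = list(series)
    shapes, forecasts = [], []
    for _ in range(steps):
        rows = lagged_rows(history, width)
        shapes.append((len(rows), width))
        predict = placeholder_model(rows)  # "training" step
        value = predict()                  # "prediction" step
        forecasts.append(value)
        history.append(value)              # the forecast becomes part of the history
    return shapes, forecasts


# Synthetic placeholder series of 45 points (NOT the data from Fig. 1).
series = [0.10 + 0.01 * (t % 7) for t in range(45)]
shapes, forecasts = recursive_forecast(series, width=5, steps=5)
print(shapes)  # [(40, 5), (41, 5), (42, 5), (43, 5), (44, 5)]
```

The point of the sketch is the structure of the loop: every predicted value is fed back into the history, so errors made in early years propagate into later ones.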
Within one prediction year, the neural network gave different values of the failure rate in each "training-forecast" cycle (Fig. 2, a). The choice of the number of tests of the neural network therefore needs to be addressed. First, according to the Monte Carlo method, the error of calculations involving random variables is proportional to the ratio [10] $\sqrt{D\xi / n}$, where $D\xi$ is the variance of the random variable $\xi$ and n is the number of tests.
As follows from this formula, the accuracy depends on $1/\sqrt{n}$. Choosing $n = 10^4$ instead of, for example, $n = 10^3$ at the same variance improves the accuracy by more than three times, since $1/\sqrt{n}$ decreases from 0.0316 to 0.01. Second, it was found that, starting from approximately $n = 10^4$, the histograms of the failure rate distributions become weakly dependent on the number of tests, i.e. they acquire the property of stationarity. Fig. 2, b shows the histogram of the predictions of a two-layer neural network for 2019 with $10^4$ experiments and a bin width of 0.05 1/(year·100 km). As can be seen from this figure, the predictive estimate of the failure rate can lie in a wide range. Most likely (in over 35% of the "training-prediction" cycles), it will be in the range from 0.1 to 0.15 1/(year·100 km). Fig. 3 shows the prediction histogram of a ten-layer neural network for 2019 with $10^5$ experiments. As seen from Fig. 3, the more varied and saturated the structure of the neural network becomes, the more "degrees of freedom" it allows itself, including negative values of the failure rate.
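The scaling of the Monte Carlo error with the number of tests, used above to justify the choice of n, can be verified with a short Python sketch. The function name `mc_error_factor` is introduced here for illustration only.

```python
import math


def mc_error_factor(n_tests: int) -> float:
    """Return 1 / sqrt(n), the factor by which the Monte Carlo error scales."""
    if n_tests <= 0:
        raise ValueError("the number of tests must be positive")
    return 1.0 / math.sqrt(n_tests)


factor_1e3 = mc_error_factor(10**3)
factor_1e4 = mc_error_factor(10**4)
# The ratio shows how much the error shrinks when n grows tenfold.
print(round(factor_1e3, 4), round(factor_1e4, 4), round(factor_1e3 / factor_1e4, 2))
# 0.0316 0.01 3.16
```

A tenfold increase in the number of tests therefore reduces the error by a factor of about 3.16, at the cost of ten times as many "training-prediction" cycles.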
The overall result of using a highly developed neural network trained on a time series with signs of chaotic dynamics is a prediction described (after normalization of the histogram) by a normal (Gaussian) distribution, comparable to a histogram produced by a pseudo-random number generator.
The generalized characteristics of the predicted failure rate of a 500 kV OHL, namely the mathematical expectation M and the standard deviation σ calculated using ten-layer neural networks with $n = 10^4$ tests, are given in Table 2.
Analyzing the obtained predictive estimates, we note that the neural network method, being associated with artificial intelligence, indeed reflected human psychology: the prediction can lie in a very wide range with greater or lesser probability. In some ways, this "intelligence" has even surpassed the human one by predicting negative values of the OHL failure rate.

Fuzzy neural networks
These networks are known to combine the methods of neural networks and fuzzy logic systems. The properties of a neural network are thus enhanced by the advantage of fuzzy logic: the ability to use expert knowledge about the structure of an object in the form of linguistic expressions of the type "if the inputs are such and such, then the outputs are such and such". However, fuzzy logic algorithms themselves do not contain built-in learning and self-organization mechanisms. The solutions obtained with their help therefore depend on the type of the so-called membership functions, which formalize fuzzy terms, i.e. qualitative descriptions of parameter values such as "little", "many", "a great many", etc.
To predict the failure rate of 500 kV OHL, one of the simplest fuzzy neural networks, the Wang-Mendel network (a special case of the Takagi-Sugeno-Kang network), was chosen [11, 12, etc.]. It is implemented in the Matlab environment using the ANFIS program.
When predicting the OHL accident rate with the ANFIS program, the training sample was, as before, the delay matrix used for the neural networks. The membership functions (trapezoidal ones were taken) were formed on the basis of expert information, for which the prediction results of the SSA and neural network methods were used.
ANFIS prediction, as before, was carried out one step (year) ahead with $10^4$ experiments for each step of the five-year prediction horizon. It was not possible to obtain distributions close to Gaussian. The reason is that none of the membership functions appearing in the predictors allowed negative values of the OHL failure rate. Fig. 4 shows, as an example, a histogram of the predicted failure rate of the 500 kV OHL for 2019, and the prediction results for subsequent years are given in Table 4.
As of this writing, 2019 has ended, so the projected and actual data for that year can be compared. Processing of the statistical data for 2019 gave a failure rate of 500 kV OHL in the region under consideration of 0.1 1/(year·100 km), against a prediction of 0.12 1/(year·100 km) by the SSA method and 0.1-0.15 1/(year·100 km) (most likely) by the two-layer neural and fuzzy neural networks, which is to a certain extent an adequate estimate. The mathematical expectation of the OHL failure rate produced for 2019 by a ten-layer neural network was 0.189 1/(year·100 km), which is in fact an erroneous prediction.
For the period of five years, the SSA method gives an approximately threefold increase in the accident rate, following the long-term trends associated in part with the cycles of solar activity (see [1,13]). Neural and fuzzy neural networks offer more favorable, stable predictions of the failure rate of OHL in the main electrical grid of power systems.

Conclusion
The chaotic dynamics of the failure rate of 500 kV OHL makes it difficult to predict the accident rate of the main electrical grid of power systems and reduces the credibility of their reliability estimates.
Predictive estimates of the 500 kV OHL failure rate, obtained with a sufficiently large number of experiments based on a "highly developed" neural network, actually led to the fulfillment of the conditions of the central limit theorem, according to which functions of a large number of weakly dependent quantities have probability distributions close to the normal Gaussian law ...
The normal (Gaussian) distribution of the failure rate of a 500 kV OHL is presumably an additional characteristic of the chaotic nature of the dynamic process under consideration, since the neural network was trained on the numerical characteristics of the failure rate time series, which has signs of chaos (fractality and a positive maximal Lyapunov exponent).
Methods for predicting the time series of the 500 kV OHL failure rate based on regression (singular spectrum analysis) and artificial intelligence (neural and fuzzy neural networks) give different estimates, which can be confirmed or refuted only after the upcoming five-year period.

E3S Web of Conferences 216, 01016 (2020) RSES 2020 https://doi.org/10.1051/e3sconf/202021601016