Long-term load combination forecasting method considering the periodicity and trend of data

To address the insufficient accuracy of long-term power load forecasting and the poor applicability of existing models, this paper couples several macro indicators, such as regional economic and social development indicators, with the time series of regional power load. A BP neural network and the autoregressive integrated moving average (ARIMA) model are integrated into an improved forecasting model that strengthens the trend forecasting ability of the annual load forecast. The functional nonparametric method is used to forecast the periodic component of the monthly load data, and the annual load forecast is combined with the monthly load forecast to improve the overall forecasting accuracy of the model. Finally, comparison with grey prediction and other models, evaluated by the mean absolute percentage error (MAPE), shows that the method combining data periodicity and trend significantly improves prediction accuracy and is suitable for the long-term prediction of regional power load.


Introduction
The accurate prediction of long-term power load is of strategic importance for power network planning and power infrastructure construction. In recent years, long-term power load prediction has been based mainly on the overall trend of the annual load value [1], with less consideration given to the inertial growth of the data, periodic changes and the cumulative effect of the values, which affects the accuracy of load prediction. The model therefore needs to treat annual load growth as the accumulation of monthly load, so as to improve the accuracy of the model within each prediction interval and of the forecast as a whole. On this basis, this work studies a combined method of trend prediction and periodic prediction.
Power load forecasting methods include parametric and nonparametric methods. Parametric algorithms mainly comprise machine learning algorithms and time series algorithms. Machine learning methods include neural networks [2], support vector machines (SVM) [3] and decision trees. Time series methods such as ARIMA analyze and predict the development trend of the data itself through manually set parameters. To improve prediction accuracy for time series data, various machine learning algorithms can be combined with ARIMA to realize a coupled correction of the data [4]. A nonparametric method is a response analysis method obtained directly or indirectly from experimental analysis of the actual system, such as the impulse response or step response of the system obtained from experimental records [5]. This kind of algorithm can fully mine the periodic characteristics of the data and, through dimensionality reduction based on that periodicity, predict the trend of the data within each interval unit.
To further improve the accuracy of long-term power load trend forecasting, the annual and monthly power load data are decomposed. The ARIMA method [6], which is used to study the law of data development, is integrated into the BP neural network algorithm, and an improved BP-ARIMA load trend forecasting model is proposed to forecast annual power load under the combined influence of multiple factors. The functional nonparametric method is introduced to analyze the monthly data of past years, and a functional time series prediction model produces the periodic prediction of future data. The trend prediction and the periodic prediction are then fused by component to obtain a new combined prediction model that improves the accuracy of long-term power load prediction.

Model construction
The BP-ARIMA forecasting model based on the functional nonparametric method decomposes the time series data, uses the BP-ARIMA model to forecast the trend of the annual load data, and uses the functional nonparametric method to estimate the monthly load data. Combining the two components improves the stability and accuracy of the model's long-term prediction. The process is shown in Figure 1.

Annual load forecasting based on BP-ARIMA
The model relates the regional power load to economic and social indicators such as regional GDP, industrial added value and disposable income [7]. Based on the coupling characteristics of the influencing factors, the data are fitted to establish the trend of each influencing factor, and the BP neural network is trained on the trend of each influencing factor together with the annual load data to obtain the annual power load trend forecast. The BP-ARIMA annual load prediction model is implemented in MATLAB. The modeling steps are as follows:
(1) The degree of correlation between the influencing factors and the annual load value is preliminarily tested by a correlation test.
(2) The annual load value is used to train the neural network for each influencing factor. From the checked factor prediction data, the trend data of the influencing factors are extracted, and the resulting trend line is used to judge the trend and numerical range of the data.
(3) The ARIMA model is used for trend prediction of the influencing factors, and the checked trend line replaces the original trend line of each influencing factor. In this process, the correlation test can be repeated on the predicted trend lines, influencing factors with weak correlation can be eliminated, and the predicted factor values are output by polynomial fitting.
(4) The annual load trend forecast is obtained by coupling the checked influencing factor data with the annual load data.
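The correlation screening in step (1) can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB implementation used in this work; the Pearson coefficient, the threshold of 0.8, the function names and the sample values are all assumptions made for the example, since the text does not state which correlation test or cut-off was applied.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sd_x * sd_y)

def screen_factors(factors, load, threshold=0.8):
    """Keep the factors whose |r| with the annual load reaches the threshold.

    `threshold` is a hypothetical cut-off chosen for this sketch.
    """
    kept = {}
    for name, values in factors.items():
        r = pearson(values, load)
        if abs(r) >= threshold:
            kept[name] = r
    return kept

# Hypothetical annual values, for illustration only (not the data of Table 1).
annual_load = [100, 110, 121, 133, 146, 160]
factors = {
    "gdp": [2 * v + 5 for v in annual_load],   # moves with the load
    "unrelated": [1, -1, 1, -1, 1, -1],        # no clear relation to the load
}
print(sorted(screen_factors(factors, annual_load)))
```

Factors that survive the screening would then be passed on to the trend-line checking and ARIMA prediction of steps (2) and (3).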

Related factors of long-term load forecasting
In existing research, scholars generally use economic and social indicators to analyze the impact on power load. Because regional GDP, regional financial investment and regional population growth promote regional power load growth, regional power load growth and power infrastructure construction are inherently related to the growth of regional economic and social indices [8]. Building on this research, the relevant data are collected, sorted and analyzed from the perspectives of economic development, social development and industrial production. The relevant influencing factors are shown in Table 1.

Monthly load forecasting based on functional nonparametric method
The regional monthly load data increase periodically. For this kind of time series, functional data are generated by taking the whole year (period $T = 12$) as the unit, giving a functional random variable $F$. The observations from $t = a$ to $t = a + NT$ constitute a continuous time series $f(t)$, so the monthly load data can be regarded as repeated observations with period $T$. The expression of the functional data follows [9].
The semi-metric is chosen on the basis of the derivatives of the periodic curves, and the value of $q$ in $d(x_i, y_i)$ affects the goodness of fit of the curve. In the functional nonparametric method, the goodness of fit is also affected by the window width $h$ of the kernel function. The window width is obtained automatically by cross-validation: the last ($n$-th) pair is removed from the observed data pairs $(X_i, Y_i)$, functional regression is carried out on the first $n-1$ pairs, and $h$ is taken as the value that minimizes $CV(h)$. The accuracy of the periodic prediction is determined by comparing the predicted $n$-th value with the real load value.
The comprehensive forecasting model combines the annual trend load forecast with the monthly periodic load forecast. The weights of the monthly and annual load forecasts are allocated by the mean absolute percentage error (MAPE) weight method [10], so as to improve the reliability and accuracy of the model forecast. The weight distribution function of load forecasting is as follows:
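The weight function itself is not reproduced in this text. As an illustration only, the sketch below uses one common form of MAPE weighting, in which each forecast receives a weight proportional to the inverse of its MAPE; this form is an assumption and may differ from the function defined in [10]. The function names and the numbers in the example are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def inverse_mape_weights(mapes):
    """Weights proportional to 1 / MAPE, normalized to sum to 1.

    Assumed weighting scheme: a smaller error gives a larger weight.
    """
    inverse = [1.0 / m for m in mapes]
    total = sum(inverse)
    return [v / total for v in inverse]

def combine(forecasts, weights):
    """Weighted sum of several forecasts of the same quantity, period by period."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(horizon)]

# Hypothetical validation data for two component forecasts of the same load.
actual = [100.0, 200.0]
trend_forecast = [110.0, 190.0]      # e.g. from the annual trend model
periodic_forecast = [102.0, 196.0]   # e.g. aggregated from the monthly model

weights = inverse_mape_weights([mape(actual, trend_forecast),
                                mape(actual, periodic_forecast)])
print(weights)
print(combine([trend_forecast, periodic_forecast], weights))
```

Under this assumed scheme the component with the smaller validation error dominates the combined forecast, while the other component still contributes.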

Instance data
The data are shown in Table 1. In order to improve the accuracy of load forecasting, the monthly load indexes from 2006 to 2018 are introduced, as shown in Table 2.
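To make the use of the monthly series concrete, the sketch below arranges a monthly series into yearly curves and applies a kernel (Nadaraya-Watson type) functional regression of the kind described in the functional nonparametric section. It is a simplified illustration under several assumptions: the semi-metric is approximated by the L2 distance between q-th finite differences, the kernel is a quadratic kernel on [0, 1], the window width is chosen from a hypothetical candidate list by holding out the last pair, and the series is synthetic rather than the data of Table 2.

```python
import math

def split_into_years(series, period=12):
    """Cut a monthly series into consecutive curves of `period` points each."""
    n_curves = len(series) // period
    return [list(series[i * period:(i + 1) * period]) for i in range(n_curves)]

def semimetric(x, y, q=1):
    """L2 distance between the q-th finite differences of two curves."""
    for _ in range(q):
        x = [b - a for a, b in zip(x, x[1:])]
        y = [b - a for a, b in zip(y, y[1:])]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel(u):
    """Quadratic kernel on [0, 1]; u is a non-negative scaled distance."""
    return 1.0 - u * u if u <= 1.0 else 0.0

def predict_next_curve(x_new, X, Y, h, q=1):
    """Kernel-weighted average of the successor curves Y, weighted by d(x_new, X_i)."""
    dists = [semimetric(x_new, x, q) for x in X]
    weights = [kernel(d / h) for d in dists]
    total = sum(weights)
    if total == 0.0:
        # No curve lies within the window: fall back to the nearest neighbour.
        return list(Y[dists.index(min(dists))])
    return [sum(w * y[m] for w, y in zip(weights, Y)) / total
            for m in range(len(Y[0]))]

def choose_bandwidth(X, Y, candidates, q=1):
    """Hold out the last pair, fit on the first n-1 pairs, minimize squared error."""
    def cv(h):
        pred = predict_next_curve(X[-1], X[:-1], Y[:-1], h, q)
        return sum((p - t) ** 2 for p, t in zip(pred, Y[-1]))
    return min(candidates, key=cv)

# Synthetic monthly series: a fixed seasonal shape plus a yearly level shift.
shape = [90, 85, 95, 100, 105, 120, 135, 130, 110, 100, 95, 105]
monthly = [v + 5 * year for year in range(6) for v in shape]

curves = split_into_years(monthly)
X, Y = curves[:-1], curves[1:]          # pairs (year i, year i + 1)
h = choose_bandwidth(X, Y, candidates=[1.0, 5.0, 10.0])
forecast = predict_next_curve(curves[-1], X, Y, h)
print(h, forecast)
```

Because this estimator averages past curves, the sketch reproduces the seasonal shape but not the growth in level, which is consistent with forecasting the trend component separately and combining the two afterwards.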

The data analysis
The general parameters of the BP-ARIMA model and the difference parameters of the improved ARIMA algorithm are shown in Table 3, where n is the highest power of the polynomial in the polynomial fitting method. The analysis proceeds as follows. A correlation test is carried out between the influencing factors and the annual load value to select the influencing factors with strong correlation. The annual load value is used to train the neural network for each influencing factor in order to check the trend line of each factor. The verified ARIMA model then predicts the strongly correlated influencing factors in the target year based on the characteristics of the data structure.
Through the BP-ARIMA prediction model, the screened and checked influencing factors are combined with the annual power load for multi-factor prediction. The model is a prediction based on the trend development of the influencing factors and the load series data. The predicted annual regional power consumption is shown in Figure 2. In the figure, the model predicts the power load from 2019 to 2021: the power load will continue to grow over these three years, but the growth rate will slow down. The BP-ARIMA model and the functional nonparametric method are then combined, and the trend forecast and periodic forecast from 2016 to 2018 are combined with the MAPE weight method. The MAPE of the functional nonparametric prediction method is 1.17%, which is lower than the 1.93% of BP-ARIMA. After the component fusion of the two methods, the MAPE of the new combined model is 1.59%. The error is relatively small, and the load forecasting data are shown in Table 4. To show the prediction advantages of the new combined model, the grey prediction models GM(1,1) and GM(1,N), the BP neural network model and the ARIMA model are each compared with the combined model. The error results are shown in Table 5. Among the compared models, the new combined prediction model has the highest prediction accuracy, with a MAPE of only 1.59%.
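The role of the difference parameter in ARIMA can be illustrated in isolation. The sketch below shows d-th order differencing and the simplest integrated model, ARIMA(0,1,0) with drift, in which each forecast adds the mean first difference to the previous value. It is a minimal example under stated assumptions and does not reproduce the improved ARIMA algorithm or the parameter values of Table 3; the function names and the series are hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'I' part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def forecast_with_drift(series, steps):
    """ARIMA(0,1,0) with drift: next value = last value + mean first difference."""
    diffs = difference(series, 1)
    drift = sum(diffs) / len(diffs)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = last + drift
        forecasts.append(last)
    return forecasts

# Hypothetical factor series with a steady upward trend.
factor = [10, 12, 14, 16]
print(difference([1, 4, 9, 16, 25], d=2))   # a quadratic trend becomes constant
print(forecast_with_drift(factor, steps=3))
```

A series with a polynomial trend of degree d becomes stationary around a constant after d differences, which is why the difference order has to match the shape of each factor's trend line.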

Conclusion
This paper proposes a new combined forecasting model that combines the trend of the annual load with the periodicity of the monthly load. The model considers both the trend and the periodicity of the data structure, makes the trend forecast more reasonable through the coupling of influencing factors, and improves forecasting accuracy, reaching a MAPE of 1.59% in the case study. The BP-ARIMA load forecasting model combines the nonlinear processing ability of BP with the linear forecasting ability of ARIMA, and improves the forecasting accuracy and data stability of the influencing factors through trend line checking. The model can also eliminate influencing factors with poor correlation and improve the stability of the annual power load trend forecast by using the coupling characteristics between factors. The functional nonparametric method achieves periodic forecasting of the data through dimension reduction, and reaches good monthly load forecasting accuracy when an appropriate kernel function is selected. The combined forecasting model can provide a more objective method for long-term power load forecasting.