Forecasting of electricity consumption by industrial enterprises with a continuous nature of production based on the principal component method (PCA)

Forecasting electricity consumption is important for using electricity more efficiently and, as a result, for improving the competitiveness of manufactured products by reducing the share of electricity costs in the cost of production. When determining forecast indicators of electricity consumption by industrial enterprises, it is advisable to use modern high-precision forecasting methods that minimize the forecast error. Each enterprise must determine in advance, with the greatest possible accuracy, the amount and schedule of its electricity consumption and then strictly adhere to them, thereby minimizing penalties and fines. The article deals with forecasting power consumption by industrial enterprises (using a metallurgical enterprise as an example) on the basis of an enlarged block diagram, developed by the authors, of an algorithm for predicting power consumption by the principal component method (PCA). The actual power consumption is compared with that predicted by the developed model. The adequacy of the model is confirmed by the small discrepancies between the actual and forecast data, which allows the model to be used to determine predicted values of power consumption parameters at ferrous metallurgy enterprises.


Introduction
An effective way to prevent conflicts between industrial enterprises and energy supply organizations, which entail additional electricity costs and financial losses, is to increase the accuracy of forecasting the volumes and schedules of electricity consumption. This is achieved by determining these parameters for the contractual settlement period. In accordance with the Decree of the Cabinet of Ministers of the Republic of Uzbekistan dated January 12, 2018 No. 22 "On additional measures to improve the procedure for the use of electricity and natural gas", under agreements between electricity supply companies and consumers, a consumer that exceeds the limit established by the agreement by more than 5% is fined through a multiplying coefficient of 1.15 applied to the established tariff. It should be noted that before this resolution entered into force (March 2019), enterprises paid significant fines for electricity consumption in excess of the established limit. This was primarily due to the lack of a sound methodology for forecasting electricity consumption at industrial enterprises, i.e. to the large error in the forecast indicators determined using existing methods. As noted above, penalties for electricity consumed in excess of the limit increase the price of products and reduce their competitiveness.
An analysis of the forecasting methods used in industrial enterprises shows that they are mainly based either on an expert assessment of the volume of electricity consumption, or on accounting for electricity consumption per unit of output [4,5,7,8,10,12].
In order to improve the accuracy of predictive indicators, the authors of the article developed forecasting models using modern methods for predicting electricity consumption at industrial enterprises, providing minimal errors [10,13,15].
When developing mathematical models, information on power consumption at 100 protocol heats of a metallurgical enterprise was used as initial data according to the criteria for sampling primary data [16,17].
It is known that large errors in determining the predicted value of the object under study lead to additional financial costs. To reduce them, we have created a forecasting model using the principal component method [18,5,20].
This method belongs to multivariate statistical analysis and is used to reduce dimensionality with the least loss of useful information. Reducing the number of variables in the initial data by excluding secondary ones makes the data easier to work with and simplifies the creation of algorithms [18,19,8,20,21].
Thus, the idea of PCA is simple: to reduce the number of variables in a dataset while losing as little useful information as possible. This is achieved by creating new, uncorrelated variables that successively maximize variance. Finding these new variables reduces to solving an eigenvalue/eigenvector problem, and the new variables are determined by the dataset itself rather than fixed a priori. This makes PCA a flexible and easily adaptable method of data analysis. Various variants of the method, adapted to different data structures, have also been developed [22; 23].
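The eigenvalue/eigenvector formulation can be illustrated with a minimal sketch. It assumes NumPy and uses synthetic data; the function name `pca_eig` and all variable names are illustrative and do not come from the article.

```python
import numpy as np

def pca_eig(X):
    """Return eigenvalues (descending), eigenvectors (columns) and scores."""
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigenproblem, ascending order
    order = np.argsort(eigvals)[::-1]       # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # the new variables
    return eigvals, eigvecs, scores

rng = np.random.default_rng(0)
# Synthetic data: 200 observations of 3 strongly correlated variables.
latent = rng.normal(size=(200, 1))
X = np.hstack([latent + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)])

eigvals, eigvecs, scores = pca_eig(X)
score_cov = np.cov(scores, rowvar=False)    # diagonal: the new variables are uncorrelated
```

The covariance matrix of the new variables is diagonal, with the eigenvalues on the diagonal, which is what "uncorrelated variables that successively maximize variance" means in practice.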
The need to reduce the dimension of the initial data is explained by the following circumstances [24,25,26,27]:
- the need for visual display of the initially selected data, which is achieved by projecting them onto three-dimensional space, a plane or a number line;
- the desire to simplify the studied models in order to facilitate the calculation and interpretation of the results;
- the need to reduce the amount of stored data.
The importance of reducing the dimension of the initial information is due to the following reasons [12; 13; 14, 15, 8, 9]:
- low information content of features that vary little from one object to another;
- duplication of information due to the correlation of initial features;
- the need for aggregation of initial features.
Principal component analysis has advantages over regression and Fourier analysis. It is simple and clear to use. The essence of the method is to transform a one-dimensional series into a multidimensional one using a one-parameter shift procedure. The resulting multidimensional trajectory is then studied by analyzing its principal components, and the series is approximated using the selected principal components [7,10].
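The one-parameter shift procedure can be sketched as follows. This is a hedged illustration, assuming NumPy; the function name `trajectory_matrix`, the parameter `window` and the sample series are hypothetical and are not taken from the article.

```python
import numpy as np

def trajectory_matrix(series, window):
    """Stack shifted copies of a 1-D series into rows of length `window`."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - window + 1
    if window < 1 or n_rows < 1:
        raise ValueError("window must be between 1 and len(series)")
    return np.array([series[i:i + window] for i in range(n_rows)])

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # illustrative values, not plant data
T = trajectory_matrix(series, window=3)    # 4 rows, each shifted by one step

# Principal components of the multidimensional trajectory via the SVD
# of the centered matrix; the singular values measure each component's weight.
U, s, Vt = np.linalg.svd(T - T.mean(axis=0), full_matrices=False)
```

For this purely linear series the centered trajectory matrix has rank one, so a single principal component reproduces it; real consumption series would require more components.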
The main disadvantages of the principal component method are [5-16,18,23,4]:
- the need for frequent replacement of the regulatory framework due to its rapid aging;
- the method can only work with continuous data.
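The principal component method was first used by Pearson in 1901 to solve the problem of the best approximation of a finite set of points by straight lines and planes. Given a finite set of vectors $x_1, x_2, \dots, x_m \in \mathbb{R}^n$, a problem is posed for each $k = 0, 1, \dots, n-1$. In the formulation of [8, 5, 3], the problem is to find, among all $k$-dimensional linear manifolds in $\mathbb{R}^n$, the manifold $L_k$ for which the sum of squared deviations of the $x_i$ is minimal:

$$\sum_{i=1}^{m} \operatorname{dist}^2(x_i, L_k) \to \min, \qquad (1)$$

where $\operatorname{dist}(x_i, L_k)$ is the Euclidean distance from the point $x_i$ to the linear manifold $L_k$. Any $k$-dimensional linear manifold can be specified as a set of linear combinations. In [17][18][19][20], the problem is solved as follows:

$$L_k = \{ a_0 + \beta_1 a_1 + \dots + \beta_k a_k \mid \beta_i \in \mathbb{R} \},$$

where the parameters $\beta_i$ run over the real line $\mathbb{R}$, $a_0 \in \mathbb{R}^n$, and $\{a_1, \dots, a_k\} \subset \mathbb{R}^n$ is an orthonormal set of vectors:

$$\operatorname{dist}^2(x_i, L_k) = \Big\| x_i - a_0 - \sum_{j=1}^{k} a_j \, (a_j, x_i - a_0) \Big\|^2 ,$$

where $(\cdot,\cdot)$ denotes the Euclidean inner product. The reduced linear manifolds are determined by the orthonormal set of vectors $\{a_1, \dots, a_{n-1}\}$ (the principal component vectors) and by the vector $a_0$:

$$a_0 = \arg\min_{a_0 \in \mathbb{R}^n} \sum_{i=1}^{m} \operatorname{dist}^2(x_i, L_0), \qquad (5)$$

that is,

$$a_0 = \arg\min_{a_0 \in \mathbb{R}^n} \sum_{i=1}^{m} \| x_i - a_0 \|^2 , \qquad (6)$$

which is the sample mean:

$$a_0 = \frac{1}{m} \sum_{i=1}^{m} x_i = \bar{x}.$$

Accordingly, the principal component vectors are defined as solutions of optimization problems of the same type [24, 23, 16]:

1. The data are centered by subtracting the mean: $x_i := x_i - \bar{x}$. After this, the condition $\sum_{i=1}^{m} x_i = 0$ holds.

2. The first principal component is found as the solution to the problem

$$a_1 = \arg\min_{\|a_1\| = 1} \sum_{i=1}^{m} \| x_i - a_1 (a_1, x_i) \|^2 . \qquad (7)$$

If the solution is not unique, one of the solutions is chosen as the final one.

3. The projection onto the first principal component is subtracted from the data:

$$x_i := x_i - a_1 (a_1, x_i).$$

The same two operations, finding a component and subtracting the projection onto it, are then repeated for each subsequent component. At step $2k$, the $k$-th principal component is determined as the solution to the problem [27]:

$$a_k = \arg\min_{\|a_k\| = 1} \sum_{i=1}^{m} \| x_i - a_k (a_k, x_i) \|^2 .$$

If there are several solutions, one of them is chosen as the final one. At each step $(2k-1)$, the projection onto the preceding principal component is subtracted. It should be noted that the vectors found in this way are orthonormal simply as a result of solving the optimization problem. To keep calculation errors from violating the mutual orthogonality of the principal component vectors, the condition $a_k \perp \{a_1, \dots, a_{k-1}\}$ can be included in the conditions of the optimization problem. The ambiguity in determining $a_k$ arises from the arbitrary choice of sign ($a_k$ and $-a_k$ solve the same problem) and, more essentially, from conditions of data symmetry.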
In this case, the last principal component is a unit vector orthogonal to all the previous ones. When using the principal component method, the following operations are performed [3-10,15]:
1. The initial data are approximated by linear manifolds of lower dimension;
2. Orthogonal projections with maximum dispersion (i.e. standard deviation from the mean) are found;
3. Orthogonal coordinates are constructed for the initial data such that the correlation coefficients between the coordinates are equal to zero.
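The sequential procedure described above (centering, finding a component, subtracting its projection, repeating) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes NumPy, uses synthetic data, and solves each minimization by taking the leading eigenvector of the scatter matrix of the current residuals, which is the unit vector that minimizes the summed squared residuals. The function name is hypothetical.

```python
import numpy as np

def principal_components_by_deflation(X, n_components):
    """Find components one at a time, subtracting each projection in turn."""
    R = X - X.mean(axis=0)              # step 1: center the data
    components = []
    for _ in range(n_components):
        # Steps 2, 4, ..., 2k: leading eigenvector of the scatter matrix R^T R
        # (assumed solver for the argmin over unit vectors).
        _, vecs = np.linalg.eigh(R.T @ R)
        a = vecs[:, -1]                 # eigh sorts ascending; take the last
        components.append(a)
        # Steps 3, ..., 2k-1: subtract the projection onto the component.
        R = R - np.outer(R @ a, a)
    return np.array(components)        # rows are the component vectors

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))   # synthetic data
A = principal_components_by_deflation(X, n_components=3)
scores = (X - X.mean(axis=0)) @ A.T     # coordinates in the new axes
```

On such data the rows of `A` come out mutually orthonormal and the columns of `scores` have zero pairwise correlation, matching operations 2 and 3 in the list above. The sign of each component is arbitrary, as noted in the text.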
To solve the problem, an enlarged block diagram of the algorithm for predicting power consumption by the principal component method was compiled (Fig. 1). The following factors were chosen as initial data: Ф1 - furnace load, t; Ф2 - amount of shipped metal, t; Ф3 - weight of loaded metal charge, t; Ф4 - duration of metal smelting, min; Ф5 - duration of work under current, min.
Table 1. Contributions of principal components to the total variance of initial features

Analysis of the individual and cumulative variance shows the percentage of the total variance attributable to each factor (there are 5 factors in this example). The first factor accounts for 69.65% of the total variance, the second for 17.28%, and so on. The variance contributed by each successive factor is shown in Table 1.
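The individual and cumulative shares of variance follow directly from the eigenvalues of the correlation matrix. The sketch below shows the arithmetic; the eigenvalues in it are hypothetical and are not the values behind Table 1, and the function name `variance_shares` is illustrative.

```python
def variance_shares(eigenvalues):
    """Return individual and cumulative shares of total variance, in percent."""
    total = sum(eigenvalues)
    shares = [100.0 * ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative

# Hypothetical eigenvalues of a 5 x 5 correlation matrix (they sum to 5,
# the number of features); NOT the eigenvalues behind Table 1.
eigenvalues = [3.0, 1.0, 0.5, 0.3, 0.2]
shares, cumulative = variance_shares(eigenvalues)
```

Applied to the shares quoted in the text, the first two components together account for 69.65% + 17.28% = 86.93% of the total variance.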
Using the principal component method, we obtain the primary matrix of factor loadings. In parallel, the eigenvalues of the correlation matrix should be analyzed. If the feature space is reduced to two principal components, only the first two columns of the factor loading matrix are taken into account.
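A factor loading matrix of this kind can be computed as in the following sketch. It assumes NumPy and the common convention that a loading is an eigenvector entry of the correlation matrix scaled by the square root of its eigenvalue; the data are synthetic and the variable names are illustrative, so the numbers do not reproduce the article's matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in for a 100 x 5 table of initial features (not plant data).
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)     # center and normalize
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]                    # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Column j of the eigenvector matrix scaled by sqrt(eigenvalue j).
loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
scores = Z @ eigvecs                                 # component values per row

# Under this convention each loading equals the correlation between
# an initial feature and a principal component.
corr_check = np.array([[np.corrcoef(Z[:, i], scores[:, j])[0, 1]
                        for j in range(5)] for i in range(5)])

loadings_2 = loadings[:, :2]     # keep only the first two columns
```

Because the features are normalized, the squared loadings in each row sum to one when all five components are kept; truncating to two columns keeps only the part of each feature's variance carried by the first two components.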
The dependence of the principal components on the centered and normalized initial features is expressed by (12) and (13).

Next, the matrix of factor loadings is analyzed to determine the new features. The correlation coefficients between the initial features and the principal components make up this matrix and form the basis of these calculations. The initial features whose correlation coefficients exceed 0.7 depend strongly on the first principal component: Ф1 - furnace load, t; Ф2 - weight of shipped metal, t; Ф3 - mass of the loaded metal charge, t. The second principal component is closely related to the following initial features: Ф1 - furnace load, t; Ф4 - duration of metal smelting, min. Based on the main factors determined in this way, a linear regression model (14) is built.
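The selection by the 0.7 threshold and the subsequent regression can be sketched as follows. This is not the authors' model (14): it assumes NumPy, synthetic data in place of the 100 heats, and an ordinary least-squares regression of consumption on the values of the first two principal components, which is one possible reading of "linear regression on the main factors".

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100                                   # synthetic stand-in for 100 heats
base = rng.normal(size=(n, 2))            # two hidden drivers (assumption)
X = np.column_stack([
    base[:, 0] + 0.1 * rng.normal(size=n),
    base[:, 0] + 0.1 * rng.normal(size=n),
    base[:, 0] + 0.1 * rng.normal(size=n),
    base[:, 1] + 0.1 * rng.normal(size=n),
    base[:, 1] + 0.1 * rng.normal(size=n),
])
y = 2.0 * base[:, 0] + 1.0 * base[:, 1] + 0.05 * rng.normal(size=n)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)     # center and normalize
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2                                                # components retained
scores = Z @ eigvecs[:, :k]
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
strong = np.abs(loadings) > 0.7       # features tied closely to a component

design = np.column_stack([np.ones(n), scores])       # intercept + two components
coef, *_ = np.linalg.lstsq(design, y, rcond=None)    # least-squares fit
y_hat = design @ coef                                # fitted consumption
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

The boolean matrix `strong` plays the role of the 0.7 rule in the text: row i, column j is true when feature i is closely related to component j. The absolute value is used because the sign of a component is arbitrary.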

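4. The second principal component is determined as the solution to the problem

$$a_2 = \arg\min_{\|a_2\| = 1} \sum_{i=1}^{m} \| x_i - a_2 (a_2, x_i) \|^2 , \qquad (9)$$

where $x_i$ are the data vectors with the earlier projections already subtracted and $(\cdot,\cdot)$ is the Euclidean inner product. Next, at step $2k-1$, the projection onto the $(k-1)$-th principal component is subtracted [25-27]:

$$x_i := x_i - a_{k-1} (a_{k-1}, x_i). \qquad (10)$$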

Fig. 1. Enlarged block diagram of the algorithm for predicting power consumption by the principal component method

According to (14), the predicted values of the object under study are determined and compared with the actual data (Fig. 2).

Fig. 2. Comparison of the actual and forecast power consumption according to the developed model using the method of principal components

To assess the adequacy of the forecasting model, the modeling results are compared with the actual data. The results are shown graphically in Fig. 3.

Fig. 3. Graph of the error between the actual and predicted values of power consumption according to the developed model using the method of principal components
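A comparison of actual and predicted values of this kind is usually summarized by an error measure. The article does not state which measure was used, so the sketch below assumes the signed relative error per heat and its mean absolute value (MAPE); the function names and the sample values are hypothetical.

```python
def relative_errors(actual, predicted):
    """Signed error of each forecast, in percent of the actual value."""
    if len(actual) != len(predicted):
        raise ValueError("series must have equal length")
    return [100.0 * (p - a) / a for a, p in zip(actual, predicted)]

def mape(actual, predicted):
    """Mean absolute percentage error over all observations."""
    errs = relative_errors(actual, predicted)
    return sum(abs(e) for e in errs) / len(errs)

# Hypothetical consumption values in arbitrary units, NOT the article's data.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
errors = relative_errors(actual, predicted)   # one value per heat
score = mape(actual, predicted)               # single summary figure
```

For these illustrative values the per-heat errors are 10%, -5% and 0%, giving a mean absolute percentage error of 5%.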