Net demand short-term forecasting in a distribution substation with PV power generation

The integration of renewable energies, specifically solar energy, into electric distribution systems is increasingly common. For optimal operation, it is very important to forecast the final net demand of the power distribution network, considering the variability of solar energy combined with the variability of the electricity consumption habits of the population. This paper presents the methodology followed to forecast the net demand in a power distribution substation. Two approaches are considered: direct prediction of the net demand, and indirect prediction from the forecasts of PV power generation and load demand. Artificial Neural Network (ANN) based models and autoregressive models with exogenous variables (ARX) are used to predict the net demand, directly and indirectly, for the 24 hours of the day ahead. The methodology is applied to a medium voltage distribution substation and the direct and indirect forecasts are compared.


Introduction
Renewable energies are needed to reduce the greenhouse effect and to address the problems of fossil fuel supply. One of the most widely used renewable energy sources (RES) is solar photovoltaic (PV) energy, because it is unlimited, clean, ecological, dispersed and free. PV generation converts sunlight into electricity through solar panels. However, the integration of PV power into electric grids poses some difficulties. For optimal operation of the power system it is fundamental to know the PV power generated at every moment, especially if the penetration of PV power is significant. This need has driven the development of short-term forecasting models for photovoltaic generation [1].
On the one hand, the electricity market and the smart grid require accurate forecasting tools to carry out better demand side management tasks and to make more effective economic decisions. On the other hand, the electric net demand is difficult to predict because it depends on factors such as weather variables, social habits and seasonal characteristics [2].
According to the forecasting horizon, forecasts can be classified into four categories: long-term (3 to 50 years), medium-term (2 weeks to 3 years), short-term (an hour, a day or a week) and very short-term (5 minutes to 1 hour). Short-term load forecasting (STLF) is a decisive tool to ensure the balance between generation and demand [3] and to reduce the risk in power system planning and operational decisions [4]. Several researchers have published short-term forecasting models applied to PV plants, load demand or net demand. They use different techniques, including artificial neural network (ANN) based models and autoregressive (AR) based models [5,6].
STLF has been identified as a suitable tool to address the problems derived from the integration of green (and highly variable) energy sources into electric power systems. In this context, the Net Demand (ND) is defined as the electric load minus the renewable generation. The ND can be predicted indirectly, that is, by subtracting the renewable generation forecast from the electric load demand forecast, or it can be forecast directly. Bagheri et al. [7] carried out tests with four weeks of data using five models: persistence, multilayer perceptron (MLP), radial basis function neural network (RBFNN), wavelet neural network (WNN) and three-phase Cascade Neural Network (CNN). The comparison of the forecasting results led to the conclusion that direct ND forecasting achieves lower errors than indirect forecasting. They applied their methodology to practical power networks (regional or national power systems) such as those of Alberta (Canada) and Ireland.
In this regard, van de Meer et al. [8] examined the difference between direct and indirect ND forecasts using static and dynamic Gaussian Processes (GP) on a set of residential consumers (300 customers) with half-hourly data. The authors evaluated the performance of different models and concluded that there is no single best method (direct or indirect) that can be applied at any location and under any circumstance.
In this paper we define and apply a methodology to predict the ND, directly or indirectly, using ANN based models and autoregressive models with exogenous variables (ARX) for an electric distribution substation with significant PV power penetration. The paper is structured as follows: Section 2 describes the methodology, Section 3 presents the results obtained with the proposed methodology and, finally, Section 4 presents the conclusions.

Direct prediction of net demand
We define the ND at the connection point of the electric substation as the difference between the load demand (electricity consumption) and the generation of PV systems connected to this substation.
Three time series with mean hourly values can be identified: the load demand, the PV power generation and the net demand. Given the definition above, the availability of any two of the time series is sufficient, as the third can be calculated as the sum or difference of the other two. Figure 1 plots an example with 14 days of the three time series. The three electric time series (ND, PV power and load demand) need to be complemented with other series corresponding to possible explanatory variables. The most commonly used explanatory variables are weather variables, because they are directly related to the electric variables (for example, solar irradiance to PV power, or temperature to load demand). In order to develop an operational forecasting model for the next day, the explanatory variables must be forecasts of the most important weather variables. These forecasts can be obtained from Numerical Weather Prediction (NWP) models, which estimate the future values of the weather variables from initial conditions by solving the equations that govern the dynamic behaviour of the atmosphere. NWP models usually work with grids of points, so the weather forecasts for the location of interest must be obtained by interpolating the forecasts for the points of the analysis grid nearest to the location of the PV plant and consumers.
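The relation between the three series can be sketched in a few lines of Python. This is a minimal illustration, assuming hourly values held in plain lists; the function names and sample values are hypothetical and do not come from the substation data.

```python
def net_demand(load, pv):
    """Net demand = load demand minus PV generation, hour by hour."""
    if len(load) != len(pv):
        raise ValueError("series must have the same length")
    return [l - p for l, p in zip(load, pv)]


def load_from_net_demand(nd, pv):
    """Recover load demand when only net demand and PV power are measured."""
    if len(nd) != len(pv):
        raise ValueError("series must have the same length")
    return [n + p for n, p in zip(nd, pv)]


# Hypothetical hourly values in MW (illustration only).
load_mw = [3.0, 3.5, 4.0, 4.5]
pv_mw = [0.0, 0.5, 1.5, 1.0]

nd_mw = net_demand(load_mw, pv_mw)
recovered_load_mw = load_from_net_demand(nd_mw, pv_mw)
```

The second function corresponds to the situation where the measured series are the PV power and the ND, so that the load demand has to be derived from them.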
In our work, we predict the ND for the 24 hours of the next day using forecasting models based on two different techniques: ANN and ARX. The selected ANN model is the multilayer perceptron (MLP) with one hidden layer since, despite its simplicity, it performs effectively in most forecasting applications. In order to select a proper forecasting model, MLPs with different numbers of neurons in the hidden layer must be trained. The data available for the training process are randomly divided into two sets: the first, with 80% of the data, is used as the training data set, and the second, with the remaining 20%, is used as a validation data set. The different MLP models are trained using the back-propagation with momentum algorithm, stopping the iterations (epochs) when the error on the validation data set begins to increase. The selected MLP model (with its specific number of neurons in the hidden layer) is the one that achieves the lowest average error on the complete set (training and validation data sets).
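The selection procedure described above can be sketched as follows. This is a simplified, standard-library illustration and not the implementation used in the study: the tanh activation, learning rate, momentum value, epoch limit, candidate hidden-layer sizes and the synthetic data are all assumptions made for the example.

```python
import math
import random


def forward(model, x):
    """One tanh hidden layer and a linear output; last weight of each row is the bias."""
    w1, w2 = model
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w1]
    out = sum(w * h for w, h in zip(w2, hidden)) + w2[-1]
    return hidden, out


def rmse(model, data):
    return math.sqrt(sum((forward(model, x)[1] - y) ** 2 for x, y in data) / len(data))


def train(n_hidden, train_set, val_set, rng, lr=0.01, momentum=0.9, max_epochs=200):
    """Back-propagation with momentum; stop when the validation error rises."""
    n_in = len(train_set[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    v1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]  # momentum terms
    v2 = [0.0] * (n_hidden + 1)
    best_val, best_model = float("inf"), None
    for _ in range(max_epochs):
        for x, y in train_set:
            hidden, out = forward((w1, w2), x)
            err = out - y
            for j in range(n_hidden):
                # Error propagated back through tanh (w2 not yet updated).
                delta = err * w2[j] * (1.0 - hidden[j] ** 2)
                for k, xk in enumerate(list(x) + [1.0]):
                    v1[j][k] = momentum * v1[j][k] - lr * delta * xk
                    w1[j][k] += v1[j][k]
            for j, hj in enumerate(hidden + [1.0]):
                v2[j] = momentum * v2[j] - lr * err * hj
                w2[j] += v2[j]
        val = rmse((w1, w2), val_set)
        if val >= best_val:
            break  # early stopping: validation error no longer decreases
        best_val = val
        best_model = ([row[:] for row in w1], w2[:])
    return best_model


def select_mlp(data, hidden_sizes, seed=0):
    """Random 80/20 split, one MLP per hidden size, keep the lowest overall RMSE."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    train_set, val_set = shuffled[:cut], shuffled[cut:]
    best = None
    for n_hidden in hidden_sizes:
        model = train(n_hidden, train_set, val_set, rng)
        score = rmse(model, data)  # error on training + validation data
        if best is None or score < best[0]:
            best = (score, n_hidden, model)
    return best


# Synthetic data (illustration only): two inputs in [0, 1] and a smooth target.
data_rng = random.Random(1)
samples = []
for _ in range(200):
    a, b = data_rng.random(), data_rng.random()
    samples.append(([a, b], math.sin(3 * a) + 0.5 * b))

score, n_hidden, model = select_mlp(samples, hidden_sizes=[3, 5, 8])
```

Stopping at the first rise of the validation error follows the description literally; a practical implementation might tolerate a few non-improving epochs before stopping.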
The Root Mean Square Error, RMSE (1), and the Mean Absolute Percentage Error, MAPE (2) were used to assess the forecasting error (difference between real and forecast value).
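In their standard form, which is assumed here to correspond to equations (1) and (2), the two metrics can be written as:

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}
  \left(ND_{\mathrm{forecast},i}-ND_{\mathrm{actual},i}\right)^{2}} \tag{1}\\
\mathrm{MAPE} &= \frac{100}{n}\sum_{i=1}^{n}
  \left|\frac{ND_{\mathrm{actual},i}-ND_{\mathrm{forecast},i}}{ND_{\mathrm{actual},i}}\right| \tag{2}
\end{align}
```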
where ND_{forecast,i} is the ND forecast for hour i, ND_{actual,i} is the actual value of the ND in hour i, and n is the total number of cases (hours).
The direct forecasting model predicts the future ND in the substation for hour h of the next day (d+1). The input variables for the MLP and ARX models are: past values (ND in hour h of day d-1 and of day d-7), seven dummy variables to code the type of day d+1 (Monday, Tuesday-Wednesday-Thursday, Friday, Saturday, Sunday, regional or national festivity, and local festivity), and forecasted values for hour h of day d+1 of the selected weather variables (temperature, global horizontal irradiance, pressure, wind speed, wind direction, relative humidity, cloud fraction and rainfall) [9].
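The input vector of the direct model can be assembled as in the sketch below. It assumes one 0/1 dummy per day type and dictionary-based storage; the function name, identifiers and sample values are hypothetical, and only the list of variables comes from the description above.

```python
DAY_TYPES = [
    "monday", "tue_wed_thu", "friday", "saturday", "sunday",
    "regional_or_national_festivity", "local_festivity",
]
WEATHER_VARS = [
    "temperature", "global_horizontal_irradiance", "pressure", "wind_speed",
    "wind_direction", "relative_humidity", "cloud_fraction", "rainfall",
]


def direct_model_inputs(nd_history, day, hour, day_type, weather_forecast):
    """Build the inputs for hour `hour` of day d+1, where `day` is d.

    nd_history[(day, hour)] holds the measured net demand.
    weather_forecast[name] holds the NWP forecast for that hour of day d+1.
    """
    if day_type not in DAY_TYPES:
        raise ValueError(f"unknown day type: {day_type}")
    # Past values: same hour on days d-1 and d-7.
    lags = [nd_history[(day - 1, hour)], nd_history[(day - 7, hour)]]
    # Seven dummy variables, exactly one of them set to 1.
    dummies = [1.0 if day_type == t else 0.0 for t in DAY_TYPES]
    weather = [weather_forecast[name] for name in WEATHER_VARS]
    return lags + dummies + weather


# Hypothetical values (illustration only).
nd_history = {(9, 12): 2.4, (3, 12): 2.1}
weather = {name: 0.0 for name in WEATHER_VARS}
weather["temperature"] = 18.5

x = direct_model_inputs(nd_history, day=10, hour=12,
                        day_type="friday", weather_forecast=weather)
```

The resulting vector has 2 + 7 + 8 = 17 entries; the same layout would serve the load demand model if the ND history were replaced by the load demand history.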

Indirect prediction of net demand
In this case, we predict the ND as the difference between the forecast of the load demand and the forecast of the PV power. The forecasting models should be developed with the same techniques as those used for the direct prediction in order to compare results. Therefore, MLP models and ARX models have been applied in both cases (direct and indirect prediction).
[Figure 1: load demand, PV power generation and net demand time series over 14 days; horizontal axis in hours.]
Two forecasting models must therefore be developed for the indirect prediction of the net demand. The first forecasting model predicts the hourly PV power value for hour h of the day ahead (d+1) and is based only on forecasts of weather variables for hour h of day d+1. The second forecasting model, which predicts the hourly load demand for hour h of day d+1, is based on past values (load demand in hour h of days d-1 and d-7), the type of day (the seven dummy variables explained above) and forecasted values for hour h of day d+1 of the selected weather variables.
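The indirect strategy can be sketched with two linear, ARX-style predictors whose outputs are subtracted. The coefficients below are hypothetical placeholders rather than fitted values, and only two weather variables are used to keep the example short.

```python
def linear_forecast(model, inputs):
    """ARX-style prediction: intercept plus a weighted sum of the inputs."""
    coeffs, intercept = model
    if len(coeffs) != len(inputs):
        raise ValueError("one coefficient per input is required")
    return intercept + sum(c * x for c, x in zip(coeffs, inputs))


def indirect_nd_forecast(pv_model, load_model, weather, load_lags, day_dummies):
    """Net demand forecast = load demand forecast minus PV power forecast."""
    pv = linear_forecast(pv_model, weather)  # weather forecasts only
    load = linear_forecast(load_model, load_lags + day_dummies + weather)
    return load - pv


# Hypothetical inputs for one hour of day d+1 (illustration only).
weather = [500.0, 20.0]                # irradiance (W/m2), temperature (deg C)
load_lags = [3.0, 3.2]                 # load in hour h of days d-1 and d-7 (MW)
monday = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Hypothetical (coefficients, intercept) pairs, not fitted to any data.
pv_model = ([0.002, 0.0], 0.0)
load_model = ([0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0, -0.2, -0.2, 0.0, 0.01], 0.2)

nd_forecast = indirect_nd_forecast(pv_model, load_model, weather, load_lags, monday)
```

With these placeholder values the load forecast is 3.26 MW and the PV forecast is 1.0 MW, giving an ND forecast of about 2.26 MW.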

Case study
The proposed methodology has been applied to the data from a 66/13.2 kV electric substation located in the north of Spain, which feeds around three thousand electric consumers (domestic, commercial and industrial loads) and connects to the grid a PV plant with a capacity of nearly 2 MW. The PV plant is composed of approximately 150 two-axis solar trackers. The available data correspond to the time series of hourly PV power generation and hourly ND for 30 months. These series have been completed with forecasts of the set of selected weather variables obtained with an NWP model for each hour included in the time series. The weather forecasts correspond to the values obtained from the NWP model in the early hours of the previous day (around 6:00 a.m.). The forecasts for the four nearest points of the analysis grid used by the NWP model were interpolated to obtain the weather forecasts for the location of the electric substation (very near to the consumers).
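The interpolation scheme is not specified; one common choice for four surrounding grid points is bilinear interpolation, sketched below under that assumption. The coordinates and temperature values are hypothetical.

```python
def bilinear_interpolate(x, y, x0, x1, y0, y1, f00, f10, f01, f11):
    """Interpolate a forecast at (x, y) from the four surrounding grid points.

    f00 is the value at (x0, y0), f10 at (x1, y0),
    f01 at (x0, y1) and f11 at (x1, y1).
    """
    if x0 == x1 or y0 == y1:
        raise ValueError("grid cell must have non-zero width and height")
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        raise ValueError("target point lies outside the grid cell")
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    # Interpolate along x on both edges, then along y between them.
    lower = f00 * (1.0 - tx) + f10 * tx
    upper = f01 * (1.0 - tx) + f11 * tx
    return lower * (1.0 - ty) + upper * ty


# Hypothetical temperature forecasts (deg C) at the corners of a unit cell.
centre = bilinear_interpolate(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 10.0, 12.0, 14.0, 16.0)
corner = bilinear_interpolate(0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 10.0, 12.0, 14.0, 16.0)
```

At the centre of the cell the result is the average of the four corner values, and at a grid point the interpolated value equals the forecast of that point.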
The data were initially divided into two sets: the first one corresponded to the first 24 months and was used as the training data set (for the MLP models) or adjustment data set (for the ARX models), and the second one included the data of the last six months and was used as the testing data set.
Table 1 shows the number of neurons in the hidden layer for the selected MLP forecasting models. As mentioned in Section 2, we trained MLP models with between 3 and 40 neurons in the hidden layer, selecting the model that achieved the lowest RMSE on the whole training data set.
Table 2 shows the variables used in the selected ARX models. As mentioned in Section 2, the PV forecasting ARX model only uses forecasted values of weather variables, whereas the load demand forecasting model and the ND forecasting model also use past values and the seven dummy variables. Note that some of the selected input variables influence the output of some models (marked x) but not of others (marked 0).
Table 3 shows the forecasting errors obtained on the whole testing data set with the different forecasting models and with the two prediction strategies analysed (direct and indirect forecasting). As shown in this table, the forecasting model with the lowest RMSE and MAPE is an MLP. In addition, the results in Table 3 show that the better strategy for ND prediction is the indirect method.
Figure 2 and Figure 3 show the hourly forecasts and actual values of the net demand in a week of autumn 2010 (from Sunday to Saturday), using the direct and the indirect methods with the selected ARX model. Figure 4 and Figure 5 show the hourly forecasts and actual values of the ND in the same week, using the direct and the indirect methods with the selected MLP model. A comparison of these figures shows that indirect prediction of the ND with the MLP model (Figure 5) is the method that achieves the best results.
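The kind of error comparison reported in Table 3 can be outlined with the two metrics in their standard form. The series below are made-up numbers chosen for easy arithmetic; they do not reproduce the testing data or the results of the study.

```python
import math


def rmse(actual, forecast):
    """Root Mean Square Error over paired hourly values."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent; actual values must be non-zero."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical hourly net demand values in MW (illustration only).
actual = [2.0, 4.0, 5.0, 4.0]
forecast_a = [2.5, 3.0, 5.5, 4.0]
forecast_b = [2.0, 3.5, 5.0, 4.5]

errors = {
    "forecast_a": (rmse(actual, forecast_a), mape(actual, forecast_a)),
    "forecast_b": (rmse(actual, forecast_b), mape(actual, forecast_b)),
}
```

Because the ND can approach zero when PV generation nearly matches the load, the MAPE can become very large for individual hours, which is why the sketch rejects zero actual values explicitly.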

Conclusions
A forecasting methodology for the net demand in an electric distribution substation has been presented in this paper. The correct prediction of the ND for the day ahead is important in order to operate the distribution network and to participate in the electricity market. Two different forecasting methods are explored: the first is the direct prediction of the ND, and the second is the indirect prediction using the forecasts of the PV power generated and of the power consumption. Both methods use weather forecasts as explanatory variables, together with previous values of the predicted variable and dummy variables for the type of day (except for the PV power model, which uses only weather forecasts), and both use the same techniques to build the forecasting models (ARX and MLP based models). The methodology is applied to an actual electric substation which feeds different types of electric consumers and connects a PV generation plant to the electric network. The results show that, in this case study, indirect prediction achieves lower forecasting errors than direct prediction.