The Long-Term Optimization Model of Pumped-Hydro Power Storage System Based on Approximate Dynamic Programming

Based on the hypothesis that a pumped storage power station is available for multi-day optimization and adjustment, this paper proposes a long-term operation optimization model of a pumped-hydro power storage (PHPS) system based on approximate dynamic programming (ADP). In this multistage decision model, a value function approximation (VFA) of the reservoir energy storage is used across stages to preserve the overall optimization characteristics; within each stage, the generated energy and generating periods, together with the electricity consumed for pumping and the pumping periods, are used as decision variables for daily operation optimization. An approximately optimal solution is obtained by iterating between the decision variables and the value function, which avoids the “curse of dimensionality” of conventional multistage decision models. The experiments indicate that the ADP-based model can describe the long-term operation modes of a pumped storage power station accurately, and that its solution method is better suited to this kind of large-scale optimization decision problem than conventional mathematical programming methods.


1 Introduction
A PHPS system, characterized by large capacity, high efficiency and flexible start-up, is a typical energy storage facility. Besides providing good peak-load regulation and reserve capacity, it can exploit the difference between peak and valley electricity prices to generate economic benefits through the pumping-generating cycle.
In general, a PHPS system operates by using the excess power generation at times of low electrical demand to pump water to a reservoir at a higher elevation. When there is a higher demand for power, water is released back into the lower reservoir through a hydraulic turbine which generates electricity that can be run through the grid to satisfy the peaks during high load demand.
With growing power grid operation demands and improving engineering technology, more large-scale pumped storage power stations will be constructed, and increasing attention will be paid to optimizing their operation [1][2][3][4]. Chen Xueqing analyzed the peak load shifting effect of the PHPS station in [1], while Wang Minyou analyzed and estimated its static and dynamic benefits in [2]. Li Wenwu established a long-term operation model that takes the random inflow process into consideration and analyzed it with a stochastic programming method in [3]. Qiu Wenqian [4] first set up a daily optimization model for pumped storage power stations and then, on the basis of that daily model, proposed a Dynamic Programming (DP) model for multi-day optimal operation to provide a reference for the optimal operation of such stations.
Bellman's Principle of Optimality guarantees optimality after staged decomposition, but exact DP recursion easily runs into the "curse of dimensionality". Professor Powell of Princeton University summarized the research results in stochastic programming and DP and proposed the modeling framework and solution ideas of Approximate Dynamic Programming (ADP) [5], which deals with programming problems that combine deterministic and stochastic, continuous and discrete elements. To solve stochastic dynamic programming problems effectively, ADP realizes the random variables before each staged decision within a multi-stage decision model and uses an approximate structure to approach the exactly computed value function of DP. The effective solution of many problems, such as dynamic resource optimization [6], large-scale locomotive dispatching [7] and energy planning [8], indicates the application prospects of ADP.
With the help of ADP, this paper builds a staged model for the long-term operation of pumped storage power stations. To avoid the "curse of dimensionality" in multi-stage optimization, the solution is obtained with a value function approximation strategy. Through an optimization simulation of the operation plan of the PHPS, the paper analyzes the stability, optimization effect and computational efficiency of the ADP modeling and solution method, as well as the relationship between a PHPS and the system's power generation cost, power source structure and load features, thus providing a reference for determining a reasonable scale and operation plan of PHPS stations within the grid.
The paper is organized as follows. First, the daily and multi-day optimal operation is modeled on the basis of the economic benefits of the PHPS. Second, following ADP, a value function approximation method is used to overcome the "curse of dimensionality" of the DP-based multi-day optimal operation problem. Next, numerical results are given and the characteristics of the model are briefly analyzed. The final section gives a summary.

2.2 Constraint conditions
Energy balance equation:

where R_t is the energy stored in the upper reservoir at the beginning of period t, and η_i is the pumping-generating conversion efficiency of unit i.

Constraints on reservoir storage:

where R_min and R_max are the lower and upper bounds on the energy stored in the upper reservoir, and r_ini and r_end are the initial and final stored energy of the upper reservoir.

Constraints on the reversible units: a hybrid pumped storage unit is treated as a generator and a pump. Binary (0-1) variables are introduced to represent the unit status, and the exact optimization model for reversible units is established on the basis of the unit commitment model in [9].

Equality constraints (5)-(6) are the generator and pump capacity constraints. The inequality constraints in (7) ensure that a hybrid unit can operate in only one mode at a time. Equality constraints (8)-(9) ensure that a unit cannot be turned on and turned off at the same time. Equality constraints (10)-(11) are the minimum on-time and off-time constraints.
The model above is a mixed-integer linear program (MILP) over a 24-hour period. It is solved with a commercial software package (GAMS/CPLEX [10], [11]).
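As an illustration of the energy balance and storage bounds described in this section, the following Python sketch advances the upper-reservoir energy by one period. It is a sketch under assumptions, not the paper's formulation: the names `Unit` and `step_storage` are hypothetical, the efficiency is assumed to apply on the pumping side, and each unit is assumed to run at rated power whenever it is on.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    eta: float      # pumping-generating conversion efficiency (assumed per unit)
    pump_mw: float  # rated pumping power, MW
    gen_mw: float   # rated generating power, MW


def step_storage(r_t, pumping, generating, units, r_min, r_max, dt_h=1.0):
    """Return the stored energy (MWh) at the start of the next period.

    pumping / generating are 0-1 status lists, one entry per unit.
    Assumed balance: R_{t+1} = R_t + dt * sum_i (eta_i * P_pump_i * u_i - P_gen_i * g_i).
    """
    for u_on, g_on in zip(pumping, generating):
        if u_on + g_on > 1:
            raise ValueError("a unit cannot pump and generate in the same period")
    delta = sum(
        dt_h * (unit.eta * unit.pump_mw * u_on - unit.gen_mw * g_on)
        for unit, u_on, g_on in zip(units, pumping, generating)
    )
    r_next = r_t + delta
    if not (r_min <= r_next <= r_max):
        raise ValueError("storage bounds violated")
    return r_next
```

In the MILP itself these checks appear as linear constraints rather than exceptions; the function only shows how the quantities relate to one another.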

3 ADP-based solution
3.1 ADP-based Operation Optimization Model of PHPS
ADP uses approximate substitution to address the "curse of dimensionality" associated with the state, decision and random variables in DP and stochastic dynamic programming (SDP). Its main ideas are:
1) The concepts and notation of DP are used to describe the programming problem, and Bellman's Principle of Optimality is followed for optimization in staged decision-making.
2) The optimization strategy is open to methods from other fields, such as random search, simulation and optimization, and reinforcement learning, which can be used to select the decision strategy. This broadens the range of application of ADP.
ADP provides a powerful framework for solving DP and SDP problems. After the problem is modeled in the DP framework, a better decision sequence is obtained through iteration and updating.
For the DP model of the PHPS system considered here, we define the state space S_j as follows: The problem has to be solved subject to two sets of constraints. The first set governs the decisions made at each point within a day and is given by Equations (2)-(11). The second set consists of the transition equations that link activities over time.
As a dynamic program, we solve a decision problem for each day j. We first define the value V_j(S_j) of a state S_j as the sum of the contributions that we expect to make, starting in state S_j, if we act optimally until the end of the time horizon. Bellman's equation enables us to compute recursively the optimal value function associated with each state, as shown in (14).
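The recursion described above can be sketched as exact backward induction over a discretized storage level. This is a minimal sketch under assumptions: `backward_dp` and `make_toy_contribution` are hypothetical names, the daily contribution is a toy price-times-release function standing in for the daily MILP, and the state is reduced to a single storage level. The nested loops over all pairs of levels on every day are what become expensive as the state space grows.

```python
def backward_dp(levels, n_days, contribution, end_level):
    """Exact DP: V_j(S) = max over S' of [ C_j(S, S') + V_{j+1}(S') ].

    contribution(j, s, s_next) returns the day-j contribution, or None if
    the transition is infeasible. Only end_level is allowed after the last day.
    """
    neg_inf = float("-inf")
    values = [dict() for _ in range(n_days + 1)]
    policy = [dict() for _ in range(n_days)]
    values[n_days] = {s: (0.0 if s == end_level else neg_inf) for s in levels}
    for j in range(n_days - 1, -1, -1):
        for s in levels:
            best, best_next = neg_inf, None
            for s_next in levels:
                c = contribution(j, s, s_next)
                if c is None:
                    continue
                candidate = c + values[j + 1][s_next]
                if candidate > best:
                    best, best_next = candidate, s_next
            values[j][s] = best
            policy[j][s] = best_next
    return values, policy


def make_toy_contribution(prices, max_step=1):
    """Toy stand-in for the daily model: releasing storage earns the day's
    price, adding storage costs it; at most max_step levels per day."""
    def contribution(j, s, s_next):
        if abs(s_next - s) > max_step:
            return None
        return prices[j] * (s - s_next)
    return contribution
```

With two days priced 1 and 3, three storage levels and the requirement to return to the starting level, the recursion chooses to store on the cheap day and release on the expensive one.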

3.2 Value Function Approximation
Suppose now that we want to estimate a function V(R) that gives the value of having R resources. It can be shown that V(R) is concave in R, so it can be approximated by separable, piecewise linear functions. It is important to recognize that we are only concerned with the derivative of V(R) rather than its actual value. , where M is the number of stages.
Following the theory of stochastic dual dynamic programming [13][14][15], the approximation function is updated from sampled values of the value function. The samples are obtained and updated as follows: with respect to the reservoir storage R_t, the state transition function yields (13)-(15). Equation (13) indicates that the derivative of the approximation function with respect to the pre-decision storage R_t is equal to its derivative with respect to the post-decision storage R^x_{t-1} of the previous period. Equation (14) indicates that the derivative of the approximation function with respect to R_t is equal to the marginal value of the storage, which can be obtained by solving the storage balance equation. Equation (15) is used to update the approximation function at the nth iteration for period t-1.
After the piecewise linear approximation is applied to the value function of the reservoir, equation (12) is transformed into the decision problem of equation (16). In other words, it becomes a problem of optimal resource allocation within the period, which is small in size and easy to solve. By iterating between the decisions and the value function, the approximately optimal decision sequence and the approximately optimal solution are obtained.
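A concave piecewise linear approximation can be stored as a list of non-increasing slopes, one per storage segment, which matches the remark that only the derivative of V(R) matters. The sketch below is illustrative only: the function names, the equal segment width and the example slopes are assumptions, not values from the paper.

```python
def pwl_value(r, slopes, seg_width):
    """Evaluate V(r) by filling segments in order: each segment contributes
    its slope times the amount of r that falls inside it."""
    value, remaining = 0.0, r
    for slope in slopes:
        take = min(seg_width, max(remaining, 0.0))
        value += slope * take
        remaining -= take
    return value


def marginal_value(r, slopes, seg_width):
    """Slope of the segment containing r (the last segment at the upper bound)."""
    index = min(int(r // seg_width), len(slopes) - 1)
    return slopes[index]


def is_concave(slopes):
    """Concavity of a piecewise linear function means non-increasing slopes."""
    return all(a >= b for a, b in zip(slopes, slopes[1:]))
```

Storing slopes rather than function values keeps the within-stage problem linear: each segment can enter the daily optimization as a bounded variable with a constant coefficient.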
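One way to realize such an update is to smooth the sampled marginal value into the slope of the visited segment and then restore concavity. The sketch below assumes a constant step size `alpha` and a simple leveling step of the kind used in the ADP literature; the paper's equation (15) is not reproduced here, so the exact rule is an assumption, and `update_slopes` is a hypothetical name.

```python
def update_slopes(slopes, k, v_hat, alpha):
    """Blend the sampled marginal value v_hat into segment k, then level
    neighbouring slopes so the list stays non-increasing (concave)."""
    new = list(slopes)
    new[k] = (1.0 - alpha) * slopes[k] + alpha * v_hat
    # Segments to the left must not have a smaller slope than segment k.
    for i in range(k):
        if new[i] < new[k]:
            new[i] = new[k]
    # Segments to the right must not have a larger slope than segment k.
    for i in range(k + 1, len(new)):
        if new[i] > new[k]:
            new[i] = new[k]
    return new
```

The function returns a new list, so the slopes of iteration n-1 remain available for comparison when checking convergence.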
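The iteration between decisions and value function can be sketched as repeated forward passes over the days. Everything in this sketch is an assumption made for illustration: integer storage levels, a toy price-based daily contribution in place of the daily MILP, a marginal value sampled by re-solving the day with one extra unit of storage, plain smoothing without a concavity correction, and zero value for storage left after the last day.

```python
def solve_day(price, s, slopes_j, max_level):
    """Small within-day problem: choose the next storage level (at most one
    level away) maximizing contribution plus approximate future value."""
    best, best_next = float("-inf"), s
    for s_next in (s - 1, s, s + 1):
        if 0 <= s_next <= max_level:
            value = price * (s - s_next) + sum(slopes_j[:s_next])
            if value > best:
                best, best_next = value, s_next
    return best, best_next


def adp(prices, s0, max_level, n_iter=20, alpha=0.5):
    """Forward-pass ADP. slopes[j][k] approximates the value of holding the
    (k+1)-th unit of storage after the decision of day j."""
    n_days = len(prices)
    slopes = [[0.0] * max_level for _ in range(n_days)]
    path = [s0]
    for _ in range(n_iter):
        s, path = s0, [s0]
        for j in range(n_days):
            objective, s_next = solve_day(prices[j], s, slopes[j], max_level)
            if j > 0 and s < max_level:
                # Sampled marginal value of one more unit before today's decision,
                # credited to the post-decision state of the previous day.
                objective_up, _ = solve_day(prices[j], s + 1, slopes[j], max_level)
                v_hat = objective_up - objective
                slopes[j - 1][s] = (1.0 - alpha) * slopes[j - 1][s] + alpha * v_hat
            s = s_next
            path.append(s)
    return slopes, path
```

Each forward pass solves one small problem per day, so the work per iteration grows with the number of days rather than with the number of state combinations. In this toy setting the first pass leaves the reservoir idle, and later passes store energy once a positive marginal value has been learned.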
4 Results and discussion

System description
The data of the calculation example, including load and electricity generation cost, are taken from [4]. The maximum load is 2850 MW and the annual average daily load rate is 83%. It is assumed that there is only one PHPS station in the system, with two hybrid pumped storage units. The unit's pumping power is 100 MW and its generating power is 200 MW. The pumping-generating cycle efficiency is 0.7, and the maximum energy stored in the upper reservoir is 2000 MWh. was set for other hydroplants.
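The cycle efficiency of 0.7 implies a simple break-even condition for price arbitrage: a pumping-generating cycle pays off only if the valley price is below 0.7 times the peak price. The sketch below makes this arithmetic explicit. The function name is hypothetical and the prices used with it are made-up numbers, not data from [4].

```python
def arbitrage_margin(valley_price, peak_price, cycle_efficiency):
    """Net revenue per MWh of pumping energy: each MWh bought at the valley
    price returns cycle_efficiency MWh sold at the peak price."""
    return cycle_efficiency * peak_price - valley_price


# Hypothetical prices: a valley/peak ratio of 0.6 is profitable at 0.7
# efficiency, while a ratio of 0.8 is not.
profitable = arbitrage_margin(30.0, 50.0, 0.7) > 0
unprofitable = arbitrage_margin(40.0, 50.0, 0.7) < 0
```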

Result analysis
The proposed ADP method and the MILP method are used to obtain the multi-day optimal operation of the PHPS system. The results are compared in Table 1.
The computational accuracy of the DP model depends on how finely the energy storage capacity is discretized. As only one power plant is considered here and the curse of dimensionality is not yet pronounced, the results of the DP method can be treated as the optimal reference. For the MILP method, the computational efficiency is determined by the performance of CPLEX; for complex MILP problems with long horizons, however, the computation time and accuracy of this method are unsatisfactory in many cases. According to Table 1, although ADP is only an approximate method, it achieves a better optimization effect than the DP method with manually specified generating and pumping periods, because the within-stage model adopted in this paper optimizes both the generating and the pumping durations.
Figure 3 contrasts the original load curve of a certain day with the comprehensive load curves computed by the three methods. A PHPS system not only yields economic benefits but also clips peaks and fills valleys in the network load, so that the average load rate increases from 83% to about 88%. Moreover, compared with the fixed generating and pumping periods selected in [4], optimizing the generating and pumping periods by ADP has a more pronounced smoothing effect on the load curve.
Suppose that hybrid pumped storage units are added one by one. Figure 4 compares the computation time required by the three methods. For the DP and MILP methods, the computation time grows non-linearly with the number of units, whereas for ADP it grows approximately linearly, indicating that the staged solution of ADP achieves a time decoupling effect.
In Table 2, as there is little difference between the effects achieved with 3 units and with 2 units, it is clear that the pumped storage power station in the system has a limited saturation scale, and that the pumping-generating cycle benefits cannot be improved by increasing the capacity of the station once this limit is reached.
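The load-rate comparison discussed for Figure 3 can be reproduced in form with a short calculation, assuming the load rate is defined as average load divided by maximum load. The load values below are made up for illustration and are not the data of [4]; `load_rate` and `net_load` are hypothetical names.

```python
def load_rate(load):
    """Average load divided by maximum load over the period."""
    return sum(load) / (len(load) * max(load))


def net_load(load, pumping, generating):
    """Load seen by the other plants: pumping adds demand in the valley,
    PHPS generation removes demand at the peak."""
    return [l + p - g for l, p, g in zip(load, pumping, generating)]


# Hypothetical four-period day (MW): pump in the valley, generate at the peak.
original = [60.0, 80.0, 100.0, 80.0]
shaped = net_load(original, [10.0, 0.0, 0.0, 0.0], [0.0, 0.0, 7.0, 0.0])
```

In this toy case the generated energy is 0.7 times the pumped energy, and the load rate still rises because the peak is lowered while the average load increases slightly.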

5 Conclusion
The following conclusions were reached through the calculation and analysis of examples:
1) The approximate dynamic programming method is suitable for modeling and solving multistage decision problems, and the value function approximation method shows good and stable optimization characteristics for energy storage problems.
2) The approximate dynamic programming method has a decomposition property: the problem can be solved stage by stage, which reduces the problem scale and improves the solution speed. It is therefore suitable for large-scale optimization decision problems.
3) The saturation scale of the pumped storage power station in the system is limited. Once the limit is reached, adding capacity to the pumped storage power station cannot increase the pumping-generating cycle benefits. The construction scale of pumped storage power stations in the system should therefore match the load characteristics and power supply of the system.