Two-population Asymmetric Evolutionary Game Dynamics-based Decision-making Behavior Analysis for a Supply-side Electric Power Bidding Market

This paper systematically discusses two-population asymmetric evolutionary games (2PAEGs) from the perspective of decision-making behavior characteristics, and applies these game models to a two-population supply-side electric power bidding market. First, a 2PAEG model is established. Then, the complete evolutionary equilibrium rules of this model during decision-making are revealed. The discussion shows that the final evolutionary game equilibria reached in the 2PAEG model are determined solely by a small set of payoff parameters, which are defined as relative net payoff (RNP) parameters in this paper. Finally, a case study of supply-side bidding simulation for two generator populations is conducted, which supports the generality and effectiveness of the evolutionary dynamics results obtained for the general 2PAEG model. Moreover, it shows that reasonable government policies can guide more appropriate power bidding for on-grid electricity.


Introduction
Recently, energy and electric power systems (EEPSs) have developed rapidly around the world, with examples including energy interconnection systems, the ubiquitous Internet of Things in electricity, and integrated energy systems. Against this background, more and more countries are focusing on the construction and development of smart grids (SGs) [1]. Electricity markets (EMs) are also developing rapidly throughout the world, such as the Pennsylvania-New Jersey-Maryland (PJM) EM in the United States [2] and the Beijing Power Exchange Center in China [3]. How to determine an optimal strategy for decision-making agents in an EM remains a challenging topic [4], [5]. It is also a promising one, because solving it can effectively balance and optimize the benefits of all sides in an EM. Traditional optimization theory, which is characterized by single-agent decision-making, cannot address this topic well. For this reason, more and more scholars investigate this area in combination with game theory (GT), which has gradually become a powerful mathematical tool for solving multi-agent decision-making issues in a competitive EM.
GT was founded by von Neumann and Morgenstern in the 1940s [6], and it is now widely used for decision situations in which one party is in conflict with another (i.e., a two-party game) or with several others (i.e., a multi-population game). GT can be divided into three major categories, namely cooperative game theory, non-cooperative game theory, and evolutionary game theory (EGT). Among these, EGT is increasingly treated as a powerful mathematical tool for investigating the decision-making behavior characteristics of multiple populations in a long-term evolution process.
EGT is developed from natural selection mechanisms and bounded rationality assumptions. Hence, it is better suited to practical decision issues and can better describe the spontaneous evolution of different populations. A general multi-population evolutionary game (MPEG) model can provide a more direct and convenient theoretical reference for applying EGT tools. Researchers have made preliminary investigations of such issues from different angles. For example, Wang et al. [7] use an evolutionary game approach to analyze bidding strategies in an EM with elastic demand, where the generation companies (GENCOs) are represented as different populations in a coevolutionary algorithm that searches for the equilibrium. Bahmani-Firouzi et al. [8] propose a scenario-based market clearing model to obtain optimal bidding strategies of GENCOs in an EM with incomplete information. Vijaya Kumar and Vinod Kumar [9] develop a shuffled frog leaping algorithm for the generation bidding strategy in a pool-based EM. On the whole, the power bidding of GENCOs in an EM is a process of long-term dynamic evolution with bounded rationality and incomplete information. EGT is therefore a well-suited mathematical tool for studying optimal bidding strategies of GENCOs in an EM.
Motivated by this, we first systematically investigate the complete evolutionary dynamics of a general two-population asymmetric evolutionary game (2PAEG) model in this paper. On this basis, we then use it to address power bidding for the supply-side GENCOs in a competitive EM.

Several important concepts in EGT
1) Population. The dynamics models in EGT can be classified into monomorphic and polymorphic population dynamics models. The former involves only one population, in which individuals have the same pure strategies and play a symmetric game. The latter involves multiple populations, and individuals from different populations may have different pure strategies and play an asymmetric game. In particular, in an asymmetric game, participants have asymmetric information about each other's preferences [10].
2) Replicator dynamics (RD). RD assumes that the growth rate of a strategy is given by its average payoff (i.e., the fitness function) [11]. Here, let $s_i \in S$ be a pure strategy of individual $i$, where $i \in \{1, 2, \ldots, n\}$ and $S=\{s_1, s_2, \ldots, s_n\}$; let $x_i(t)$ represent the number of individuals who choose $s_i$ at time $t$, where $x=\{x_1, x_2, \ldots, x_n\}$; and let $f_i(s, x)$ denote the fitness function of individuals who choose $s_i$ at time $t$. The discrete-time and continuous-time RD models built on these quantities are given in (1).
3) Evolutionarily stable strategy (ESS). An ESS is a strategy that is adopted by most individuals in a population in a game and that no mutation strategy can invade. An ESS therefore has the highest stability in a defined strategy set $S$. Assume that strategies $s^*$ and $s$ belong to $S$. Then, for any $s$ ($s \neq s^*$), $s^*$ is an ESS if there is always a positive number $\varepsilon \in (0, 1)$ such that the fitness functions of $s^*$ and $s$ satisfy
$$f\bigl(s^*, \varepsilon s+(1-\varepsilon)s^*\bigr) > f\bigl(s, \varepsilon s+(1-\varepsilon)s^*\bigr), \tag{2}$$
where $f(\cdot)$ represents the fitness function, and $\varepsilon s+(1-\varepsilon)s^*$ denotes a mixed strategy that may be a mutation strategy.
4) Lyapunov stability theory (LST). LST is generally used to analyze the asymptotic stability of an internal equilibrium point of the RD equation(s). Here, the ESS of the whole MPEG system is determined by the asymptotically stable state of its RD equation(s) [12], [13]. According to LST, if all eigenvalues of the Jacobian matrix of the RD equations have negative real parts at an internal equilibrium point, then this point is asymptotically stable, and an evolutionary game equilibrium (EGE) can be achieved there.
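For orientation, the RD models referred to in item 2) are commonly written in the textbook form below. This is only a sketch: it assumes that $x_i$ is read as the population share of strategy $s_i$, that fitness values are positive in the discrete-time case, and that $\bar{f}(x)$ (a symbol not used in the original) denotes the population-average fitness. The authors' own equation (1) may differ in notation.

```latex
% Discrete-time RD (assumes positive fitness values)
x_i(t+1) = x_i(t)\,\frac{f_i(s_i, x)}{\bar{f}(x)},
\qquad
% Continuous-time RD
\frac{\mathrm{d}x_i(t)}{\mathrm{d}t} = x_i(t)\,\bigl[f_i(s_i, x) - \bar{f}(x)\bigr],
\qquad
% Population-average fitness (assumed notation)
\bar{f}(x) = \sum_{j=1}^{n} x_j(t)\, f_j(s_j, x).
```

In both forms, a strategy whose fitness exceeds the population average gains share, which matches the growth-rate assumption stated in item 2).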

Establishment of a general asymmetric 2PAEG
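1) Payoff distribution matrix. In an asymmetric 2PAEG, two populations, denoted by A and B, each have two incompatible pure strategies in their strategy sets. Concretely, A can choose between two pure strategies, $A_1$ and $A_2$, which mean participating and not participating in a game, respectively, and which are selected by individuals in population A with proportions $x$ and $1-x$. Similarly, $B_1$ and $B_2$ are two incompatible pure strategies of population B, selected with proportions $y$ and $1-y$ in population B, respectively. Then, the payoff distribution matrix of such a general asymmetric 2PAEG is presented as follows:
$$\begin{array}{c|cc} & B_1\ (y) & B_2\ (1-y) \\ \hline A_1\ (x) & (a,\ b) & (c,\ d) \\ A_2\ (1-x) & (e,\ f) & (g,\ h) \end{array} \tag{3}$$
where $a$, $b$, $c$, $d$, $e$, $f$, $g$ and $h$ are illustrative payoff parameters; in each pair, the first entry is the payoff of population A and the second is the payoff of population B.
2) RD equations. Based on (3), and referring to [12], the RD equations of this general asymmetric 2PAEG are
$$\begin{cases} \dfrac{\mathrm{d}x}{\mathrm{d}t} = x(1-x)(\gamma_1 y+\gamma_2) \\[2mm] \dfrac{\mathrm{d}y}{\mathrm{d}t} = y(1-y)(\gamma_3 x+\gamma_4) \end{cases} \tag{4}$$
where $\gamma_1=a-c-e+g$, $\gamma_2=c-g$, $\gamma_3=b-f-d+h$, and $\gamma_4=f-h$. Further, the Jacobian matrix of (4), denoted by $J_1$, and its determinant and trace, denoted by $\det(J_1)$ and $\mathrm{tr}(J_1)$, respectively, are as follows:
$$\begin{aligned} J_1 &= \begin{bmatrix} (1-2x)\,\gamma_{10}(y) & x(1-x)\,\gamma_1 \\ y(1-y)\,\gamma_3 & (1-2y)\,\gamma_{11}(x) \end{bmatrix}, \\ \det(J_1) &= (1-2x)(1-2y)\,\gamma_{10}(y)\,\gamma_{11}(x) - x(1-x)\,y(1-y)\,\gamma_1\gamma_3, \\ \mathrm{tr}(J_1) &= (1-2x)\,\gamma_{10}(y) + (1-2y)\,\gamma_{11}(x), \end{aligned} \tag{5}$$
where $\gamma_{10}(y)=\gamma_1 y+\gamma_2$ and $\gamma_{11}(x)=\gamma_3 x+\gamma_4$. Here, $a-e$ and $\gamma_2$ are defined as the relative net payoff (RNP) parameters of population A when it chooses strategy $A_1$ while population B selects strategies $B_1$ and $B_2$, respectively; and $b-d$ and $\gamma_4$ are the RNP parameters of population B when it chooses strategy $B_1$ while population A chooses strategies $A_1$ and $A_2$, respectively.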
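The stability test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`rnp`, `jacobian`, `classify`) and the small integer payoffs are hypothetical, and the classification uses only the signs of the determinant and trace of the 2x2 Jacobian of (4).

```python
def rnp(a, b, c, d, e, f, g, h):
    """Return (gamma1, gamma2, gamma3, gamma4) as defined below (4)."""
    return a - c - e + g, c - g, b - f - d + h, f - h


def jacobian(x, y, g1, g2, g3, g4):
    """Jacobian of dx/dt = x(1-x)(g1*y+g2), dy/dt = y(1-y)(g3*x+g4)."""
    return [
        [(1 - 2 * x) * (g1 * y + g2), x * (1 - x) * g1],
        [y * (1 - y) * g3, (1 - 2 * y) * (g3 * x + g4)],
    ]


def classify(x, y, gammas):
    """Label an equilibrium point from the signs of det(J) and tr(J)."""
    j = jacobian(x, y, *gammas)
    det = j[0][0] * j[1][1] - j[0][1] * j[1][0]
    tr = j[0][0] + j[1][1]
    if det < 0:
        return "saddle"      # eigenvalues of opposite sign
    if det > 0 and tr < 0:
        return "stable"      # both real parts negative (an EGE)
    if det > 0 and tr > 0:
        return "unstable"    # both real parts positive
    return "center or degenerate"


# Hypothetical payoffs (a, b, c, d, e, f, g, h), chosen only for illustration.
gammas = rnp(4, 3, 1, 1, 2, 2, 0, 0)
corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = {corner: classify(*corner, gammas) for corner in corners}
print(labels)
```

At the four corners the off-diagonal entries of the Jacobian vanish, so the diagonal entries are the eigenvalues themselves; the determinant and trace test is kept general so the same function can also be evaluated at an interior point.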
According to Fig. 1, we can further conclude that the complete evolutionary dynamics of the general asymmetric 2PAEG contain 16 game situations, which are determined solely by four groups of RNP parameters, i.e., $\gamma_2$, $\gamma_4$, $\gamma_5$, and $\gamma_6$, where $\gamma_5=a-e$ and $\gamma_6=b-d$. Based on these four groups of RNP parameters, the complete evolutionary dynamics statistics of the general asymmetric 2PAEG can be revealed as shown in Table 1, where ×, ♥ and ○ signify an unstable point, a stable point (i.e., an EGE), and a saddle point or center, respectively. From Table 1 and Fig. 2 we can conclude that: i) the internal equilibrium point $(x^*, y^*)$ is always a saddle point or a center; ii) a saddle point or center appears in 48 game scenarios in total; iii) an unstable point appears in 16 game scenarios in total; iv) the general asymmetric 2PAEG reaches an EGE in 16 game scenarios in total, i.e., the whole game system finally becomes evolutionarily stable; and v) among these EGEs, each of (0, 0), (0, 1), (1, 0) and (1, 1) is an EGE four times.
To observe the evolution characteristics of decision-making behavior in each game situation presented in Table 1, we simulate the phase trajectory of $(x, y)$ in each game scenario, as shown in Fig. 2. In this figure, Cases 1 to 16 represent game situations 1 to 16 in Table 1, respectively. Overall, there are 16 scenarios in total in which this general asymmetric 2PAEG eventually reaches an EGE. Moreover, each EGE is determined solely by the four groups of RNP parameters, i.e., $\gamma_2$, $\gamma_4$, $\gamma_5$, and $\gamma_6$. Therefore, we can appropriately adjust these RNP parameters so that this general asymmetric 2PAEG reaches an expected rational EGE during the process of evolution. For example, the government can formulate supervision policies such as rewards and punishments to steer the equilibrium results of the competitive EM toward a reasonable operating direction.
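The counts of 48, 16 and 16 can be reproduced by enumerating the 16 sign patterns of $(\gamma_2, \gamma_4, \gamma_5, \gamma_6)$. The sketch below is an illustration under stated assumptions: it uses the fact that the Jacobian of (4) is diagonal at each corner, with eigenvalues $(\gamma_2, \gamma_4)$ at (0, 0), $(\gamma_5, -\gamma_4)$ at (0, 1), $(-\gamma_2, \gamma_6)$ at (1, 0) and $(-\gamma_5, -\gamma_6)$ at (1, 1), and it counts the internal point once per situation as a saddle point or center, following conclusion i). All names are hypothetical.

```python
from collections import Counter
from itertools import product


def corner_eigenvalue_signs(s2, s4, s5, s6):
    """Signs of the two Jacobian eigenvalues at each corner of the unit square."""
    return {
        (0, 0): (s2, s4),
        (0, 1): (s5, -s4),
        (1, 0): (-s2, s6),
        (1, 1): (-s5, -s6),
    }


def label(signs):
    """Classify a corner from the signs of its two eigenvalues."""
    if all(s < 0 for s in signs):
        return "stable"
    if all(s > 0 for s in signs):
        return "unstable"
    return "saddle or center"


counts = Counter()
stable_at = Counter()
# Each of gamma2, gamma4, gamma5, gamma6 is either negative (-1) or positive (+1).
for pattern in product((-1, 1), repeat=4):
    for corner, signs in corner_eigenvalue_signs(*pattern).items():
        kind = label(signs)
        counts[kind] += 1
        if kind == "stable":
            stable_at[corner] += 1
    # Assumption: the internal point is tallied once per situation (trace is zero there).
    counts["saddle or center"] += 1

print(dict(counts))
print(dict(stable_at))
```

The enumeration covers 16 situations with 5 points each, i.e., 80 classified points, which is the total behind the three counts quoted in the text.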
To verify this, a case study of supply-side bidding simulation for two generator populations is conducted as follows.

Case study

Supply-side power bidding model in China
Take China as an example, where GENCOs are independent economic entities that compete in an EM through power bidding. Here, power bidding is generally a repeated game, in which the market clearing price (MCP) is the final market equilibrium result. Hence, EGT is suitable for investigating the equilibrium of such a long-term bidding development process. Based on Section II, we take two GENCO populations as the research objects in this case study, denoted by population A and population B, respectively. This can be extended to more than two GENCO populations, at the cost of more complex model calculation and analysis. We therefore consider a two-population GENCO power bidding model in the case study. Here, for each GENCO population, the cost function of generator $j$, denoted by $C(P_j)$, is generally presented as
$$C(P_j) = a_j + b_j P_j + c_j P_j^2, \quad j = 1, 2, \ldots, n', \tag{6}$$
where $n'$ is the total number of generators in a population, $a_j$, $b_j$ and $c_j$ are cost coefficients, and $P_j$ is the generating capacity of generator $j$. Further, the profit function of generator $j$, denoted by $P(P_j)$, is presented as follows:
$$P(P_j) = B_R P_j - C(P_j), \tag{7}$$
where $B_R$ is the MCP. Therefore, the bidding strategy of generator $j$ is to maximize $P(P_j)$, as formulated in (8), where $Q$ is the total demand of the market (MW), $B_j(S_j, P_j)$ is the bidding curve of generator $j$ that is provided to the trading center, and $S_j$ is the bidding strategy set of generator $j$.
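As a quick numerical illustration of the cost and profit functions, the sketch below assumes a quadratic cost $a_j + b_j P_j + c_j P_j^2$ and a profit equal to clearing-price revenue minus cost. The function names are hypothetical. The numbers are the ones that appear in the worked calculation of payoff $a$ in the simulation analysis; reading them as price 295 CNY/MWh, output 180 MW and coefficients (6700, 110, -0.18) is an assumption made here.

```python
def cost(p, a_j, b_j, c_j):
    """Quadratic generation cost C(P) = a_j + b_j*P + c_j*P^2 (assumed form)."""
    return a_j + b_j * p + c_j * p ** 2


def profit(p, price, a_j, b_j, c_j):
    """Profit at market clearing price `price`: revenue minus cost."""
    return price * p - cost(p, a_j, b_j, c_j)


# Assumed reading of the figures in the worked example for payoff `a`.
example_profit = profit(p=180, price=295, a_j=6700, b_j=110, c_j=-0.18)
print(round(example_profit, 2))
```

Under these assumptions the result agrees, up to floating-point rounding, with the value $a = 32432$ CNY/h reported in the simulation analysis.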

Simulation analysis
In the model proposed above, we assume that GENCO populations A and B have sufficient bidding units in a certain regional EM and that they have different generating capacities in this EM. The parameters of A and B are listed in Table 2. Moreover, assume that populations A and B both have two strategies, $S_H$ (high bidding strategy) and $S_L$ (low bidding strategy), which are determined by the upper limit and lower limit of the same capacity segment, respectively. Concretely, $S_H$ signifies that GENCOs declare the high prices permitted by market rules to obtain high returns from high clearing prices, and $S_L$ indicates that GENCOs report relatively low prices in the hope of increasing the amount of power delivered to the network and thereby obtaining high returns. Additionally, assume that the maximum quotation in the EM is 400 CNY/MWh, that A and B can provide generation capacities of 360 MW and 880 MW, respectively, and that both quote based on five capacity segments, as shown in Table 3. Here, $(a, b)$, $(c, d)$, $(e, f)$ and $(g, h)$ are not necessarily assumed to be common knowledge for GENCO populations A and B, and they are obtained as $(a, b)=(32432, 97177.6)$, $(c, d)=(20628, 36400)$, $(e, f)=(17656.2, 79962.4)$, and $(g, h)=(11031.8, 27745.6)$, in CNY/h. Among these, for example, $a$ is calculated as $a = 295 \times 180 - (6700 + 110 \times 180 - 0.18 \times 180^2) = 32432$. The calculation results show that this case is a typical asymmetric 2PAEG. Therefore, according to (4), the RNP parameters of this asymmetric supply-side power bidding evolutionary game system are obtained as $\gamma_1=a-c-e+g=5179.6$, $\gamma_2=c-g=9596.2$, $\gamma_3=b-f-d+h=8563.9$, and $\gamma_4=f-h=52219.85$. Thereby, its RD equations are
$$\begin{cases} \dfrac{\mathrm{d}x}{\mathrm{d}t} = x(1-x)(5179.6\,y+9596.2) \\[2mm] \dfrac{\mathrm{d}y}{\mathrm{d}t} = y(1-y)(8563.9\,x+52219.85) \end{cases} \tag{9}$$
where $x$ and $y$ represent the proportions of generators choosing $S_H$ in populations A and B, respectively.
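The RNP parameters can be recomputed directly from the eight payoffs, which is a useful sanity check on the arithmetic. The sketch below is illustrative only; the dictionary layout and the function name `rnp_parameters` are hypothetical, and the payoffs are the values listed in the text. With the payoffs as printed, the computation gives $\gamma_3 = 8560.8$ and $\gamma_4 = 52216.8$, about 3 CNY/h away from the values quoted in the text; the signs, which are what drive the equilibrium analysis, are the same either way.

```python
# Payoffs (CNY/h) as listed in the text.
payoffs = {
    "a": 32432.0, "b": 97177.6,
    "c": 20628.0, "d": 36400.0,
    "e": 17656.2, "f": 79962.4,
    "g": 11031.8, "h": 27745.6,
}


def rnp_parameters(p):
    """Relative net payoff parameters as defined for the general 2PAEG."""
    return {
        "gamma1": p["a"] - p["c"] - p["e"] + p["g"],
        "gamma2": p["c"] - p["g"],
        "gamma3": p["b"] - p["f"] - p["d"] + p["h"],
        "gamma4": p["f"] - p["h"],
        "gamma5": p["a"] - p["e"],
        "gamma6": p["b"] - p["d"],
    }


gammas = rnp_parameters(payoffs)
for name, value in gammas.items():
    print(f"{name} = {value:.2f}")

# All four sign-determining parameters are positive in this case.
all_positive = all(gammas[k] > 0 for k in ("gamma2", "gamma4", "gamma5", "gamma6"))
print(all_positive)
```

Since $\gamma_2$, $\gamma_4$, $\gamma_5$ and $\gamma_6$ are all positive, this case corresponds to the situation in which (1, 1) is the stable corner.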
Since $x, y \in [0, 1]$, $x=0$ or $1$ and $y=0$ or $1$ are solutions of (9). Since $\gamma_1 y+\gamma_2 = 5179.6\,y+9596.2 > 0$ for all $y \in [0, 1]$, $\mathrm{d}x/\mathrm{d}t > 0$ whenever $0<x<1$, so GENCO population A will achieve an EGE at $x=1$; similarly, GENCO population B will achieve an EGE at $y=1$ because $\gamma_3 x+\gamma_4 = 8563.9\,x+52219.85 > 0$ for all $x \in [0, 1]$. Then, $(x, y)=(1, 1)$, i.e., $(S_H, S_H)$, becomes an EGE in populations A and B, which can resist any mutation strategy. This indicates that all GENCOs in A and B will tend to adopt $S_H$, the high bidding strategy, during strategy evolution. Hence, for (9), we simulate 6 cases as shown in Fig. 3, where Cases 1 to 6 use $5 \times 5$, $10 \times 10$, $15 \times 15$, $20 \times 20$, $25 \times 25$ and $30 \times 30$ different initial points of $(x, y)$ within $[0, 1] \times [0, 1]$, respectively. The simulations show that A and B will both finally choose $S_H$; hence, $(S_H, S_H)$ becomes the unique EGE in this asymmetric two-population supply-side bidding evolutionary game. The simulations shown in Fig. 3 are consistent with the above theoretical analysis and with the conclusions drawn in Section II.
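The convergence to (1, 1) can be checked with a small numerical experiment. The sketch below is a minimal stand-in for the simulations in Fig. 3, not a reproduction of them: it uses a plain forward-Euler integrator, a hypothetical 5x5 grid of interior initial points, and coefficients multiplied by an assumed factor of 1e-4. Multiplying the right-hand side of (9) by a positive constant only rescales time, so the trajectories and their limits are unchanged.

```python
def simulate(x0, y0, gammas, dt=0.01, steps=3000):
    """Integrate dx/dt = x(1-x)(g1*y+g2), dy/dt = y(1-y)(g3*x+g4) with forward Euler."""
    g1, g2, g3, g4 = gammas
    x, y = x0, y0
    for _ in range(steps):
        dx = x * (1 - x) * (g1 * y + g2)
        dy = y * (1 - y) * (g3 * x + g4)
        # Clamp to [0, 1] so rounding can never push a proportion out of range.
        x = min(1.0, max(0.0, x + dt * dx))
        y = min(1.0, max(0.0, y + dt * dy))
    return x, y


SCALE = 1e-4  # assumed time rescaling, keeps the Euler step well conditioned
gammas = tuple(g * SCALE for g in (5179.6, 9596.2, 8563.9, 52219.85))

# Hypothetical 5x5 grid of interior starting points (0.1, 0.3, ..., 0.9).
grid = [0.1 + 0.2 * k for k in range(5)]
finals = [simulate(x0, y0, gammas) for x0 in grid for y0 in grid]

all_high = all(x > 0.999 and y > 0.999 for x, y in finals)
print(all_high)
```

Starting points on the boundary (x or y equal to 0 or 1) are excluded on purpose, because those edges are invariant under (9) and a trajectory that starts there stays there.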
Certainly, to guide supply-side bidding in the EM toward more rational development, the government can adjust current market rules appropriately so that (0, 0), i.e., $(S_L, S_L)$, becomes the unique EGE. At this point, the low bidding strategy (i.e., base price bidding) becomes evolutionarily stable, while the high bidding strategy turns into an unstable strategy and gradually disappears after long-term evolution. Therefore, reasonable bidding rules can guide more appropriate power bidding for on-grid electricity. To this end, according to Table 1, we can adjust $\gamma_2$, $\gamma_4$, $\gamma_5$ and $\gamma_6$ to make (0, 0) the unique stable equilibrium point. For example, let $\gamma_2=c-g<0$, $\gamma_4=f-h<0$, $\gamma_5=a-e<0$, and $\gamma_6=b-d>0$; then simultaneous low bidding, $(S_L, S_L)$, becomes the unique EGE in this two-population supply-side power bidding evolutionary game system, as shown in Fig. 4, where Cases 1 to 5 use $5 \times 5$, $10 \times 10$, $20 \times 20$, $30 \times 30$, and $40 \times 40$ different initial points of $(x, y)$ selected from the region $[0, 1] \times [0, 1]$, respectively, with simulation time $t \in [0, 5]$. The simulation results show that $(S_L, S_L)$ becomes the unique ESS. At this point, GENCO populations A and B both finally tend to choose the low bidding strategy $S_L$ to maximize their payoffs during power bidding in an EM.
Fig. 3. GENCO populations A and B finally tend to adopt the high bidding strategy $S_H$ during strategy adjustment when no governmental supervision is implemented.
Fig. 4. GENCO populations A and B finally tend to adopt the more reasonable low bidding strategy $S_L$ during strategy evolution when effective and reasonable governmental supervision is implemented.
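The sign conditions in the example above can be checked against the Jacobian of (4) at the four corners. The derivation below is a sketch that relies on two facts following from the definitions: the Jacobian is diagonal at every corner, and $\gamma_1+\gamma_2=\gamma_5$ and $\gamma_3+\gamma_4=\gamma_6$. The symbols $\lambda_1$ and $\lambda_2$ for the eigenvalues are notation introduced here.

```latex
% Assumed sign pattern: gamma_2 < 0, gamma_4 < 0, gamma_5 < 0, gamma_6 > 0
\begin{aligned}
(0,0):&\quad (\lambda_1,\lambda_2) = (\gamma_2,\ \gamma_4)
  &&\Rightarrow\ (-,-)\quad \text{stable (EGE)} \\
(0,1):&\quad (\lambda_1,\lambda_2) = (\gamma_5,\ -\gamma_4)
  &&\Rightarrow\ (-,+)\quad \text{saddle} \\
(1,0):&\quad (\lambda_1,\lambda_2) = (-\gamma_2,\ \gamma_6)
  &&\Rightarrow\ (+,+)\quad \text{unstable} \\
(1,1):&\quad (\lambda_1,\lambda_2) = (-\gamma_5,\ -\gamma_6)
  &&\Rightarrow\ (+,-)\quad \text{saddle}
\end{aligned}
```

Only (0, 0) has two negative eigenvalues under this sign pattern, which is why $(S_L, S_L)$ is the single corner that can be an EGE.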

Conclusion
Based on the principles of replication, selection and mutation during population evolution, EGT provides a more reasonable tool for studying the decision-making behavior of stakeholder populations with bounded rationality and imperfect information in practical issues. In this paper, we systematically investigate and summarize the complete evolutionary dynamics of general 2PAEGs. We find that there are 48, 16, and 16 scenarios in total in which a saddle point (or center), an unstable point, and an asymptotically stable point appear, respectively. Among these, each of (0, 0), (0, 1), (1, 0) and (1, 1) becomes evolutionarily stable in four scenarios. Moreover, the final EGE achieved in the general 2PAEG is determined solely by a few groups of RNP parameters.
Further, we conduct a case study of a two-population supply-side bidding game among GENCO populations. The simulation results show that GENCO populations with two different bidding strategies all finally tend to choose the high bidding strategy, i.e., $(S_H, S_H)$, to maximize their benefits. However, this is not conducive to the sound operation of the entire bidding market over a long-term development period. Therefore, according to the RNP parameters in this power bidding evolutionary game system, the government can formulate supervision policies such as rewards and punishments to adjust the RNP parameters appropriately, so that the low bidding strategy becomes more beneficial and stable after long-term evolution, i.e., $(S_L, S_L)$ gradually becomes evolutionarily stable while $(S_H, S_H)$ gradually becomes unstable and finally disappears from the entire evolutionary game system.
In the future, we can combine artificial intelligence (AI) techniques such as machine learning methods (e.g., reinforcement learning, deep learning, transfer learning, ensemble learning) [14] with EGT to solve the complex behavioral decision-making issues of multi-stakeholder populations with bounded rationality and imperfect information in the engineering field, such as power bidding in the EM. Besides, evolutionary games on complex systems can also be considered in the next step.