Flood frequency analysis for annual maximum streamflow using a non-stationary GEV model

Under a changing environment, streamflow in the Yangtze River has changed considerably, which has raised widespread concern. In this study, the annual maximum flow (AMF) series at the Yichang station was used for flood frequency analysis, and a time-varying model was constructed to account for non-stationarity. The generalized extreme value (GEV) distribution was adopted to fit the AMF series, and the Generalized Additive Models for Location, Scale and Shape (GAMLSS) framework was applied for parameter estimation. The non-stationary return period and risk of failure were calculated, and flood risk was compared between the stationary and non-stationary models. The results demonstrate that the flow regime at the Yichang station has changed over time and that the AMF series has a decreasing trend. For a given return period, the design flood peak is lower in the non-stationary model, and for a given design life the risk of failure is also smaller, which indicates a safer future flood condition than the stationary model implies. These conclusions may contribute to long-term decision making in the Yangtze River basin under non-stationary conditions.


Introduction
Frequency analysis plays an important role in hydrological modelling and is used to calculate design values for dams and many other hydraulic structures. The GEV distribution, a common choice for handling extreme events, has been widely used in frequency analysis of floods and droughts around the world.
Frequency analysis generally assumes that the series are independent and identically distributed (i.i.d.). However, this stationarity assumption has been increasingly challenged as climate change and human activities intensify [1]. To better understand the dynamic hydrological system, various studies have examined future events under a changing environment. For example, [2] used a regional Mann-Kendall test to examine non-stationary behavior in the AMF series of 491 catchments in Australia, and [3] used a log-Pearson III distribution for non-stationary extreme value analysis of flood peaks in the United States. Although non-stationary modelling has been criticized because it may lead to underestimation of variability, uncertainty and risk when used carelessly [4], it can nevertheless offer a new perspective on hydrologic modelling and provide the relevant departments with valuable information for hydraulic engineering design and flood prevention.
Return period and risk are very important in hydrologic design. Under non-stationarity, however, their conventional definitions may obscure the underlying mechanism and even lead to misconceptions, so they need to be adjusted to accommodate non-stationary conditions. [5] presented a method for extending the stationary return period to a non-stationary framework. [6] suggested that the risk of failure over a design life may provide more coherent results for risk assessment and communication. There has been some progress on the theoretical methods, yet research on the practical design of flood events under non-stationarity remains insufficient.
This paper aims to evaluate the effect of non-stationarity on flood frequency analysis in the Yangtze River. To this end, the Yichang station in the middle Yangtze River was taken as a case study. A non-stationary GEV model was established to account for non-stationarity, and the non-stationary return period and risk of failure were calculated and compared for flood risk assessment.

Study area and data
The Yangtze River flows about 6300 km from the Qinghai-Tibet Plateau to Chongming Island in Shanghai, and its total drainage area is about $1.8 \times 10^{6}$ km$^2$. The minimum annual rainfall in the basin is about 270 mm in the west and the maximum is about 1900 mm in the southeast. Owing to the monsoon climate, most of the precipitation is concentrated in summer, which makes summer the flood-prone season.
Yichang station, located at the end of the upper reach, is an important monitoring station for the upstream hydrological regime. It is downstream of the two most important reservoirs in the middle Yangtze River, i.e. the Three Gorges Reservoir and the Gezhouba Reservoir, which inevitably affect the downstream flow regime. In this study, daily streamflow observations at the Yichang station from 1949 to 2012 were collected from the Changjiang Water Resources Commission, and the AMF series was extracted by the block maximum approach for the flood frequency analysis in subsequent sections.
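The block maximum step can be sketched as follows. This is a minimal illustration that assumes the blocks are calendar years and that the records arrive as (date, flow) pairs; the function name `annual_maxima` and the sample values are hypothetical and are not taken from the Yichang data.

```python
from datetime import date

def annual_maxima(records):
    """Return {year: maximum flow} from an iterable of (date, flow) pairs.

    Assumption: each block is a calendar year; missing flows are given as None.
    """
    maxima = {}
    for day, flow in records:
        if flow is None:
            continue  # skip missing observations
        year = day.year
        if year not in maxima or flow > maxima[year]:
            maxima[year] = flow
    # Sort by year so the result reads as a time series
    return dict(sorted(maxima.items()))

# Hypothetical daily flows in m^3/s (illustrative values only)
sample = [
    (date(1949, 7, 1), 48000.0),
    (date(1949, 7, 2), 52000.0),
    (date(1949, 12, 31), 6000.0),
    (date(1950, 8, 15), 41000.0),
    (date(1950, 8, 16), None),
]
amf = annual_maxima(sample)
print(amf)
```

If the flood season straddled the turn of the year, a hydrological year would be the more natural block; for a summer flood season the calendar year is usually adequate.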

Trend test
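The Mann-Kendall [7,8] test is a non-parametric method that has been widely used for detecting trends in hydrologic time series. For a variable x with n years of records, the test is based on the statistic S given by
$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i)$$
where sgn(·) equals 1, 0 or -1 when its argument is positive, zero or negative, respectively.
Under the null hypothesis of no trend, the statistic S is approximately normally distributed with zero mean and variance Var(S) = [n(n-1)(2n+5)]/18. The standardized statistic is
$$Z = \begin{cases} (S-1)/\sqrt{\mathrm{Var}(S)}, & S > 0 \\ 0, & S = 0 \\ (S+1)/\sqrt{\mathrm{Var}(S)}, & S < 0 \end{cases}$$
and the time series was examined with a two-sided test at the 0.05 significance level in this study.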
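A minimal sketch of the test is given below, assuming the standard form S = Σ sgn(x_j - x_i) over all pairs i < j, the variance quoted above without a correction for ties, and the usual continuity-corrected standardized statistic Z. The function name `mann_kendall` and the sample series are hypothetical.

```python
import math

def mann_kendall(x):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumption: no tie correction, i.e. Var(S) = n(n-1)(2n+5)/18.
    """
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)  # sgn(x_j - x_i)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value under the standard normal approximation
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p

# Hypothetical annual maxima in 10^4 m^3/s (illustrative values only)
series = [6.1, 5.8, 6.4, 5.2, 5.5, 4.9, 5.1, 4.6, 4.8, 4.3]
s, z, p = mann_kendall(series)
print(s, round(z, 2), round(p, 4))
```

A trend is judged significant at the 0.05 level when |Z| exceeds 1.96, which is equivalent to p < 0.05.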

GAMLSS framework
The GAMLSS [9] is a flexible tool for analysing non-stationary behavior in time series, and it was applied in this study to model the time-varying moments of the distributions in flood frequency analysis. It supports a wider range of distributions than other frameworks, such as the Generalized Linear Models. The procedure for constructing the GAMLSS is briefly introduced below.
In the GAMLSS framework, the variable y is assumed to follow a distribution with PDF $f(y \mid \theta)$, in which the distribution parameters $\theta^{T} = (\theta_1, \theta_2, \ldots, \theta_p)$ can be described as functions of explanatory variables and random effects. When there are no additional terms, each parameter is related to the explanatory variables through a monotonic link function $g_k(\cdot)$, k = 1, 2, …, p, as
$$g_k(\theta_k) = X_k \beta_k$$
where $X_k$ denotes the explanatory variables, $\beta_k$ are the polynomial coefficients, p is the number of parameters and q is the degree of the polynomial. For a polynomial of degree q in time t, the right-hand side becomes $\beta_{0k} + \beta_{1k} t + \cdots + \beta_{qk} t^{q}$.

Non-stationary GEV model
The GEV distribution is widely used in the study of extreme values. It generalizes three extreme value distributions, namely the Fréchet, Weibull and Gumbel distributions, and its three parameters make it flexible enough to describe a wide range of tail behaviors, so it was adopted to fit the AMF series in this study. In the commonly used parameterization, the cumulative distribution function (CDF) and probability density function (PDF) of the GEV distribution are
$$F(x) = \exp\left\{-\left[1 + \zeta\,\frac{x-\mu}{\sigma}\right]^{-1/\zeta}\right\}$$
$$f(x) = \frac{1}{\sigma}\left[1 + \zeta\,\frac{x-\mu}{\sigma}\right]^{-1/\zeta - 1} \exp\left\{-\left[1 + \zeta\,\frac{x-\mu}{\sigma}\right]^{-1/\zeta}\right\}$$
for $1 + \zeta (x-\mu)/\sigma > 0$, where μ, σ and ζ are the location, scale and shape parameters, respectively. Under non-stationary conditions, using the GAMLSS, the GEV distribution parameters can be expressed as functions of time. In this study, only the location parameter was allowed to vary with time, while the scale and shape parameters were treated as constants.
To avoid over-parameterization, only q=1 and q=2 were considered in this study. The relationships between the location parameter and time are
$$\ln(\mu(t)) = a_0 + a_1 t$$
$$\ln(\mu(t)) = b_0 + b_1 t + b_2 t^{2}$$
where t is time, and $a_0$, $a_1$, $b_0$, $b_1$ and $b_2$ are polynomial coefficients. The link function ln(·) guarantees that μ(t) is positive and also keeps the coefficients small. The Akaike Information Criterion (AIC) [10] was used to compare the goodness of fit of the models.
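The likelihood behind the linear model, ln(μ(t)) = a_0 + a_1 t with constant scale and shape, can be sketched as below. This is not the GAMLSS fitting algorithm: it only evaluates the negative log-likelihood and AIC = 2k + 2·NLL for given parameter values, which would then be handed to an optimizer. It assumes the GEV convention F(x) = exp{-[1 + ζ(x-μ)/σ]^(-1/ζ)}, which may differ in sign from the shape parameter of a particular software package, and it assumes t is a year index (1, 2, …) rather than the calendar year. All function names and numbers are hypothetical.

```python
import math

def gev_cdf(x, mu, sigma, zeta):
    """GEV CDF under the assumed convention; Gumbel limit when zeta is ~0."""
    z = (x - mu) / sigma
    if abs(zeta) < 1e-12:
        return math.exp(-math.exp(-z))
    w = 1.0 + zeta * z
    if w <= 0.0:
        # Outside the support: below the lower bound (zeta > 0)
        # or above the upper bound (zeta < 0)
        return 0.0 if zeta > 0 else 1.0
    return math.exp(-w ** (-1.0 / zeta))

def gev_logpdf(x, mu, sigma, zeta):
    """Log of the GEV density; -inf outside the support."""
    z = (x - mu) / sigma
    if abs(zeta) < 1e-12:
        return -math.log(sigma) - z - math.exp(-z)
    w = 1.0 + zeta * z
    if w <= 0.0:
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / zeta) * math.log(w) - w ** (-1.0 / zeta)

def neg_log_lik(params, t, y):
    """Negative log-likelihood of the linear model: ln(mu(t)) = a0 + a1*t."""
    a0, a1, sigma, zeta = params
    if sigma <= 0.0:
        return math.inf
    total = 0.0
    for ti, yi in zip(t, y):
        mu_t = math.exp(a0 + a1 * ti)  # log link keeps mu(t) positive
        total -= gev_logpdf(yi, mu_t, sigma, zeta)
    return total

def aic(nll, k):
    """Akaike Information Criterion for k free parameters."""
    return 2.0 * k + 2.0 * nll

# Hypothetical flows (10^4 m^3/s) and parameter values, for illustration only
t = [1, 2, 3, 4, 5]
y = [5.2, 4.8, 6.1, 4.5, 5.0]
nll = neg_log_lik((math.log(5.0), -0.005, 0.8, -0.1), t, y)
print(nll, aic(nll, k=4))
```

The stationary model corresponds to a_1 = 0 with k = 3, and the quadratic model adds a t² term with k = 5; the model with the smallest AIC after optimization is preferred.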

Non-stationary return period
The return period is essential in flood frequency analysis and is widely used in flood risk assessment. Under the stationary assumption, the univariate return period is defined by
$$T = \frac{1}{1-P} \qquad (8)$$
where P is the non-exceeding probability for a given hydrologic variable. Under non-stationary assumptions, equation (8) may become invalid because the non-exceeding probability differs from year to year. The concept of return period has therefore been extended to non-stationary cases. In general, there are two ways to define the return period [11]. In what follows, $P_i$ denotes the exceeding probability of the design event in year i, i.e. the complement of the non-exceeding probability in that year.
The first definition is the expected waiting time (EWT) until the first event occurs. For a known design flood event, with the initial time set to x=0, X is defined as the waiting time until the event occurs for the first time, and the probability of X=x can be expressed as
$$P(X = x) = P_x \prod_{i=1}^{x-1} (1 - P_i), \quad x = 1, 2, \ldots \qquad (9)$$
The EWT return period T of the design event can thus be written as
$$T = E(X) = \sum_{x=1}^{\infty} x\, P_x \prod_{i=1}^{x-1} (1 - P_i) \qquad (10)$$
The second definition is that the expected number of events (ENE) in T years is one. For a period of T years, let N be the number of design events, whose expected value is
$$E(N) = \sum_{i=1}^{T} E(I_i) = \sum_{i=1}^{T} P_i \qquad (11)$$
where $I_i$ is an indicator variable that equals 1 if the design event is exceeded in year i and 0 otherwise. The ENE return period T is obtained by setting equation (11) equal to 1:
$$\sum_{i=1}^{T} P_i = 1 \qquad (12)$$
The EWT definition requires the exceeding probabilities over an infinite period, which are hard to predict, and the sum may fail to converge. Thus, only the ENE definition was used for calculation in this study.
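The ENE definition can be illustrated with a short sketch, assuming that the yearly exceeding probabilities of the design event are already available as a sequence and that T is taken as the first year in which their cumulative sum reaches one. The function name `ene_return_period` and the probability sequences are hypothetical.

```python
def ene_return_period(exceed_probs, tol=1e-12):
    """Smallest T such that the sum of the first T exceeding probabilities >= 1.

    Returns None if the expected number of events never reaches one within
    the supplied horizon.
    """
    total = 0.0
    for year, p in enumerate(exceed_probs, start=1):
        if not 0.0 <= p <= 1.0:
            raise ValueError("exceeding probabilities must lie in [0, 1]")
        total += p
        if total >= 1.0 - tol:  # tolerance guards against round-off
            return year
    return None

# Stationary case: a constant exceeding probability of 0.25 gives T = 4,
# matching T = 1 / (1 - P) with non-exceeding probability P = 0.75
print(ene_return_period([0.25] * 10))

# Hypothetical decreasing exceeding probabilities (downward trend in floods)
declining = [0.25 * 0.9 ** i for i in range(50)]
print(ene_return_period(declining))

# If the probabilities decay too quickly, the expected count never reaches one
print(ene_return_period([0.05 * 0.5 ** i for i in range(100)]))
```

The last call shows that under a strong decreasing trend a finite return period may not exist for a given design value, which is one reason to report the risk of failure over a design life alongside the return period.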

Risk of failure
In recent years, the risk of failure has become increasingly popular [12], and it offers a perspective on flood risk that differs from that of the return period. In practice, a hydraulic structure fails the first time a flood exceeding the design value occurs within its design life, and the risk may increase over time. For a design life of n years, the probability that a critical event occurs at least once, with time-varying exceeding probability $P_i$ in the ith year, is given as
$$R = 1 - \prod_{i=1}^{n} (1 - P_i)$$
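A minimal sketch of this calculation follows, assuming the usual form R = 1 - ∏(1 - P_i), i.e. one minus the probability that no exceedance occurs in any of the n years, with independent years. The function name `risk_of_failure` and the declining probability sequence are hypothetical.

```python
def risk_of_failure(exceed_probs):
    """Probability of at least one exceedance over the design life.

    Assumption: exceedances in different years are independent.
    """
    survival = 1.0
    for p in exceed_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("exceeding probabilities must lie in [0, 1]")
        survival *= 1.0 - p  # probability of no exceedance so far
    return 1.0 - survival

# Stationary 20-year event over a 20-year design life
stationary = risk_of_failure([0.05] * 20)
print(round(stationary, 2))  # about 0.64, i.e. 1 - 0.95**20

# Hypothetical exceeding probabilities that decline by 10% per year
declining = risk_of_failure([0.05 * 0.9 ** i for i in range(20)])
print(declining < stationary)
```

Because each factor (1 - P_i) is at most one, the risk can only grow as the design life lengthens, whatever the direction of the trend.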

Trend analysis
The Mann-Kendall test was first applied to check for a trend in the AMF series. At the 0.05 significance level, the critical values of the standardized test statistic are ±1.96. The statistic for the AMF series is -2.06, which indicates a significant decreasing trend. This result suggests that the stationary assumption can no longer be taken for granted for the streamflow series at the Yichang station, and that a non-stationary model is needed for hydrologic modelling in this region.

Parameter estimation
As defined before, three models were considered in this study: (1) a stationary model with constant parameters; (2) a non-stationary model with a polynomial of degree 1; and (3) a non-stationary model with a polynomial of degree 2. The flood peak was expressed in units of $10^{4}$ m$^3$/s for convenience of calculation. The estimated parameters and goodness-of-fit results are listed in Table 1. In Table 1, both model 2 and model 3 show a decreasing trend in the location parameter, which is consistent with the Mann-Kendall trend test. According to the AIC values, the non-stationary models perform better than the stationary model; however, increasing the degree of the polynomial does not yield a smaller AIC value. Hence, model 2 was selected as the non-stationary model for further analysis, and its results were compared with those of the stationary model.

Non-stationary flood risk assessment
Using equation (8) and equation (12), the return periods for the stationary and non-stationary models were calculated and compared; the design flood peak for a 20-year return period is plotted in Fig 1. The design value of the stationary model stays constant throughout the period, whereas that of the non-stationary model changes over time. The flood peak for model 2 is larger than that of model 1 during the first 22 years and smaller afterward. This may be attributed to the construction of upstream reservoirs, whose main function is flood control. Unlike the stationary return period, the non-stationary return period is defined for each year rather than for the whole series, so it can reflect the trend in the AMF series.
As shown in Table 2, the design values of the non-stationary model are smaller than those of the stationary model in all cases, indicating a safer flood condition under the non-stationary model. However, special attention must be paid because the time-varying trend obtained from past data may not continue for long, and the reliability of the predicted results needs to be carefully considered.
Given a design flood event with a 20-year return period under the stationary model, the risk of failure is plotted in Fig 2. The risk increases with the design life for both models, but it is far smaller for the non-stationary model than for the stationary one. The risk that a critical event exceeding the design value happens at least once in the next 20 years is 0.64 for the stationary model and 0.24 for the non-stationary model, which is consistent with the conclusion drawn from the return periods. The risk of failure thus turns out to be a useful tool for assessing flood events in non-stationary models, and it has an explicit physical meaning.
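The stationary value of 0.64 can be checked by hand, assuming a constant yearly exceeding probability of 1/20 for the 20-year event and independence between years:

```latex
\[
R_{\text{stationary}} = 1 - \prod_{i=1}^{20}\left(1 - \frac{1}{20}\right)
  = 1 - 0.95^{20} \approx 1 - 0.358 = 0.642
\]
```

The non-stationary value of 0.24 cannot be reproduced in the same way without the fitted parameters of model 2, since each yearly factor then depends on the time-varying location parameter.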

Conclusion and discussion
In this study, 64 years of the AMF series at the Yichang station were used for non-stationary flood frequency analysis, and the GAMLSS framework was applied to construct a non-stationary GEV distribution. The non-stationary return period and risk of failure were calculated and compared for flood risk assessment. The main conclusions are summarized below.
(1) The Mann-Kendall trend test detected a decreasing trend in the AMF series, indicating that the flow regime at the Yichang station has changed and that the flood conditions need to be reassessed.
(2) A non-stationary GEV model whose location parameter varies linearly with time was selected, and it performed better than the stationary model according to the goodness-of-fit test.
(3) Compared with the stationary model, the design flood peak decreased when the non-stationary return period was used, which indicates a safer flood condition under the non-stationary model. For a given design life, the risk of failure is also smaller under non-stationarity.
The findings above may be helpful for water resource management and flood prevention in the Yangtze River, and the methodology in this study can be applied to other river basins as well.