Evaluation of multisite synthetic data generated by spatial weather generator and long climate data series

In this paper a new validation test is presented for the spatial weather generator SWGEN, which produces multisite daily time series of solar radiation, temperature and precipitation. The method was tested by comparing statistics of 1000 years of generated data with a long series of 35 years of observed weather parameters from 24 meteorological stations in south-west Poland. The evaluation showed that the means (sums) and variances of the generated data were comparable with those of the observed climatic data aggregated by month, season and year.


Introduction
Compared with the well-known weather generators that produce data for a single station, a spatial weather generator is a much more sophisticated method from the mathematical point of view. The main reason for constructing a spatial weather generator is to evaluate complex processes in a given region while accounting for correlations among sites. The best-known applications of data simulated by spatial weather generators are in hydrology and water management. In particular, the future hydrology of river catchments, runoff response, and predictions are strongly affected by future climate and its possible change [1][2][3][4][5][6][7][8][9][10][11][12][13]. Daily flow simulation, particularly of seasonal extremes under future climate conditions given by different scenarios, is important for water management and hydrology [10][11][14][15][16][17][18]. These applications indicate that a high-quality spatial weather generator is required. High quality means that the generated data follow the same distribution as the climate data [10,19,20], so that basic statistical measures such as averages and variances are very close.
The paper presents a new evaluation of the spatial weather generator SWGEN based on long climate data series and 24 stations from south-west Poland.

Spatial weather generator − SWGEN
The simulation of spatial data by a spatial weather generator is widely described in the literature [1,6,13,16,17,21,22,23,24]. For several applications in Poland the spatial weather generator SWGEN is used as the best downscaling method to produce n years of synthetic daily data on the potentially possible weather course at k stations [9,25,26]. The SWGEN model generates total precipitation by means of a first-order Markov chain that determines the occurrence of wet and dry days; the amount of precipitation on wet days is then drawn from a multidimensional two-parameter gamma distribution [8,11,18] with parameters indexed by m and k, where m is the month number (m = 1, …, 12, i.e. January = 1, February = 2, ..., December = 12) and k is the location number. Daily values of solar radiation (SR), maximum temperature (Tmax) and minimum temperature (Tmin) are treated as a multidimensional AR(1) time series of the following form:

$$X_t = \Phi_m X_{t-1} + \varepsilon_t$$

where X_t and X_{t-1} are (3k × 1) vectors of standardized values of all three variables for days t and t-1, ε_t is a (3k × 1) vector of independent, normally distributed random components with a zero mean vector and covariance matrix Σ_m, and Φ_m (for m = 1, …, 12) is a matrix of parameters [8,11,17].
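The two components described above can be illustrated with a minimal sketch. It is not the SWGEN implementation: the precipitation part is reduced to a single site, and all function names, parameter names and numeric values are assumptions chosen for illustration. The matrix `noise_factor` is assumed to be a factor L of the covariance matrix (Σ_m = L Lᵀ), so that L z has the required covariance when z is standard normal.

```python
import random

def simulate_precip(n_days, p_wet_after_dry, p_wet_after_wet, shape, scale, seed=0):
    """Single-site daily precipitation: first-order Markov chain for
    wet/dry occurrence, two-parameter gamma for wet-day amounts.
    All parameters are hypothetical inputs for one month and one site."""
    rng = random.Random(seed)
    wet = False  # assumption: the day before the series starts was dry
    series = []
    for _ in range(n_days):
        # Transition probability depends only on the previous day's state
        p_wet = p_wet_after_wet if wet else p_wet_after_dry
        wet = rng.random() < p_wet
        # gammavariate(alpha, beta): alpha is the shape, beta the scale
        series.append(rng.gammavariate(shape, scale) if wet else 0.0)
    return series

def ar1_step(phi, x_prev, noise_factor, rng):
    """One step of X_t = Phi X_{t-1} + eps_t for standardized variables.
    eps_t = L z, with z standard normal and L = noise_factor."""
    n = len(x_prev)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [
        sum(phi[i][j] * x_prev[j] for j in range(n))
        + sum(noise_factor[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]

if __name__ == "__main__":
    rain = simulate_precip(30, 0.25, 0.60, shape=0.8, scale=5.0, seed=1)
    print(sum(1 for r in rain if r > 0.0), "wet days out of", len(rain))
    rng = random.Random(1)
    x = [0.0, 0.0]
    for _ in range(5):
        x = ar1_step([[0.6, 0.1], [0.1, 0.6]], x, [[0.8, 0.0], [0.2, 0.7]], rng)
    print(x)
```

In a multisite setting the vectors would have length 3k and the parameters would be switched by month; the sketch keeps them fixed for brevity.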

Data generation
According to the model of the spatial weather generator, the characteristics of the required meteorological parameters, namely solar radiation (SR), minimum and maximum temperature (Tmin and Tmax) and precipitation total (P), were estimated for each station in the area. Where solar radiation data were missing, values were obtained from sky cover measurements (IS) according to the Black formula [25,27]. Each analyzed meteorological parameter was characterized by its monthly mean value and standard deviation. In addition, the spatial correlations among variables from all stations were included in the characteristics. For stations lacking measurements of a given meteorological parameter, these characteristics were obtained by interpolation, using ordinary kriging and the inverse distance weighting method. The better interpolation technique was selected by cross-validation with a criterion based on the RMSE (root mean square error) [25,27,28]. Then, the spatial weather generator SWGEN was used to produce a new series of 1000 years of synthetic data for 24 stations, longer than in the previous studies [25][26][27].
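The selection of an interpolation technique by cross-validated RMSE can be sketched as follows. This is only an illustration under stated assumptions: leave-one-out is assumed as the cross-validation scheme, inverse distance weighting is implemented with an assumed power of 2, and a simple nearest-station estimate stands in for the second candidate, since ordinary kriging is too long to reproduce here. All function names and the sample coordinates are hypothetical.

```python
import math

def idw(points, values, target, power=2.0):
    """Inverse distance weighting estimate at `target` from known stations."""
    num = den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a station
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def nearest(points, values, target):
    """Nearest-station estimate; a stand-in second candidate, not kriging."""
    dists = [math.hypot(x - target[0], y - target[1]) for x, y in points]
    return values[dists.index(min(dists))]

def loo_rmse(interpolate, points, values):
    """Leave-one-out RMSE: predict each station from all the others."""
    sq = 0.0
    for i in range(len(points)):
        rest_p = points[:i] + points[i + 1:]
        rest_v = values[:i] + values[i + 1:]
        sq += (interpolate(rest_p, rest_v, points[i]) - values[i]) ** 2
    return math.sqrt(sq / len(points))

def select_method(methods, points, values):
    """Return the name of the method with the lowest RMSE, plus all scores."""
    scores = {name: loo_rmse(f, points, values) for name, f in methods.items()}
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    # Hypothetical station coordinates and monthly mean values
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    vals = [8.1, 8.4, 7.9, 8.3, 8.2]
    best, scores = select_method({"idw": idw, "nearest": nearest}, pts, vals)
    print(best, scores)
```

Any interpolator with the same `(points, values, target)` signature, including a kriging routine, could be passed to `select_method` in place of the stand-in.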
For the comparison of observed and generated data, the procedure shown in Fig. 1 is applied.

Fig. 1. Diagram of evaluation of generated data (see also [27]).

Meteorological data and study area
The data simulation using the scheme in Fig. 1 was applied to the south-west region of Poland in the basin of the Odra River, covering an area of approximately 65,000 km² (Fig. 2). Daily data of solar radiation, maximum and minimum air temperature, and total precipitation for a 35-year series (1981−2015) were obtained for 24 stations of the meteorological network within the study area from the Institute of Meteorology and Water Management − National Research Institute (Table 1).

Results
The SWGEN model was examined by comparing observed data with generated data. Daily data of solar radiation (SR), minimum and maximum temperature (Tmin, Tmax) and precipitation total (P) were aggregated to annual periods, two hydrologic seasons (April to September (H-S) and October to March (H-W)) and 12 monthly periods (January, ..., December) for all 24 stations. For both the synthetic data and the observed climate data, means (sums) and standard deviations (SD) were determined for each period. These computations used 1000 years of generated data and 35 years of observed data. Next, absolute differences (abs) between observed and generated parameters (mean, SD), as well as relative absolute differences (Rel) in the form Rel = |observed − generated| × 100% / observed, were evaluated [13,29].
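The comparison described above can be sketched in a few lines. The function names are hypothetical, the inputs are assumed to be per-year aggregates (for example annual sums or monthly means, one value per year), and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
import statistics

def hydrologic_season(month):
    """Map a month number (1..12) to the hydrologic season label:
    April to September is H-S, October to March is H-W."""
    return "H-S" if 4 <= month <= 9 else "H-W"

def rel_abs_diff(observed, generated):
    """Rel = |observed - generated| * 100% / observed, as in the text.
    Note: the denominator is the observed value itself, so the measure
    is undefined when it is zero."""
    return abs(observed - generated) * 100.0 / observed

def compare_series(observed, generated):
    """Compare mean and SD of two series of per-period aggregates."""
    result = {}
    for name, stat in (("mean", statistics.mean), ("SD", statistics.stdev)):
        obs, gen = stat(observed), stat(generated)
        result[name] = {
            "observed": obs,
            "generated": gen,
            "abs": abs(obs - gen),
            "Rel": rel_abs_diff(obs, gen),
        }
    return result

if __name__ == "__main__":
    # Hypothetical annual precipitation sums (mm), not data from the study
    obs = [560.0, 610.0, 585.0, 640.0]
    gen = [570.0, 600.0, 590.0, 650.0, 580.0]
    print(compare_series(obs, gen))
    print([hydrologic_season(m) for m in range(1, 13)])
```

The two series may have different lengths, as with 35 observed and 1000 generated years, because only their summary statistics are compared.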
The results are presented in Table 2. The study and its results are comparable to earlier research [25][26]. However, owing to the much longer data series used for the test, the higher number of stations and the larger study area, the new results are significant. Table 2 shows that the spatial weather generator satisfactorily reproduces daily values of meteorological variables with proper averages, sums and standard deviations.
The averages of generated solar radiation (SR) and of both temperatures (Tmin, Tmax) correspond to the observed ones. As in previous research, the standard deviation of the simulated maximum temperature (Tmax) was slightly higher than that of the observed Tmax values. The error, defined as the relative absolute difference (Rel), does not exceed 7%.
The spatial weather generator was also evaluated in terms of its ability to reproduce the precipitation distribution, by comparing the totals and standard deviations of generated and observed daily precipitation for all stations.
Although the simulations show good agreement between observed and generated precipitation totals, some uncertainties were identified for the standard deviations.