Identification of PM10 air pollution origins at a rural background site

Trajectory cluster analysis and the concentration weighted trajectory (CWT) approach were applied to investigate the origins of PM10 air pollution recorded at a rural background site in north-eastern Poland (Diabla Góra). Air mass back-trajectories were computed with the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model for the 10-year period 2006–2015. A cluster analysis grouped the back-trajectories into 7 clusters. Most of the trajectories correspond to fast and moderately moving westerly and northerly flows (45% and 25% of the cases, respectively). However, significantly higher PM10 concentrations were observed for slow moving easterly (11%) and southerly (20%) air masses. The CWT analysis shows that high PM10 levels are observed at the Diabla Góra site when air masses originate from or pass over the heavily industrialized areas of Central-Eastern Europe located to the south and south-east of the site.


Introduction
Air pollutant concentrations recorded at a specific site may vary, depending on the species, on time scales from minutes to decades. Over longer time scales the changes are determined by seasonal or annual cycles in the activity of emission sources, while on a time scale of days or less the concentration variability is closely related to the immediate history of the air before it arrives at the sampling point [1]. Back-trajectory analysis comprises a number of methods to identify the transport pathways of air masses affecting a study site as well as to determine potential source areas of air pollution. To analyse the association between trajectories and the concentrations of various species in air arriving at a site, a number of trajectory classification methods have been applied. These can generally be split into the following methodological groups: (1) assigning trajectories to air mass sectors and comparing the air pollution composition in those sectors; (2) clustering the trajectories using a multivariate statistical technique and analysing the concentrations at the receptor site for each trajectory class; (3) residence time analysis, which generates a probability density function identifying the likelihood that an air mass would traverse a given region en route to the site of interest over a given time period [2]; this often includes conditional probability densities linking residence times and the concentrations measured at the study site.
A number of studies have reported the use of back-trajectory analyses for determining long range transport of air pollution (see e.g. [3,4]) or characterizing Saharan dust outbreaks in the Mediterranean Basin (see e.g. [5,6]) and the Canary Islands (e.g. [7]). However, there is a lack of studies of air mass back-trajectories for Central-European countries. In the scarce previous studies performed for this region, probable sources of particulate matter (PM) air pollution were identified by trajectory classification into wind sectors at an urban site in Zagreb, Croatia [8] as well as by back-trajectory analysis for selected days with the highest PM concentrations observed in Polish urban areas [9]. A cluster analysis was also carried out for the rural background site in Belsk, Poland [10] as well as for urban sites located in Bucharest, Romania and Szeged, Hungary [11]. Different methods of trajectory interpolation were used for Polish urban sites in Gdańsk and Katowice as well as for the rural background site in Diabla Góra [12]. Moreover, cluster analysis, potential source contribution function (PSCF) and concentration weighted trajectory (CWT) methods were used to identify the main atmospheric circulation pathways influencing Black Carbon (BC) [13] and aerosol particle number concentration (PNC) [14] observed in Lithuania. Dvorska et al. [15] applied the PSCF method to identify potential source areas of persistent organic pollutants (POPs) at the rural background site in Kosetice (Czech Republic).
The aim of the present paper is to investigate potential source areas of air pollution contributing to PM10 levels recorded at a Central-European rural background site during the long-term period of 2006-2015. Cluster analysis of back-trajectories has been applied to assess the main transport pathways of air masses, while the concentration weighted trajectory approach has been used to identify potential PM10 source regions.

Study area
In this study air mass back-trajectories arriving over the monitoring site of Diabla Góra, Poland were investigated. Diabla Góra is a rural background monitoring site operated within the EMEP air quality network (European Monitoring and Evaluation Programme). The site is located in the north-eastern part of Poland (54° 08′ 00′′ N, 22° 04′ 01′′ E, 157 m a.s.l.), far from densely populated or heavily trafficked areas (Fig. 1). The nearest town (>10 000 inhabitants) and road (>50 cars per day) are situated 20 km and 16 km away from the site, respectively.

Back-trajectories calculations
Three-dimensional 96-hour back-trajectories of air parcels arriving at the Diabla Góra site at a height of 200 m above sea level were computed at 00, 06, 12 and 18 UTC each day during the period 2006-2015. The trajectories were calculated with the PC version of the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model [16]. The meteorological data used for the trajectory calculations were provided by the ECMWF (European Centre for Medium-Range Weather Forecasts).

Cluster analysis
The k-means procedure, which is the most commonly used form of non-hierarchical clustering for trajectory studies, is an iterative algorithm that uses a specified number of clusters, k, to partition the data by comparing each object to the arithmetic mean of all the members of each cluster (the cluster centroid). The optimal number of clusters that best describes the different air flow patterns is selected by computing the percentage change in within-cluster variance as a function of the number of clusters [17]. Trajectories are assigned to a given cluster by minimising the internal variability within each group of trajectories and maximising the external variability between different groups, using the Root Mean Square Deviation (RMSD) of the co-ordinates of each trajectory from its cluster centroid [2].
In the present study the k-means cluster analysis of back-trajectories was carried out following the procedure described in detail by Orza et al. [18]. Hourly latitude and longitude were used as input variables in the clustering procedure, while the similarity metric was based on great-circle distances. The appropriate number of clusters was selected by obtaining the best solution for each number of clusters (from a large number of initializations) and then analysing the percentage change in the total RMSD when reducing the number of clusters one by one.
Daily PM10 data were assigned to a cluster for a given day if three out of the four trajectories for that day fell into the same cluster. Non-parametric pairwise Mann-Whitney tests with the Dunn-Šidák adjustment for multiple comparisons were used to test the significance of inter-cluster variation in PM10 concentration.
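The clustering steps described above can be illustrated with a minimal pure-Python sketch. It is not the implementation of Orza et al. [18]; the function names, the haversine formula used for the great-circle distance, the pointwise arithmetic mean used as the centroid (which assumes trajectories do not cross the date line) and the definition of the total RMSD as the sum of per-trajectory RMSD values are all assumptions made for illustration.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def rmsd_km(traj, centroid):
    """Root mean square great-circle deviation of a trajectory from a centroid."""
    sq = [great_circle_km(p, c) ** 2 for p, c in zip(traj, centroid)]
    return math.sqrt(sum(sq) / len(sq))

def mean_trajectory(members):
    """Pointwise arithmetic mean of latitude and longitude (assumed centroid)."""
    n = len(members)
    return [(sum(t[i][0] for t in members) / n, sum(t[i][1] for t in members) / n)
            for i in range(len(members[0]))]

def kmeans_trajectories(trajs, k, n_init=20, n_iter=100, seed=0):
    """Return (labels, centroids, total_rmsd) for the best of n_init random starts."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        centroids = [list(t) for t in rng.sample(trajs, k)]
        labels = None
        for _ in range(n_iter):
            new_labels = [min(range(k), key=lambda j: rmsd_km(t, centroids[j]))
                          for t in trajs]
            if new_labels == labels:  # assignments stable: converged
                break
            labels = new_labels
            for j in range(k):
                members = [t for t, lab in zip(trajs, labels) if lab == j]
                if members:  # keep the previous centroid if a cluster empties
                    centroids[j] = mean_trajectory(members)
        total = sum(rmsd_km(t, centroids[lab]) for t, lab in zip(trajs, labels))
        if best is None or total < best[2]:
            best = (labels, centroids, total)
    return best

def rmsd_percent_change(trajs, k_max):
    """Percentage increase in total RMSD when going from k to k-1 clusters."""
    total = {k: kmeans_trajectories(trajs, k)[2] for k in range(1, k_max + 1)}
    return {k: 100.0 * (total[k - 1] - total[k]) / total[k]
            for k in range(2, k_max + 1)}
```

Each trajectory is a list of (latitude, longitude) pairs of equal length. A large value of `rmsd_percent_change(trajs, k_max)[k]` suggests that merging from k to k-1 clusters would combine dissimilar flow patterns, so k clusters should be retained.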
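The daily assignment rule and the multiple-comparison adjustment can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the adjustment is written in its standard form p_adj = 1 - (1 - p)^m for m comparisons, and the raw pairwise p-values are assumed to come from a separate Mann-Whitney routine (for example `scipy.stats.mannwhitneyu`), which is not reproduced here.

```python
from collections import Counter

def assign_daily_cluster(labels_for_day, min_agree=3):
    """Return the cluster shared by at least min_agree of the day's
    trajectories, or None if no cluster reaches that count."""
    if not labels_for_day:
        return None
    cluster, count = Counter(labels_for_day).most_common(1)[0]
    return cluster if count >= min_agree else None

def sidak_adjust(p_values):
    """Dunn-Sidak adjusted p-values for a family of m raw p-values."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]
```

With 7 clusters there are 21 pairwise comparisons, so each raw p-value would be adjusted with m = 21. Days for which `assign_daily_cluster` returns `None` are left out of the inter-cluster comparison.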

Concentration weighted trajectory
The concentration weighted trajectory (CWT) is a method of weighting trajectory residence times with the associated air pollutant concentrations [19]. This procedure assigns to each grid cell a weighted concentration obtained by averaging the sample concentrations associated with the trajectories that crossed that grid cell, i.e. each concentration is used as a weighting factor for the residence time, in that grid cell, of the trajectories corresponding to that concentration, and the sum is then divided by the cumulative residence time of all trajectories in the cell. The CWT method employs an arbitrary weight function to minimize the inaccuracy caused by a small number of polluted trajectories.
In summary, weighted concentration fields show concentration gradients across potential sources and help to identify the relative significance of potential sources [19].
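The CWT computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure of [19]: the residence time in a cell is approximated by the number of hourly trajectory endpoints falling in it, the grid is a regular latitude-longitude grid with a hypothetical cell size `cell_deg`, and the optional `weight` function (which down-weights sparsely visited cells) is left to the user because the text only states that it is arbitrary.

```python
import math
from collections import defaultdict

def cwt_field(trajectories, concentrations, cell_deg=1.0, weight=None):
    """Concentration weighted trajectory field on a regular lat/lon grid.

    trajectories   -- list of trajectories, each a list of (lat, lon) endpoints
    concentrations -- receptor concentration associated with each trajectory
    weight         -- optional function of the endpoint count in a cell
    Returns a dict mapping (lat_index, lon_index) to the CWT value.
    """
    weighted_sum = defaultdict(float)  # sum of concentration * residence time
    residence = defaultdict(float)     # cumulative residence time (endpoint count)
    for traj, conc in zip(trajectories, concentrations):
        for lat, lon in traj:
            cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            weighted_sum[cell] += conc
            residence[cell] += 1.0
    field = {}
    for cell, tau in residence.items():
        w = weight(tau) if weight is not None else 1.0
        field[cell] = w * weighted_sum[cell] / tau
    return field
```

For example, a cell crossed once by a trajectory associated with 30 μg/m³ and once by a trajectory associated with 10 μg/m³ receives a CWT value of 20 μg/m³ when no weight function is applied.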

Cluster analysis
Trajectories arriving at Diabla Góra at 200 m were grouped into 7 clusters (Fig. 2). The clusters correspond to the major advection patterns at the study site and are named according to the main direction they come from. Most of the trajectories correspond to westerly flows (44% of the cases), bringing air masses from the northern part of the North Sea, through Denmark and northern Poland (NW, 19.6%), from northern France, central Germany and the western part of Poland (WSW, 15.5%), as well as from the Atlantic Ocean, through Great Britain, the southern part of Denmark and the northern part of Poland (W, 9.1%). The second type of trajectories groups the northerly flows (over 25%), bringing air masses from the Scandinavian Peninsula, passing over the northern part of Russia, Latvia and Lithuania (NNE, 16.8%), as well as from the southern part of Greenland, through Scandinavia and the southern Baltic Sea (NNW, 8.6%). The third type of flow includes short (regional) trajectories, bringing air masses from highly polluted southern Poland (Upper Silesia and Lesser Poland) through the central part of the country (Sreg, 19.4%). The last type of trajectories includes the air masses coming from the western part of Russia, through the Belarusian-Ukrainian border and the eastern part of Poland (E, 11.2%).
The highest mean PM10 concentrations (26 μg/m³) observed at the Diabla Góra site are associated with slow moving easterly flows (Fig. 3). As these flows are mainly associated with the westward extension of the Siberian high located to the east of the study site, cluster E is also associated with the highest mean air pressure of 1001 hPa (with a maximum of 1044 hPa) and one of the lowest mean air temperatures of 5.0 °C (with a minimum of -21.5 °C). Quite high PM10 concentrations (25 μg/m³) are also related to much warmer (9.2 °C, with a minimum of -11.5 °C) slow moving air masses from the south (Sreg). In contrast, the lowest mean PM10 concentrations (10 μg/m³) are recorded when air masses arrive with fast moving NNW flows (Fig. 3).
Overall, the Mann-Whitney test results have shown that mean PM10 concentrations associated with slow moving easterly and southerly air masses (clusters E and Sreg) were significantly higher compared to those observed for other clusters.

Concentration weighted trajectory
The concentration weighted trajectory map for PM10 at Diabla Góra during the study period is shown in Fig. 4. Cells with high (above 30 μg/m³) and moderate (above 20 μg/m³) CWT values were found south and south-east of the Diabla Góra site, respectively, in good accordance with the previous analysis of PM10 levels by advection pattern. The high CWT values for the short S and E trajectories indicate stable situations with reduced removal of pollutants and slow air movement over Central-Eastern Europe due to the influence of high-pressure centres. As a result, polluted air masses with a high PM10 content originating from the southern and eastern parts of Europe may accumulate over several days under anticyclonic conditions before reaching the study site.

Summary and conclusions
A database of 96-h back-trajectories arriving at the north-eastern part of Poland (Diabla Góra) at 200 m, computed for the 10-year period from January 2006 to December 2015, was used to describe the main flow patterns over the study site, as well as to identify PM10 air pollution origins at the rural background site. A cluster analysis grouped the back-trajectories into 7 clusters. Most of the trajectories correspond to fast and moderately moving westerly (45% of the cases) and northerly (25%) flows, followed by the short (regional) trajectories bringing air masses from the south (20%) and from the east (11%). Mean PM10 concentrations associated with slow moving easterly and southerly air masses are significantly higher than those observed for other clusters. The CWT approach has identified the heavily industrialized areas in Central-Eastern Europe as probable PM10 source areas.
This work was partially supported by the Dean of the Faculty of Building Services, Hydro and Environmental Engineering, Warsaw University of Technology within grant no. 504/02637/1110/42.000100. The authors also gratefully acknowledge the NOAA Air Resources Laboratory (ARL) for the provision of the HYSPLIT model and the European Centre for Medium-Range Weather Forecasts (ECMWF) for making available the ERA-Interim database.

Fig. 2. Cluster centroids of the trajectories reaching Diabla Góra, Poland at 200 m in the period 2006-2015. The percentage of trajectories occurring in each cluster is shown in parentheses.

Fig. 3. Boxplots of PM10 concentrations (μg/m³) for each trajectory cluster arriving at Diabla Góra, Poland. The horizontal lines and dots indicate the median and mean values, respectively; the boxes cover the 25th-75th percentiles. The length of the whiskers is 1.5 times the interquartile range.