Cluster and correlation analyses of the chemical composition of groundwater near MSW landfills

In areas of the European North with low population density and poorly developed transport infrastructure, it is difficult to organize waste removal and storage at large landfills. Many settlements therefore have small municipal solid waste (MSW) landfills, which are very often sited in swamps, and this leads to the migration of pollutants into adjacent rivers. Environmental monitoring near such landfills is hampered by the high cost of periodic water sampling and chemical analysis. The number of substances analysed, and thus the cost, can be reduced by identifying marker substances whose concentrations in groundwater near the landfill are stably correlated with the content of other pollutants. The article analyses pollutant concentrations in groundwater at 45 MSW landfills around the world. Cluster and correlation analyses made it possible to identify marker substances and establish the target correlation dependences.


Introduction
In areas of the European North with low population density and poorly developed transport infrastructure, it is difficult to organize waste removal and storage at large landfills. As a result, a small waste landfill arises near almost every settlement, and siting such landfills in swamps is very common. This leads to the migration of pollutants into adjacent rivers.
The sorption and desorption of pollutants by peat have not yet been studied sufficiently. According to various authors, some substances are firmly retained in peat, while others are transported by groundwater. Moreover, the effect of peat composition, temperature and other factors on the experimental results was usually not taken into account [1,2,3].
Researchers have not yet reached a consensus on which processes prevail in peat after waste dumping ends, and monitoring has been carried out in the vicinity of operating landfills, i.e. under conditions of a constant supply of pollutants.
One of the authors of this article previously established that, in swamps of the Arkhangelsk region exposed to impacts from hydrolysis and forest industry enterprises, pollutants such as oil products, phenols and ammonium salts spread through the entire depth of the peat deposit, and their content can exceed the maximum permissible concentration (MPC) by factors of tens to hundreds. Long-term monitoring showed that, after waste storage and wastewater discharge ended, the release of pollutants from the swamp was still not complete after 16 years, and the prevailing factor in reducing their concentration in the peat deposit was removal by groundwater [2].
Thus, the study of the chemical composition of groundwater and the identification of specific pollutant substances will make it possible to establish the dependences necessary for predicting the distribution of pollutants at the base of the MSW landfills in small settlements.

Analysis of scientific publications on groundwater pollution at the base of MSW landfills
To determine the specific pollutants in groundwater at the base of MSW landfills, scientific publications by Russian and foreign researchers were reviewed. Data on 45 locations in different parts of the world were summarized: 6 in Africa, 5 in Europe, 4 in the Middle East, 16 in Asia and 14 in Russia (Figure 1).
The analysis identified 22 substances found in groundwater at concentrations significantly exceeding the standard values. The widest range of substances at the base of a single landfill was observed in Sri Lanka (19 substances); the fewest (4) were detected at landfills in Sweden, Iran and Pakistan. This can apparently be explained by the specific features of domestic waste management in a given country, as well as by differences in monitoring systems and in requirements for the chemical analysis of water samples.
Pollutant concentrations in groundwater also vary over a wide range. For example, the concentration of zinc varies from 0.4·10⁻³ mg/l to 632 mg/l, i.e. by a factor of about 1.6 million, and the concentration of cobalt from 6·10⁻³ mg/l to 1.12 mg/l, i.e. by a factor of about 187.

Cluster analysis of the effect of climatic factors on groundwater pollution
The search for a statistically significant dependence of pollutant concentrations in groundwater on climatic factors began with a cluster analysis in the SPSS Statistics 23.0 software, the first stage of which was to select groups of objects with similar values of the factors.
The classification features for hierarchical clustering were the average annual temperature, °C (attribute X), and the average annual precipitation, mm (attribute Y).
Clustering was based on a quantitative assessment of the similarity of objects, expressed as a measure of distance between them. The Euclidean distance was taken as this measure:
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2},$$
where $x_i$, $x_j$ and $y_i$, $y_j$ are the coordinates (values of attributes X and Y, respectively) of the i-th and j-th objects.
At the first step of the cluster analysis, the elements having the smallest measure of distance were combined, and they formed primary clusters. Then, at each step, the nearest element (or a cluster) was attached to each primary cluster. The process ended when all the elements merged into one cluster.
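The merging procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used SPSS Statistics, the site values below are hypothetical, and the distance between two clusters is taken as the smallest pairwise Euclidean distance (single linkage), a choice the text does not specify. The function names are illustrative.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two sites given as (temperature, precipitation)."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Cluster-to-cluster distance is the smallest pairwise distance
    (single linkage); this is an assumption, not taken from the paper.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical sites: (average annual temperature, °C; average annual precipitation, mm)
sites = [(26.5, 2100.0), (27.0, 1950.0), (25.0, 2300.0),
         (1.3, 610.0), (4.0, 560.0), (9.5, 650.0)]
groups = agglomerate(sites, n_clusters=2)
print(groups)
```

With these hypothetical values the sites separate into a wet group and a dry group. Because the attributes are used in their raw units, precipitation dominates the distance; whether the attributes were standardized in the study is not stated.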
The results of clustering, which made it possible to distinguish two groups of landfills according to the climatic conditions of their locations, are presented in Figure 2 and Table 1.
Statistical processing of the pollutant concentration data and clustering by two other parameters (the area of the MSW landfill and the average annual precipitation) did not give statistically reliable dependences and did not allow appropriate groups to be identified. The methodology of the statistical analysis was the same as for clustering by temperature and precipitation.
Probably, in addition to the above-mentioned specific features of municipal waste management and the climatic conditions, the geological and hydrogeological conditions of the landfill location had an essential effect. The contribution of this last factor cannot be estimated from the available publications.
To compare the extent of groundwater contamination at the base of the landfills in the two climatic groups, the 18 of the 22 pollutants that are present in groundwater samples in both groups were selected. In each group, the data on the average pollutant concentration were statistically processed, and the logarithms of the multiplicity of MPC exceedance were calculated (Table 2).
A groundwater contamination index was determined for each group as the arithmetic mean of the logarithms of the pollutants' MPC exceedance multiplicities. The extent of groundwater pollution turned out to be slightly lower in Group 1. The lower pollutant concentrations, and correspondingly lower MPC exceedance multiplicities, are probably caused by stronger dilution of groundwater by abundant atmospheric precipitation: the average annual precipitation is 2041 mm in Group 1 and 606 mm in Group 2.
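The index described above can be illustrated with a minimal sketch. It assumes decimal logarithms (the base is not stated for this step) and uses hypothetical concentrations and MPC values that are not taken from Table 2; the function name is illustrative.

```python
import math

def contamination_index(concentrations, mpc):
    """Arithmetic mean of log10(C / MPC) over the listed pollutants."""
    logs = [math.log10(concentrations[name] / mpc[name])
            for name in concentrations]
    return sum(logs) / len(logs)

# Hypothetical group-average concentrations and MPC values, mg/l
mpc = {"Zn": 1.0, "Pb": 0.01, "Fe": 0.3}
group = {"Zn": 10.0, "Pb": 1.0, "Fe": 0.3}

index = contamination_index(group, mpc)
print(round(index, 3))
```

Averaging logarithms is equivalent to taking the geometric mean of the exceedance multiplicities, so an index of 1 corresponds to a typical tenfold exceedance and a single extreme pollutant does not dominate the result.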

Cluster analysis and identification of correlations among pollutant concentrations
To establish correlations among the concentrations of various pollutants and to identify marker pollutants, whose known concentrations could be used for a statistically reliable estimate of the concentrations of other substances, a multivariate statistical analysis was performed for the entire data set. Based on the analysis results, three clusters were identified (Figure 5).
Cluster A includes heavy metal cations, many of which are highly toxic. Compounds of these metals are widely used in the chemical industry, electroplating, etc. Their sources in waste are electrical products such as lamps and batteries.
Cluster B is represented by ammonium cations, alkali and alkaline earth metals, and sulfate, nitrate, chloride and phosphate ions. The salts formed by these ions are mainly soluble compounds. They can enter water from natural minerals (calcite, apatite), and they are also contained in products of the chemical industry, in particular in mineral fertilizers.
Cluster C represents organic compounds: oil products and phenols. The sources of oil products in groundwater are effluents from industrial enterprises and petrol stations and, at landfills, containers with waste lubricants. Phenolic compounds can originate from the industrial wastewater of chemical enterprises. In addition, phenolic compounds are formed under natural conditions during the transformation of native aromatic compounds, mainly macromolecular ones (lignin).

Correlation analysis of pollutant concentration in Cluster A
For the correlation analysis in Cluster A, decimal logarithms of the concentrations were calculated and a correlation matrix was built, in which relationships with a Pearson correlation coefficient significant at p < 0.10 were singled out.
The matrix analysis showed that three elements can be identified as marker pollutants: zinc (Zn), lead (Pb) and iron (Fe), which have the greatest number of statistically significant links with other pollutants (Figure 5). Figure 6 shows the dependence of the Pb concentration on the Fe content (a) and of the Cu concentration on the Pb content (b); the remaining dependences are presented in Table 3.
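The selection of markers by counting significant links can be sketched as follows. This is an illustration only: the concentrations are hypothetical, significance is checked by comparing the t statistic of the Pearson coefficient with tabulated two-sided critical values of Student's t at the 0.10 level (the study used SPSS, whose exact procedure is not described), and all names are illustrative.

```python
import math
from itertools import combinations

# Two-sided critical values of Student's t at alpha = 0.10, by degrees of freedom
T_CRIT_010 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015,
              6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812}

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def significant_links(data):
    """Return {(a, b): r} for pairs whose log10 concentrations correlate at p < 0.10."""
    logs = {k: [math.log10(v) for v in vals] for k, vals in data.items()}
    links = {}
    for a, b in combinations(logs, 2):
        r = pearson(logs[a], logs[b])
        df = len(logs[a]) - 2  # must be between 1 and 10 for the table above
        t = abs(r) * math.sqrt(df / (1.0 - r * r)) if abs(r) < 1.0 else math.inf
        if t > T_CRIT_010[df]:
            links[(a, b)] = r
    return links

# Hypothetical concentrations at six sites, mg/l
data = {
    "Fe": [0.5, 2.0, 8.0, 30.0, 120.0, 400.0],
    "Pb": [0.01, 0.03, 0.2, 0.5, 2.0, 9.0],
    "Zn": [0.05, 0.4, 1.0, 6.0, 20.0, 90.0],
    "Cd": [0.02, 0.001, 0.01, 0.002, 0.02, 0.001],
}
links = significant_links(data)

# Number of significant links per substance; the highest counts suggest markers
counts = {name: 0 for name in data}
for a, b in links:
    counts[a] += 1
    counts[b] += 1
print(counts)
```

In this hypothetical data set the substance with no significant links would not be a candidate marker, while the others are linked to each other and could serve as proxies.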

Correlation analysis of concentration of pollutants in Cluster B
The correlation analysis in Cluster B was performed by analogy with that of Cluster A. Analysis of the correlation matrix showed that two elements can be identified as marker substances: sodium (Na) and potassium (K), whose concentrations have the greatest number of statistically significant links with the concentrations of the other substances (Figure 5). The chloride ion, however, has no stable correlations with these marker substances and should be treated as a separate marker. The coefficients of the regression equations relating the concentrations of the marker substances to those of other pollutants are presented in Table 4. Figure 7 shows the dependence of the Mg concentration on the K content (a) and of the nitrate ion concentration on the chloride ion content (b).
In Cluster C, the correlation analysis revealed no statistically significant correlation between the two substances (Figure 5).
Thus, the analysis of groundwater contamination at the base of MSW landfills made it possible to identify three clusters of pollutants, to determine marker substances, and to establish statistically significant correlations between their concentrations and the concentrations of other pollutants. To verify the dependences obtained, the results of groundwater monitoring at the base of the Arkhangelsk city landfill were used. Eleven pollutants were monitored: K, Na, Mg, Ca, ammonium, chloride, nitrate and sulphate ions, Pb, Fe and Cu. Chemical analysis of the groundwater samples showed that the pollutant concentrations at the Arkhangelsk landfill correspond to the correlation dependences derived from the literature data.
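How a marker substance might be used to estimate another pollutant, as in the verification described above, can be sketched as follows. The form of the regression is an assumption (a straight line in decimal logarithms, in keeping with the log-transformed correlation analysis); the K and Mg values are hypothetical and are not the coefficients of Table 4, and the function names are illustrative.

```python
import math

def fit_loglog(marker, other):
    """Least-squares fit of log10(other) = a + b * log10(marker)."""
    x = [math.log10(v) for v in marker]
    y = [math.log10(v) for v in other]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

def predict(a, b, marker_value):
    """Estimate the other pollutant's concentration from the marker's."""
    return 10 ** (a + b * math.log10(marker_value))

# Hypothetical K and Mg concentrations from literature sites, mg/l
k = [1.0, 10.0, 100.0, 1000.0]
mg = [2.0, 20.0, 200.0, 2000.0]
a, b = fit_loglog(k, mg)

# Estimate Mg at a monitored site from its measured K, then compare
# the estimate with the measured Mg to check the dependence
estimate = predict(a, b, 50.0)
print(round(estimate, 2))
```

Comparing such an estimate with the measured concentration at an independent site is one simple way to check whether a dependence derived from literature data holds locally.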

Conclusions
The analysis of scientific publications made it possible to determine the list of pollutants entering groundwater from MSW landfills. These substances were divided into three groups, and marker substances were identified in each group; from their concentrations, the correlation dependences make it possible to estimate the content of the other substances of that group in groundwater.