Modelling Arsenic and Lead Surface Soil Concentrations using Land Use Regression

Land Use Regression (LUR) models are increasingly used in environmental and exposure assessments to predict the concentration of contaminants in outdoor air. We explore the use of LUR as an alternative to more complex models to predict the concentration of metals in surface soil. We used As and Pb concentrations measured in surface soil samples collected at 55 sites across British Columbia (BC), Canada, between 1994 and 1996 by the Ministry of Environment. Predictor variables were derived for each sample site using a Geographic Information System (GIS). For As (R² = 0.44), the resulting linear regression model includes the total length of roads (m) within 25 km and bedrock geology. For Pb (R² = 0.78), the predictor variables are the total surface area of industrial land use (m²) within 5 km, the emissions of Pb (t) within 10 and 25 km, and the presence of closed mines within 50 km. The study suggests that LUR can reasonably predict the concentrations of As and Pb in surface soil over large areas.


Introduction
Models that interpolate metal concentrations in surface soils are often data and computer intensive, demand elaborate statistical analyses (Davis et al., 2009), and are typically applied to small geographic areas. In air pollution modelling, simple linear regression, usually described as Land Use Regression (LUR), has been used to predict ambient concentrations from basic geographically derived independent variables, with results comparable to those of more complex methods (Jerrett et al., 2005). More recently, Wu et al. (2010) applied the LUR approach to predict the concentration of Pb in soil within a city.
This study uses the LUR method to predict concentrations of As and Pb in surface soil across BC, Canada. Model variables include natural and anthropogenic emission sources and controlling factors. Once developed, the LUR models can be used to predict concentration levels wherever the relevant geographical variables are available, and thereby to derive a modelled concentration surface to aid exposure assessment in health-related studies.

Data
Surface soil samples at 55 sites in BC were collected between 1994 and 1996 by the Ministry of Environment. The samples were analyzed by ICP-OES after an aqua regia digestion. Multiple samples taken at the same location were averaged, and values below detection limits were excluded. Table 1 shows the concentration range, mean and median for both As and Pb.
[...] bedrock geology, etc.) were also added to the first fitted models. Next, backward and stepwise selections were performed to identify the most significant variables. Then, variables that were collinear (variance inflation factor > 10), not significant, or had a relationship contrary to that expected were eliminated. These two steps were repeated until the final models included only significant variables and all the regression assumptions were met, including normally distributed residuals with low spatial autocorrelation.
The sensitivity of the models was examined by bootstrap analysis to assess how the model parameters (coefficients and R²) vary when the data used to fit the LUR models are resampled. Random selections of the soil concentration data, with replacement, were repeated 10,000 times, and the 95% confidence interval of each parameter was recorded. Statistical analyses were conducted using the open-source software R 2.14.1.
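The bootstrap step described above can be illustrated with a minimal sketch. It is not the authors' R code: it is written in Python, uses a single predictor for brevity (the fitted models have several), and runs on synthetic data generated only for the example. The function names `r_squared` and `bootstrap_r2`, the percentile method for the interval, and the variable names are assumptions made for this illustration.

```python
import random


def r_squared(x, y):
    """R² of the least-squares line y = a + b*x (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return None  # degenerate sample with no variation: R² is undefined
    return sxy ** 2 / (sxx * syy)


def bootstrap_r2(x, y, n_iter=10_000, level=0.95, seed=0):
    """Resample (x, y) pairs with replacement and return the lower percentile
    bound, the mean, and the upper percentile bound of R²."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]  # sites drawn with replacement
        r2 = r_squared([x[i] for i in idx], [y[i] for i in idx])
        if r2 is not None:
            stats.append(r2)
    stats.sort()
    lower = stats[int((1 - level) / 2 * (len(stats) - 1))]
    upper = stats[int((1 + level) / 2 * (len(stats) - 1))]
    return lower, sum(stats) / len(stats), upper


# Synthetic stand-in for 55 sites (NOT the study data): a hypothetical
# log-transformed road length and a noisy log concentration that depends on it.
_rng = random.Random(42)
log_road_length = [_rng.uniform(8.0, 14.0) for _ in range(55)]
log_concentration = [0.3 * v + _rng.gauss(0.0, 0.6) for v in log_road_length]

if __name__ == "__main__":
    low, mean, high = bootstrap_r2(log_road_length, log_concentration)
    print(f"R² 95% interval: {low:.2f} to {high:.2f} (mean {mean:.2f})")
```

Resampling whole sites (pairs of predictor and response) rather than residuals keeps each site's predictor values attached to its measured concentration, which matches the description of resampling the soil concentration data with replacement.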

Results
The final LUR models explain 44% and 78% of the variation in the surface soil concentrations of As and Pb, respectively. Tables 3 and 4 summarize the model parameters. The significant predictors for As (with the variance explained by each variable in parentheses) are the length of local roads (log) within 25 km (28%) and the bedrock geology (9%). The predictors for Pb are industrial land use within 5 km (10%), the industrial emissions of Pb within 25 km (log) (4%) and 10 km (7%), and the presence of closed mines within 50 km (12%). Table 5 shows the bootstrap 95% confidence intervals for the R² and the individual variable coefficients. The R² ranges from 0.3 to 0.9 with a mean of 0.63 for As and from 0.35 to 0.95 with a mean of 0.80 for Pb. For As, bedrock geology is sensitive to the resampling, as its coefficient varies from negative to positive.

Discussion
This study shows the potential of using LUR to predict the concentration of As and Pb in surface soil at regional levels. Overall model performance (R² = 0.44 for As and R² = 0.78 for Pb) is similar to that obtained for air pollution models (Jerrett et al., 2005).
For the As model, bedrock geology reflects the natural source of the metalloid in soil, which agrees with other studies (Garcia-Sanchez and Alvarez-Ayuso, 2003; Grosz et al., 2004); however, the bedrock variable is qualitative and assumes a continuous and homogeneous As concentration within each bedrock unit, which may not be the case (Garcia-Sanchez and Alvarez-Ayuso, 2003). The road variable may be a surrogate for regional human activities related to the use of As-containing pesticides and wood preservatives, the combustion of fossil fuels (mainly in coal plants), and mining activities, all of which are known to increase the concentration of As in the surrounding soils (Wang and Mulligan, 2006; Garcia-Sanchez and Alvarez-Ayuso, 2003). However, the relatively higher mobility of As in soil may explain the lower performance of the As model compared to the Pb model, as past emissions may have leached down from the surface.
For Pb, the predominant predictors are the industrial land use surface area and the Pb emission point sources from the National Pollutant Release Inventory (NPRI). This finding agrees with other studies that demonstrated a strong influence of industrialization on Pb concentrations in surface soil (Murray et al., 2004; Salonen and Korkka-Niemi, 2007; Wu et al., 2010) and is reflected in the map (Figure 2). In fact, Wu et al. (2010) used a spatial approach similar to that of this study at the city scale, and found strong relationships between Pb concentration levels in soil and both proximity to roads and land use type. In contrast, this preliminary Pb model does not include any transportation-related variables, which is unexpected given the legacy of leaded gasoline emissions from vehicles reported by several studies (Adriano, 2001; Wu et al., 2010).
Unlike more complex dispersion models, the LUR approach does not typically consider factors controlling the deposition rate and direction from sources. Many studies have reported that topographic roughness, wind speed and wind direction influence atmospheric deposition, and thus the concentration of As or Pb in soil (Adriano, 2001; Fritsch et al., 2010). The use of circular buffers to define geographic variables can also result in patterns in the predicted surface that may not reflect the actual distribution.
In this study, there is a discrepancy between the dates of the soil sample collection (1994-1996) and those of certain predictor variables (see Table 2). The predictor variables were used as indicators of human activities, and we assume that changes in the transportation network, population density, and land uses are minimal. For the NPRI, the emission levels are used as an indicator of the location of the source. The emission data are reported from 1994 onward, but we assume that the emission sources were active prior to the beginning of the monitoring program and could be used as emission intensity indicators. Finally, the maps represent the situation at the time of the data collection (1994-1996). Any remediation or contamination that occurred after the data collection is not accounted for in the present study.
Nonetheless, LUR soil models for As and Pb show promise as screening tools for remediation strategies and, more importantly, for human exposure assessments.

Table 1. Descriptive Statistics (µg/g)

At each sample location, the predictor variables (Table 2) were derived using a range of circular buffer sizes (100 m to 50 km). Other site-specific variables were derived using the sample location value of the geographic variables (elevation, bedrock geology, etc.).
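As an illustration of how a buffer-based predictor such as "emissions of Pb (t) within 10 and 25 km" might be derived, the following minimal Python sketch sums point-source emissions inside circular buffers around a sample site. It is only a sketch under stated assumptions: the study derived its variables in a GIS, whereas here coordinates are assumed to be already projected to metres, distances are planar, the function names `sum_within_buffer` and `buffer_variables` are hypothetical, and the site and source values are invented for the example.

```python
import math


def sum_within_buffer(site, sources, radius_m):
    """Sum the emissions (t) of point sources lying within radius_m of the site.

    site:    (x, y) in projected metres (assumption: planar coordinates)
    sources: iterable of (x, y, emission_t) tuples
    """
    site_x, site_y = site
    total = 0.0
    for x, y, emission_t in sources:
        if math.hypot(x - site_x, y - site_y) <= radius_m:
            total += emission_t
    return total


def buffer_variables(site, sources, radii_m):
    """Return one candidate predictor per buffer radius, keyed by radius in metres."""
    return {radius: sum_within_buffer(site, sources, radius) for radius in radii_m}


# Invented example (NOT study data): one sample site and three emission sources
# located 5 km, 20 km and 40 km away from it.
example_site = (0.0, 0.0)
example_sources = [
    (3_000.0, 4_000.0, 1.5),
    (20_000.0, 0.0, 0.5),
    (24_000.0, 32_000.0, 4.0),
]
example_predictors = buffer_variables(
    example_site, example_sources, radii_m=[10_000, 25_000, 50_000]
)

if __name__ == "__main__":
    for radius, emission in example_predictors.items():
        print(f"Pb emissions within {radius / 1000:.0f} km: {emission} t")
```

Computing the same quantity at several radii yields a family of candidate predictors per site, from which a variable selection procedure can retain the buffer sizes that best explain the measured concentrations.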

Table 3. Regression Analysis Results for Arsenic
a Other categorical variable values are not shown here

Table 4. Regression Analysis Results for Lead

Table 5. Bootstrap Results for As and Pb
a Obtained after 10,000 iterations
b Only the significant factor is presented in this table