Study on the relationship between PM2.5 and the interaction between air pressure and temperature based on the GAM model

Abstract. This paper studies the influence of the interaction of two factors on PM2.5, an important air quality monitoring index. Specifically, a GAM model that relates the interaction of air pressure and temperature to the PM2.5 value is used to obtain the nonlinear relationship between air pressure, temperature and PM2.5. The GAM model has a high degree of fit, that is, the interaction of air pressure and temperature has a considerable impact on the PM2.5 value. Therefore, the interaction between pressure and temperature can be used to predict the response variable PM2.5 accurately.


Introduction
PM2.5 is the main cause of smog, and air pressure and temperature both affect its concentration. Empirically, the PM2.5 concentration is high in winter, when the temperature is low, and relatively low in summer, when the temperature is high. The same pattern emerges from analyses of the change characteristics of PM2.5 and of the meteorological conditions during pollution episodes. Higher air pressure is conducive to the accumulation of PM2.5 and the formation of polluted weather, so the PM2.5 concentration increases [1]. While a high concentration is being maintained, however, the results can be very different: when the high-altitude and ground pressure fields form favorable diffusion conditions, PM2.5 is removed quickly and its concentration falls.
The main content of this article is to build a GAM model based on the pressure, temperature and PM2.5 in the meteorological data of 45 cities. For the last 4 cities, the established GAM model is used to make predictions and compute the residuals, and the R language [2][3] is used to compare the predicted values with the true values visually, so as to quantitatively study the influence of the interaction of influencing factors on the change of PM2.5 concentration [4].

The GAM model
We assume that the scalar responses $y_i$, $1 \le i \le n$, are independent of each other. Here $h_1$ and $h_2$ are the lengths between the two observation points in the observation interval, assuming that there is a regular observation grid between the two observation points.
The two main-effect functions of model (2) can be expanded in spline bases, and the interaction term can be represented by the tensor product of two univariate spline bases, so the estimation method used in this paper is based on smoothing spline bases.
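As a hedged illustration of the tensor-product construction mentioned above, the following Python sketch builds an interaction basis from two univariate bases by taking, for each observation, the Kronecker product of its two basis rows. The function name `tensor_product_basis` and the toy basis values are assumptions made for illustration; they are not taken from the paper.

```python
def tensor_product_basis(B1, B2):
    """Row-wise Kronecker product of two basis matrices.

    B1 is n x k1 (e.g. a pressure basis) and B2 is n x k2 (e.g. a
    temperature basis). Row i of the result holds every product
    B1[i][j] * B2[i][l], giving an n x (k1 * k2) design matrix for
    the interaction term.
    """
    if len(B1) != len(B2):
        raise ValueError("both bases must have one row per observation")
    return [[a * b for a in row1 for b in row2] for row1, row2 in zip(B1, B2)]


# Toy basis values for two observations (illustrative only, not real data).
pressure_basis = [[1.0, 0.0], [0.5, 0.5]]
temperature_basis = [[0.2, 0.3, 0.5], [0.0, 1.0, 0.0]]

interaction_basis = tensor_product_basis(pressure_basis, temperature_basis)
for row in interaction_basis:
    print(row)
```

Each row of the interaction basis has k1 * k2 entries, so the number of coefficients grows multiplicatively with the sizes of the two marginal bases.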
Here the basis functions are chosen appropriately, $\otimes$ denotes the Kronecker product, which is also called the tensor product, and the coefficient vectors are column vectors. In addition, the confidence interval can be calculated as the estimated function plus or minus twice its standard error, obtained from the estimated Bayesian posterior covariance matrix.
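The interval construction described above can be sketched as follows, assuming the fitted smooth at a point is a linear combination of basis values and that a covariance matrix of the coefficients is available. The function name `pointwise_band`, the default half-width of two standard errors, and all numbers are illustrative assumptions, not values from the paper.

```python
import math


def pointwise_band(basis_row, coef, cov, width=2.0):
    """Return (lower, fit, upper) for a smooth evaluated at one point.

    fit      = basis_row . coef
    variance = basis_row' * cov * basis_row
    band     = fit +/- width * sqrt(variance)
    """
    k = len(basis_row)
    fit = sum(b * c for b, c in zip(basis_row, coef))
    variance = sum(
        basis_row[i] * cov[i][j] * basis_row[j]
        for i in range(k)
        for j in range(k)
    )
    se = math.sqrt(variance)
    return fit - width * se, fit, fit + width * se


# Hypothetical basis values, coefficients and covariance (illustration only).
basis_row = [1.0, 2.0]
coef = [0.5, 0.25]
cov = [[0.04, 0.0], [0.0, 0.01]]

lower, fit, upper = pointwise_band(basis_row, coef, cov)
print(lower, fit, upper)
```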

Modeling the interaction between PM2.5 and air pressure or temperature
The research objects are the time series of PM2.5 concentration, pressure and temperature in 49 cities from January 1st, 2017 to December 31st, 2017. The daily pressure and temperature and the annual average PM2.5 concentration values in the meteorological data of 45 cities are used to establish the GAM model, and the last 4 cities are used to test it, so as to study the relationship between air pressure, temperature and PM2.5 concentration change [6].
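The split described above (45 cities for fitting, the last 4 for testing) and the residual computation can be sketched in Python as below. The city labels are placeholders and the helper names `split_cities` and `residuals` are hypothetical; the paper itself works in R and does not show its code.

```python
def split_cities(cities, n_test=4):
    """Keep the last n_test cities for testing and the rest for fitting."""
    if not 0 < n_test < len(cities):
        raise ValueError("n_test must be between 1 and len(cities) - 1")
    return cities[:-n_test], cities[-n_test:]


def residuals(observed, predicted):
    """Residual = observed value minus predicted value, element by element."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


# Placeholder labels for the 49 cities (the real city names are not given here).
cities = [f"city_{i:02d}" for i in range(1, 50)]
train_cities, test_cities = split_cities(cities)
print(len(train_cities), len(test_cities))
```

Holding out whole cities, rather than random days, checks whether the fitted relationship carries over to locations the model has not seen.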

Determine whether the pressure and temperature are related to the PM2.5 concentration and whether there is a linear relationship
The GAM models were established with the daily average air pressure and temperature values of 45 cities in 2017 as predictors and the corresponding annual average PM2.5 concentration values as the response [7]. A B-spline basis was used to obtain smooth regression functions of air pressure and temperature and to draw the effect diagrams of the influence of air pressure and temperature on PM2.5 concentration; that is, mod1 and mod2 were established.
Note 2: The abscissa represents the true value interval of temperature, and the ordinate is the smooth fitting value of temperature to PM2.5 (smoothing the true value). The number in parentheses is the estimated degree of freedom, the dotted lines are the upper and lower limits of the confidence interval, and the solid line represents the smooth fitting curve of PM2.5 concentration.
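To make the B-spline basis mentioned above concrete, here is a minimal sketch of the Cox-de Boor recursion in plain Python. The knot vector is an assumption chosen only to span a 75 to 105 kPa pressure range; the source does not state which knots or spline degree were used.

```python
def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of a given degree at x.

    Uses the Cox-de Boor recursion. Returns len(knots) - degree - 1 values.
    Knot intervals are half-open, so x equal to the last knot yields zeros.
    """
    # Degree 0: indicator functions of the knot intervals.
    basis = [
        1.0 if knots[i] <= x < knots[i + 1] else 0.0
        for i in range(len(knots) - 1)
    ]
    for d in range(1, degree + 1):
        raised = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            # Terms with a zero denominator (repeated knots) are taken as 0.
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = (
                (knots[i + d + 1] - x) / right_den * basis[i + 1]
                if right_den > 0
                else 0.0
            )
            raised.append(left + right)
        basis = raised
    return basis


# Assumed clamped cubic knot vector over 75-105 kPa (illustrative only).
pressure_knots = [75.0] * 4 + [85.0, 95.0] + [105.0] * 4
values_at_90 = bspline_basis(90.0, pressure_knots, 3)
print(values_at_90)
```

Inside the knot range the basis values are non-negative and sum to one, so the fitted smooth at any pressure is a weighted average of the spline coefficients.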
When the estimated degree of freedom is 1, the function is a linear equation, indicating a linear relationship between the influencing factor and the response variable PM2.5. When the degree of freedom is greater than 1, the function is a nonlinear curve, indicating a nonlinear relationship between the influencing factor and the PM2.5 concentration changes; the larger the value, the more pronounced the nonlinearity.
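One common way to define the estimated degrees of freedom discussed above, assuming the smooth is fitted by penalized least squares, is as the trace of the hat matrix. The symbols $X$ (basis design matrix), $S$ (penalty matrix) and $\lambda$ (smoothing parameter) are introduced here for illustration and do not appear in the source.

```latex
\hat{\beta} = (X^{\top}X + \lambda S)^{-1} X^{\top} y, \qquad
A = X (X^{\top}X + \lambda S)^{-1} X^{\top}, \qquad
\mathrm{edf} = \operatorname{tr}(A)
```

Under this assumption, a small $\lambda$ leaves the smooth nearly unpenalized and the edf close to the number of basis coefficients. With a second-derivative penalty, a very large $\lambda$ shrinks the smooth toward a straight line, which corresponds to the linear case with one degree of freedom once the constant is absorbed by the intercept.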
The results in Figures 1 and 2 show that the estimated degrees of freedom given on the vertical axes are all greater than 1, so the relationship between air pressure and the PM2.5 concentration is nonlinear [8]. When the air pressure is between 75 kPa and 85 kPa, the PM2.5 concentration is steady; when the air pressure is between 85 kPa and 105 kPa, the PM2.5 concentration shows an increasing trend. Temperature and the PM2.5 concentration likewise have a nonlinear relationship: as the temperature increases from -20 to 30 °C, the PM2.5 concentration changes little and tends to be stable. This also shows that the PM2.5 concentration is affected not only by temperature but also by other factors.
After a series of analyses of the established models, $R^2$ is 0.272 in model 3, but its parameters do not pass the significance test (p value greater than 0.01), while the parameters of model 4 are statistically significant. In model 5, $R^2$ is 0.733 and the interpretation rate is 76.9%, but it fails the significance test. Model 4 is therefore better suited to studying the effect of the interaction of influencing factors on the PM2.5 concentration changes.

Test model
Use AIC criteria to test the fit of the model
The AIC criterion and ANOVA are both standards for measuring the goodness of model fit: the smaller the AIC value and the residual error, the better the model fits. According to Figures 6 and 7, model 4 has the smaller AIC value, and its residuals in the ANOVA are also small. This supports the conclusion that the interpretation rate and degree of fit of model 4 are higher, that is, the interaction of influencing factors explains a larger share of the PM2.5 concentration changes.
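As a rough illustration of the AIC comparison described above, the sketch below computes AIC = 2k - 2 ln L for a Gaussian model from its residuals, using the maximum-likelihood variance estimate. The residual values and the parameter count are made up for illustration. For a GAM, k would typically be based on the effective degrees of freedom, which is an assumption here rather than something the source states.

```python
import math


def gaussian_aic(residuals, n_params):
    """AIC = 2k - 2 ln L for a Gaussian model with ML variance RSS / n."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    if n == 0 or rss <= 0.0:
        raise ValueError("need at least one residual and a positive RSS")
    sigma2 = rss / n
    # Maximized Gaussian log-likelihood given the ML variance estimate.
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return 2.0 * n_params - 2.0 * log_lik


# Made-up residuals for two hypothetical candidate models (not paper results).
residuals_a = [0.5, -0.5, 0.5, -0.5]
residuals_b = [1.0, -1.0, 1.0, -1.0]

aic_a = gaussian_aic(residuals_a, n_params=3)
aic_b = gaussian_aic(residuals_b, n_params=3)
print(aic_a, aic_b)
```

With the same number of parameters, the model with smaller residuals gets the smaller AIC, which matches the reading of AIC and residual error given in the text.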