Research on the air quality prediction model of Wuhai mining area based on deep learning

With the large-scale, high-intensity mining of coal resources in the Wuhai mining area, soil destruction and rock erosion have intensified, stripping large amounts of surface soil from the mine body and seriously damaging the surface vegetation, which has seriously degraded environmental quality in and around the mine. This paper studies early warning of air quality in the Wuhai mining area and constructs Deep Recurrent Neural Network (DRNN) and Deep Long Short-Term Memory Neural Network (DLSTM) air quality prediction models based on the screened weather factors. A comparison of the simulation results shows that DLSTM predicts better than DRNN, with a prediction accuracy of 92.85%. The model is able to accurately predict the values and trends of various air pollutant concentrations in the Wuhai mining area.


Introduction
Wuhai, located in the southwest of Inner Mongolia on the western edge of the Ordos massif, is one of China's important coal production bases. The city holds large underground reserves of coking coal, accounting for 58.8% of the total in the region. However, in recent years, with the vigorous exploitation of mineral resources, the ecological environment of Wuhai has been severely damaged and has deteriorated rapidly.

Data sources
In this paper, the weather data for the Wuhai area, comprising air quality data and meteorological data, are obtained with a web crawler from www.pm25.in/wuhai. Hourly monitoring data for the period from 1 January 2015 to 4 December 2019 are used. At each time point, three stations collect data simultaneously, that is, 24 times a day per station, for a total of 72 records per day across the three stations.

Data standardization
The initial data are standardized to avoid computational failure caused by abnormal convergence of the model on an unstandardized dataset. Because the distribution of the original data is unknown, Z-score standardization is chosen for the raw data.
The principle of Z-score standardization is to compute the mean and standard deviation of the data and to represent each original value by a value in an interval around zero. The standardized data have a mean of 0 and a standard deviation of 1. The transformation is given by formula (1):

$$x^{*} = \frac{x - \bar{x}}{\sigma} \quad (1)$$

where $x$ is the original value, $\bar{x}$ is the mean of all elements, and $\sigma$ is the standard deviation.
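The standardization step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `z_score`, the choice of the population standard deviation, and the sample readings are all assumptions made for the example.

```python
from statistics import fmean, pstdev


def z_score(values):
    """Standardize a sequence to zero mean and unit standard deviation.

    Assumption: the population standard deviation is used; the paper
    does not say whether the population or sample form was applied.
    """
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(v - mean) / std for v in values]


# Hypothetical hourly PM2.5 readings, purely illustrative
# (not taken from the paper's dataset).
pm25 = [35.0, 42.0, 38.0, 51.0, 47.0, 40.0]
standardized = z_score(pm25)
```

Each pollutant and meteorological series would be standardized separately in this way, so that series with very different units and ranges enter the network on a comparable scale.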

Correlation analysis and prediction factor selection
In this paper, principal component analysis is used to select prediction factors from a large number of data items, keeping those that have a greater impact on air pollutant concentration. Seasonal effects on air quality are also considered in order to ensure the accuracy of the air quality prediction models. Data from 00:00 on July 4, 2018 to 00:00 on July 10 are selected, including air quality data such as PM2.5, O3, CO, NO2, PM10, and SO2 concentrations, and MATLAB R2014a is used to analyze the air quality data and meteorological data. The correlation values are shown in Table 1. The correlation between data items is determined by the correlation coefficient. As shown in Table 1, taking PM2.5 as an example, its correlation with the meteorological factors is low, it is significantly correlated with CO, SO2, NO2, and O3, and it is highly correlated with PM10. Based on the calculated correlation coefficients, PM10, PM2.5, O3, CO, NO2, and SO2 are finally selected as the input data of the prediction model.
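Selection by correlation coefficient, as described above, can be illustrated with a short Python sketch. It assumes the Pearson coefficient (the paper does not name the coefficient used), and the function names, the 0.5 threshold, and the toy series are hypothetical choices for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def select_factors(target, candidates, threshold=0.5):
    """Keep candidates whose |r| with the target reaches the threshold.

    Assumption: the 0.5 cut-off is illustrative; the paper does not
    state the threshold it applied.
    """
    return [
        name
        for name, series in candidates.items()
        if abs(pearson(target, series)) >= threshold
    ]


# Toy series, not measurements from the paper.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "PM10": [2.0, 4.0, 6.0, 8.0, 10.0],         # moves with the target
    "wind_speed": [1.0, -1.0, 0.0, -1.0, 1.0],  # uncorrelated with it
}
selected = select_factors(target, candidates)
```

With these toy series only the candidate that tracks the target survives the threshold, which mirrors how weakly correlated meteorological items are dropped from the model inputs.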

Deep recurrent neural network (DRNN)
The Deep Recurrent Neural Network is a deep learning algorithm whose main purpose is to model sequence data.
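To make the idea of a deep recurrent network concrete, the following sketch runs the forward pass of a generic stacked tanh recurrent network in plain Python. It is an assumption-laden illustration: the paper does not give its layer equations, so the Elman-style update, the layer sizes, the random weights, and all function names here are hypothetical, and no training is shown.

```python
import math
import random


def rnn_layer(inputs, w_in, w_rec, bias):
    """Run one tanh recurrent layer over a sequence of input vectors.

    Returns the hidden state at every time step.
    """
    hidden_size = len(bias)
    h = [0.0] * hidden_size  # initial hidden state
    outputs = []
    for x in inputs:
        # New state depends on the current input and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
                + bias[j]
            )
            for j in range(hidden_size)
        ]
        outputs.append(h)
    return outputs


def deep_rnn(inputs, layer_params):
    """Stack recurrent layers: each layer consumes the previous layer's states."""
    sequence = inputs
    for w_in, w_rec, bias in layer_params:
        sequence = rnn_layer(sequence, w_in, w_rec, bias)
    return sequence


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights, for demonstration only."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 6 inputs (one per pollutant), 4 hidden units, 2 layers.
rng = random.Random(0)
input_size, hidden_size, num_layers = 6, 4, 2
layer_params = []
size = input_size
for _ in range(num_layers):
    layer_params.append(
        (
            random_matrix(hidden_size, size, rng),
            random_matrix(hidden_size, hidden_size, rng),
            [0.0] * hidden_size,
        )
    )
    size = hidden_size

# Five time steps of standardized inputs (random placeholders).
sequence = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(5)]
states = deep_rnn(sequence, layer_params)
```

The depth comes from feeding each layer's sequence of hidden states into the next layer, while the recurrence within a layer carries information from one hour to the next.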

Simulation results and analysis
The data of March 1, 2019 are selected as the test set and fed into the trained Deep Recurrent Neural Network prediction model, and the predicted values of the model are then compared with the true values. The result is shown in Figure 1. The prediction results of the deep recurrent neural network air quality forecasting model for each forecast item are recorded in Table 2, which lists the prediction accuracy for different numbers of hidden layers. It can be seen from Table 2 that the prediction accuracy of the model is affected by the number of hidden layers. When the number of hidden layers is 8, the accuracy of the prediction model is the highest, reaching 88.70%.
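The paper does not state how prediction accuracy is computed. As an illustration only, one common convention is to report 1 minus the mean absolute percentage error (MAPE) between true and predicted values; the sketch below assumes that definition, and the function name and sample values are hypothetical.

```python
def mape_accuracy(actual, predicted):
    """Accuracy defined as 1 - MAPE.

    Assumption: this definition is a common convention, not one
    confirmed by the paper.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a true value is zero")
    mape = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)
    return 1.0 - mape


# Illustrative values, not results from the paper.
true_values = [100.0, 200.0]
predictions = [90.0, 220.0]
accuracy = mape_accuracy(true_values, predictions)  # each error is 10%, so 0.9
```

Under this definition, comparing models with different numbers of hidden layers amounts to evaluating the same function on each model's test-set predictions.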

Deep long short-term memory neural network (DLSTM)
Although the prediction model based on the recurrent neural network performs well in air quality prediction, its processing efficiency begins to decline when it encounters data with a longer output sequence. To avoid this problem, LSTM was chosen to build the air quality prediction model.
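The gating mechanism that lets an LSTM retain information over longer sequences can be sketched as a single time step of the standard LSTM cell. This is a generic textbook formulation in plain Python, offered under the assumption that the authors used a conventional LSTM; the parameter layout, gate names, and demo values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell.

    params maps each gate name to a tuple (W, U, b): input weights,
    recurrent weights, and bias (hypothetical layout for this sketch).
    """

    def affine(gate):
        w, u, b = params[gate]
        return [
            sum(w[j][i] * x[i] for i in range(len(x)))
            + sum(u[j][k] * h_prev[k] for k in range(len(h_prev)))
            + b[j]
            for j in range(len(b))
        ]

    f = [sigmoid(z) for z in affine("forget")]       # how much old memory to keep
    i = [sigmoid(z) for z in affine("input")]        # how much new content to write
    o = [sigmoid(z) for z in affine("output")]       # how much memory to expose
    g = [math.tanh(z) for z in affine("candidate")]  # proposed new content

    # The cell state is updated additively, which helps preserve
    # information across many time steps.
    c = [f[j] * c_prev[j] + i[j] * g[j] for j in range(len(c_prev))]
    h = [o[j] * math.tanh(c[j]) for j in range(len(c))]
    return h, c


def zero_params(input_size, hidden_size):
    """All-zero parameters, used only to make the demo easy to check by hand."""
    def gate():
        return (
            [[0.0] * input_size for _ in range(hidden_size)],
            [[0.0] * hidden_size for _ in range(hidden_size)],
            [0.0] * hidden_size,
        )

    return {name: gate() for name in ("forget", "input", "output", "candidate")}


# With zero weights every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved at each step.
h, c = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], zero_params(1, 2))
```

A deep LSTM stacks such cells in the same way as the recurrent layers of the DRNN, with each layer's hidden states serving as the next layer's inputs.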

Simulation results and analysis
The data of March 1, 2019 are selected as the test set and fed into the trained deep long short-term memory neural network prediction model, and the predicted values are then compared with the true values. The results are shown in Figures 7-12. The prediction results of the deep long short-term memory neural network air quality prediction model for each prediction item are shown in Table 3. Table 3 shows that the accuracy of the LSTM-based prediction model is affected by the number of hidden layer nodes. When the number of hidden layers is 7, the prediction result is the best, with an accuracy of 92.85%. Compared with the air quality prediction model based on the deep recurrent neural network, the model based on deep LSTM predicts AQI and the SO2, CO, NO2, O3, PM10, and PM2.5 concentrations more accurately, achieving the expected effect.

Conclusions
This paper uses DRNN and DLSTM to build air quality prediction models and conducts experiments on the collected data. The experimental results show that the accuracy of the DLSTM-based air quality prediction model is 92.85%, which is higher than the 88.70% achieved by the DRNN-based model. The DLSTM model can forecast air quality and its changing trend in the Wuhai mining area. This result can provide strong technical support for ecological security in the Wuhai mining area, and it is important for standardizing production and accelerating the ecological recovery of mining areas.