Research on Population Spatialization Method Based on PMST-SRCNN

How to improve the accuracy of population spatialization by using downscaling technology has always been a difficult issue in academic research. A population spatialization model constructed from only a global or only a local perspective has inherent limitations: it cannot capture both the local and the global characteristics of the population distribution. Based on the counties of Chongqing municipality in 2010, this paper uses two steps, the "removing-rough" rasterization of partitioned multivariate statistical regression and the "getting-accuracy" of a super-resolution convolutional neural network, to construct a coupled population spatialization model that learns global and local features, and compares it with four other schemes. The results show that the mean square error and root mean square error of the coupled model of partitioned multivariate statistical regression and super-resolution convolutional neural network are the lowest, especially in densely populated areas. The study shows that although the super-resolution convolutional neural network has a good ability to learn downscaling, it still does not reflect the heterogeneity of population spatial patterns well, and coupling multilevel global feature learning models with the super-resolution convolutional neural network can make up for this to a certain extent.


Introduction
Population spatialization is a technique that converts the population statistics of a study area at a given point in time into a spatial distribution close to the real population, overcoming the spatial limitation of census data. At present, population spatialization data are widely used in disease and disaster management, urban planning, etc. [1]. Downscaling was first used to transform climate data from low resolution to high resolution [2], and the spatialization of statistical population is also a downscaling process. Population downscaling techniques can be roughly divided into two categories: global models and local models.
In the global model, the area-weighted model meets low-resolution data requirements by calculating the average population within an area. Traditional mathematical models [3][4][5] are supported by mathematical theory, but they cannot describe the complexity of the spatial distribution of population. Spatial interpolation methods [6][7][8][9] can transform demographic data into fine spatial units. With the development of remote sensing and GIS technology, multi-source heterogeneous data such as land cover data [10], buildings [11], residential areas [1], and night lights have been used in spatial interpolation models to make the population spatialization results more realistic, the most common approach being partition density mapping [12][13]. In recent years, partition density mapping methods have begun to be combined with machine learning methods represented by tree models [14] and with multiple regression models. The random forest model is widely used in population spatialization research because of its high flexibility [15][16] and the measurability of the importance of variables [17][18][19][20]. The multiple regression model [21][22] considers the correlation between different influence factors [23][24]. Because global models have difficulty characterizing the heterogeneity of population spatial distribution, local models such as geographically weighted regression [23][24] and the super-resolution convolutional neural network (SRCNN) have recently been introduced into population spatialization practice. However, the simulation accuracy of the geographically weighted regression model is reduced by complex terrain [23]. Based on the commonality of single-image super-resolution and downscaling technology [25], Zong introduced the SRCNN model to population spatialization research for the first time, in a study of Shanghai, and obtained better results [26].
Although various downscaling technologies have greatly improved the results of population spatialization, its accuracy is still an open problem in the academic community [27][28]. A single model can only extract either local or global features, and it cannot get rid of the influence of complex terrain [23]. Therefore, this paper integrates global and local models. It adopts two steps, "removing-rough" with three types of rasterization and "getting-accuracy" with a super-resolution convolutional neural network, to construct a coupled population spatialization model and produce a 500m resolution population spatialization of Chongqing for 2010.

Research area and data processing

Introduction to the study area
Chongqing is located in southwestern China, with an area of 82,400 square kilometers. The terrain descends gradually from the north and south toward the Yangtze River valley. The main stream of the Yangtze River runs from west to east across the whole territory. The northwest and central parts are mainly hills and low mountains, and the Wuling Mountains lie in the southeast. As of 2018, Chongqing had a total resident population of more than 30 million, ranking first among China's cities, and had become the most attractive city for population after Beijing, Shanghai, Guangzhou and Shenzhen. Chongqing's population distribution is constrained by the complexity of the terrain, and its spatial heterogeneity is obvious. Therefore, choosing Chongqing as the study area provides a good test of the regional adaptability of the spatialization model.

Data sources and data processing
The population data set used in this article includes demographic data and high-resolution raster population data. The district- and county-level household registration population data are from http://www.nature.com/sdata [29]. The 2010 urban/rural population data were obtained from the household registration data of the Chongqing Data website http://www.cqdata.gov.cn/.
The high-resolution raster population, used as model label data (for model verification), is derived from the WorldPop website https://www.worldpop.org.
The population distribution is affected by both the natural environment and socioeconomic factors. This paper selects elevation, slope, land use type, residential area, night lights, and rivers as the main factors affecting population distribution. The vector data of rivers, roads, railways, and administrative boundaries are derived from free streetscape data; land use data are derived from global land cover data with a resolution of 30m (GlobeLand30); night light, residential area, elevation, and slope data are derived from the 100m resolution covariate data set of the WorldPop website, in which the residential area data is a map of urban and rural residential areas with a 100m resolution generated by scholars such as Jeremiah J using a random forest model. The night light data is a 100m resolution calibrated value (0 ~ 6300) obtained by multiplying the standard uncalibrated DMSP light source data (0 ~ 63) by 100.
The above heterogeneous data are processed consistently (a uniform study extent, the GCS_WGS_1984 spatial coordinate system, and a grid resolution of 500m), and a unified spatial database is established. The river data are vector data, from which a distance-to-river factor is derived by calculating the Euclidean distance in ArcGIS. Road and rail vector data are merged into traffic line data. The land type data are resampled to 500m resolution, and the original land types are combined into four categories according to the population density of each land type: wetlands and water bodies are treated as unpopulated waters; forests and shrublands are combined into forest land; cultivated land and grassland are combined into cultivated (grass) land (grassland is scarce in Chongqing); and construction land remains unchanged. At the same time, each land type is extracted as a 0/1 map, which is used as the land category impact factor for population distribution.
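The land type merging and 0/1 map extraction described above can be sketched as follows. This is a minimal illustration rather than the ArcGIS workflow used in the paper; the integer class codes are assumed to follow the GlobeLand30 legend, and the names `MERGED_CLASS`, `reclassify` and `binary_maps` are hypothetical.

```python
# Assumed GlobeLand30 class codes mapped to the four merged classes in the text.
MERGED_CLASS = {
    10: "cultivated_grass",  # cultivated land
    30: "cultivated_grass",  # grassland
    20: "forest",            # forest
    40: "forest",            # shrubland
    50: "water",             # wetland
    60: "water",             # water bodies
    80: "construction",      # artificial surfaces (construction land)
}
CLASSES = ("water", "forest", "cultivated_grass", "construction")


def reclassify(grid):
    """Map raw class codes to merged class names; unknown codes become None."""
    return [[MERGED_CLASS.get(code) for code in row] for row in grid]


def binary_maps(merged):
    """Return one 0/1 grid per merged class."""
    return {
        name: [[1 if cell == name else 0 for cell in row] for row in merged]
        for name in CLASSES
    }


# Toy 2 x 3 land cover grid (made-up values).
raw = [[10, 20, 80],
       [60, 30, 40]]
maps = binary_maps(reclassify(raw))
print(maps["forest"])  # [[0, 1, 0], [0, 0, 1]]
```

Each 0/1 grid can then be supplied as one input channel alongside the other impact factors.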

Research ideas
The low-resolution population data collected in this study are census data based on districts and counties. First, in the "removing-rough" module, three rasterization models with different levels of refinement are used: area weighted average, land weighted average, and partitioned multivariate statistical regression. They achieve different levels of global feature learning and complete the 500m resolution rasterization of district and county census data. The result of "removing-rough" is recorded as the low-resolution population (LR). Then, the SRCNN model is constructed in the "getting-accuracy" module. The results of "removing-rough" and the population distribution factors are used as features, and the 500m resolution grid population product is used as the label; both are input to the SRCNN model for hybrid learning, which integrates global and local features to obtain population spatialization results with higher spatial accuracy. Finally, the mean square error (MSE), root mean square error (RMSE), and spatial residuals are used to verify and analyze the spatialization results of the various coupling models.
In order to meet the requirements of SRCNN for the spatial position of the input data, all raster data are converted into ASCII matrices that maintain the spatial position relationship and are then input into SRCNN. In order to allow SRCNN to accurately learn the characteristics of Chongqing's borders, 127 districts and counties covering the matrix extent of Chongqing were processed, and the results were clipped to the administrative scope of Chongqing (40 districts and counties) during model checking for accuracy verification.

Area Weighted Average Population Rasterization (AWA)
The area weighted average (AWA) method distributes the statistical population evenly within the district and county administrative units (here, the unit of population density is person/hm^2). It is calculated as follows:

$$LR_{ki} = \frac{P_k}{S_k} \qquad (1)$$

Here, $LR_{ki}$ represents the population of the i-th grid in the k-th administrative unit, $P_k$ is the population of the k-th administrative unit, and $S_k$ is the area of the k-th administrative unit (unit: hm^2).
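As an illustration of the AWA step, the sketch below assigns every grid cell the density P_k / S_k of the administrative unit it falls in. The grid of unit identifiers, the population figures and the function name `awa_rasterize` are made-up assumptions for the example; areas are in hm^2, with one 500m cell equal to 25 hm^2.

```python
def awa_rasterize(zone_grid, population, area_hm2):
    """Give each cell the density P_k / S_k (person/hm^2) of its unit k."""
    density = {k: population[k] / area_hm2[k] for k in population}
    return [[density[k] for k in row] for row in zone_grid]


# Two hypothetical units, each covering three 500 m cells (3 * 25 hm^2).
zone_grid = [["a", "a", "b"],
             ["a", "b", "b"]]
population = {"a": 3000.0, "b": 600.0}   # persons (made-up)
area_hm2 = {"a": 75.0, "b": 75.0}

lr = awa_rasterize(zone_grid, population, area_hm2)
print(lr)  # [[40.0, 40.0, 8.0], [40.0, 8.0, 8.0]]
```

Multiplying each density by the cell area (25 hm^2) and summing over a unit returns that unit's statistical population, which is the property AWA is meant to preserve.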

Land weighted average population rasterization (LWA)
Land weighted average (LWA) distributes the urban/rural population evenly over the urban/rural land within a statistical unit, in order to extract the global features of the urban/rural population distribution in that unit. The 2010 urban/rural population data were collected from the statistical yearbook, and the urban/rural demographic data were mapped to 500m resolution raster data based on the type of land use to complete the LWA. In order to correspond to the urban/rural population, the land types are roughly divided into three major categories: urban residential land, agricultural land (rural residential land, cultivated land, forest, shrubbery and grassland), and water areas (water body, wetland). According to the proportion of urban and rural population in each district and county, weights are given to urban residential land and agricultural land, and the weight of the water area is directly set to 0. The land weighted average is calculated as follows:

$$LR_{khi} = \frac{P_{kh}}{S_{kh}} \qquad (2)$$

Here, $LR_{khi}$ represents the population on the i-th grid of the h-type land in the k-th county (unit: person/hm^2). $P_{kh}$ is the total population of the h-type land in the k-th county, that is, the urban (or rural) population of the k-th county (unit: person). $S_{kh}$ represents the total area of the h-type land in the k-th county (unit: hm^2).
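A minimal sketch of the LWA calculation follows. It spreads P_kh evenly over the cells of land class h in county k and gives water a value of 0. The grids, population figures and the name `lwa_rasterize` are hypothetical, and the cell area of 25 hm^2 assumes a 500m grid.

```python
CELL_HM2 = 25.0  # area of one 500 m x 500 m cell in hm^2


def lwa_rasterize(zone_grid, land_grid, population):
    """Spread P_kh evenly over the cells of land class h in county k."""
    # Count the cells of every (county, land class) pair to obtain S_kh.
    counts = {}
    for zrow, lrow in zip(zone_grid, land_grid):
        for k, h in zip(zrow, lrow):
            counts[(k, h)] = counts.get((k, h), 0) + 1
    out = []
    for zrow, lrow in zip(zone_grid, land_grid):
        row = []
        for k, h in zip(zrow, lrow):
            p_kh = population.get((k, h), 0.0)  # no entry (e.g. water) -> 0
            s_kh = counts[(k, h)] * CELL_HM2
            row.append(p_kh / s_kh)
        out.append(row)
    return out


# One hypothetical county with one urban, four rural and one water cell.
zone_grid = [["a", "a", "a"],
             ["a", "a", "a"]]
land_grid = [["urban", "rural", "water"],
             ["rural", "rural", "rural"]]
population = {("a", "urban"): 5000.0, ("a", "rural"): 1000.0}

lr = lwa_rasterize(zone_grid, land_grid, population)
print(lr)  # [[200.0, 10.0, 0.0], [10.0, 10.0, 10.0]]
```

The example also shows how a large urban population on a small urban land area produces a very high cell value.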

Partition multivariate statistical regression Population Rasterization (PMSR)
The partition multivariate statistical regression (PMSR) model groups statistical units with similar population distribution characteristics into a partition, and establishes a multivariate statistical regression model of population and district/county area under each partition to convert statistical population to grid population. The PMSR method can be used to obtain multi-level, refined, and global population distribution characteristics from partitions to counties to land types, which better reflects the consistency and difference of population distribution in the region. PMSR is achieved through three main steps: population characteristics partition, land type grading, and multiple statistical regression.
Step 1: population characteristics partition. By analyzing the correlation among population density, various natural and socio-economic factors and the proportion of the land types area, a total of 10 indicators are used to achieve population characteristics partition.
First of all, a total of 127 districts and counties in Chongqing and its surroundings were partitioned by correlation analysis. Scatter plots were used to analyze the correlation between the population density index and the other indicators, and the study area was roughly divided into two categories. Six districts and counties with a population density higher than 1500 people/km^2 are classified as high population density areas, and those with a population density lower than 1500 people/km^2 are classified as low population density areas. In the high population density areas, population density is negatively correlated with elevation and with the proportion of cultivated land area, and the population is mostly distributed on construction land. The population density of low-density districts and counties is negatively correlated with elevation, slope, and woodland. Based on this, 1500 people/km^2 was used as the dividing point for the first zoning of population characteristics. Chongqing's Yuzhong District, Jiangbei District, Nan'an District, Shapingba District, Dadukou District, and Jiulongpo District are regarded as high population density areas (Partition 1), and the remaining 121 districts and counties are regarded as low population density areas.
Then, in the low population density area, correlation thresholds are used to screen out factors highly correlated with population density, and K-means clustering analysis is performed so that population distribution characteristics within the same partition are similar. Calculating the Pearson correlation between population density and the various indicators showed that, at a significance level of 0.01, the correlation coefficients between population density and elevation, slope, night light intensity, residential area ratio, cultivated (grass) land area ratio, construction land area ratio, and forest land area ratio are all greater than 0.6, indicating a high degree of correlation. These 7 impact factors were selected, and the low-density areas (121 districts) were further divided into 5 partitions using the K-means cluster analysis tool in SPSS. Finally, a total of 127 districts and counties in Chongqing and surrounding provinces were divided into 6 large partitions.
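The correlation screening can be illustrated with a short sketch. It computes the Pearson coefficient between population density and each candidate factor and keeps those above a 0.6 threshold. Comparing the absolute value of the coefficient is an assumption made here, since some factors (such as elevation) are negatively correlated; the significance test and the K-means step performed in SPSS are not reproduced, and all data values and function names are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def screen_factors(density, factors, threshold=0.6):
    """Keep the factors whose |r| with population density exceeds the threshold."""
    return [name for name, values in factors.items()
            if abs(pearson(density, values)) > threshold]


# Five hypothetical counties.
density = [100.0, 200.0, 300.0, 400.0, 500.0]      # people/km^2
factors = {
    "elevation": [900.0, 700.0, 650.0, 400.0, 300.0],
    "night_light": [5.0, 9.0, 14.0, 22.0, 25.0],
    "river_distance": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = screen_factors(density, factors)
print(selected)  # ['elevation', 'night_light']
```

The selected factors would then form the feature vector of each county for the clustering step.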
Step 2: Land grading in the population characteristics partition. Because night light data can reflect where people live to a certain extent, it has been used as an important factor affecting population distribution in previous studies. This article uses night light data to further subdivide the construction land and cultivated (grass) land areas by population density. In order to avoid excessive classification and extreme values that interfere with the determination of the overall threshold, the construction land and the cultivated (grass) land in each partition are each further subdivided into two grades according to the natural discontinuity point, and the large area of cultivated (grass) land with a night light value of 0 is assigned to a separate grade. Finally, the subdivided land types are: first-grade artificial land (relatively low light intensity, low population density), second-grade artificial land (high light intensity, high population density), first-grade cultivated (grass) land (light intensity of 0, very low population density), second-grade cultivated (grass) land (low light intensity, lower population density), and third-grade cultivated (grass) land (higher light intensity, low population density) (Fig. 3.d).
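The grading of cultivated (grass) land can be sketched as below, under the assumption that the "natural discontinuity point" refers to a natural breaks (Jenks-type) classification. For two classes, the break that minimises the within-class sum of squared deviations can be found by trying every split of the sorted values. The light values and the names `two_class_break` and `grade_cultivated` are hypothetical.

```python
def two_class_break(values):
    """Return the upper bound of the lower class for the two-class split
    that minimises the within-class sum of squared deviations."""
    xs = sorted(values)

    def ssd(part):
        mean = sum(part) / len(part)
        return sum((v - mean) ** 2 for v in part)

    best = min(range(1, len(xs)), key=lambda i: ssd(xs[:i]) + ssd(xs[i:]))
    return xs[best - 1]


def grade_cultivated(light, brk):
    """Grade 1: no light; grade 2: light up to the break; grade 3: above it."""
    if light == 0:
        return 1
    return 2 if light <= brk else 3


# Calibrated night light values (0 ~ 6300) on cultivated (grass) cells, made up.
lights = [0, 0, 0, 120, 150, 180, 900, 1100, 1300]
brk = two_class_break([v for v in lights if v > 0])
grades = [grade_cultivated(v, brk) for v in lights]
print(brk, grades)  # 180 [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

Cells with a light value of 0 are set aside before the break is computed, matching the separate grade described in the text.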
Step 3: Multivariate statistical regression by partition. The subdivided land type areas (hm^2) of each district and county are calculated and used as the independent variables of the multivariate statistical regression under each partition, and the statistical population of each district and county is used as the dependent variable. A multivariate statistical regression model is constructed under each partition to obtain the population distribution coefficient of each subdivided land type, and the rasterized population is produced by multivariate statistical regression after correction.
The principle of multivariate statistical regression for land classification is the assumption that, within the study area, the population within the same land type is uniformly distributed. With the areas of the different land use types as independent variables and the demographic data as dependent variables, a multivariate statistical regression model is established to obtain the population distribution coefficient of each land type, which is used to simulate the county-level population distribution. The general form of the model is as follows:

$$P_{jk} = \sum_{h} \beta_{jh} S_{jkh} \qquad (3)$$

Here, $P_{jk}$ is the statistical population of the k-th county under the j-th partition, $\beta_{jh}$ is the population distribution coefficient (person/hm^2) of the h-th land category under the j-th partition, and $S_{jkh}$ represents the area (hm^2) of the h-th land class in the k-th county under the j-th partition. According to the principle of "no land and no population", the constant term is 0. The area of each subdivided land class is put into the model as an independent variable. Since the water area has no population, its coefficient is set to 0. Since the total population of each district and county and the area of each category are known, the population coefficient of each subdivided land type under the j-th partition can be calculated through multiple statistical regression (table 1). The population coefficient is substituted into equation (3) to achieve a rasterized population distribution of multiple statistical regression.
Because the model assumes a uniform distribution of the population within the same land type, the population obtained by the model will deviate from the actual population. In order to ensure that the simulated population is consistent with the actual statistical population in each district and county, the population coefficient of each district and county is corrected by the following equation:

$$\beta_{kh} = \beta_{h} \cdot \frac{P_k}{P_k'} \qquad (4)$$

In the equation, $\beta_{kh}$ is the corrected population coefficient of the h-th land category in the k-th county; $\beta_{h}$ is the population coefficient calculated in equation (3) (that is, $\beta_{jh}$ with the partition index j omitted); $P_k$ is the statistical population of the k-th county, and $P_k'$ is the total population of the k-th county calculated by the model. After the corrected population distribution coefficient of each district and county is obtained (Table 1), it is substituted into equation (4) to obtain the final regional multivariate statistical regression rasterized population data.
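The regression without a constant term and the subsequent correction can be sketched in a few lines. The example solves the least-squares normal equations for the coefficients of equation (3) and then rescales them per county as in equation (4). It is an illustrative sketch with two land types and made-up numbers, not the statistical software used in the paper; `solve`, `fit_coefficients` and `corrected_coefficients` are hypothetical names.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_coefficients(areas, pops):
    """Least squares with no intercept: P_jk = sum_h beta_jh * S_jkh."""
    h = len(areas[0])
    xtx = [[sum(r[i] * r[j] for r in areas) for j in range(h)] for i in range(h)]
    xty = [sum(r[i] * p for r, p in zip(areas, pops)) for i in range(h)]
    return solve(xtx, xty)


def corrected_coefficients(beta, county_areas, county_pop):
    """Equation (4): rescale beta so the modelled county total equals P_k."""
    modelled = sum(b * s for b, s in zip(beta, county_areas))  # P_k'
    return [b * county_pop / modelled for b in beta]


# Three hypothetical counties in one partition; columns are land type areas (hm^2).
areas = [[100.0, 1000.0], [200.0, 500.0], [50.0, 2000.0]]
pops = [7200.0, 10900.0, 6500.0]

beta = fit_coefficients(areas, pops)
beta_k = corrected_coefficients(beta, areas[0], pops[0])
total = sum(b * s for b, s in zip(beta_k, areas[0]))
print(round(total, 6))  # 7200.0
```

After correction, the modelled population of the first county reproduces its statistical population, which is the purpose of equation (4).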

Super-resolution convolutional neural network (SRCNN) model
Single image super-resolution (SR) is a classic problem of computer vision, and convolutional neural networks (CNN) play a powerful role in image classification. In 2014, C Dong and others proposed the super-resolution convolutional neural network (SRCNN) model to learn the mapping from low-resolution to high-resolution images, addressing the problem of image super-resolution in computer vision [30][31]. First, the low-resolution image $X_{LR}$ is interpolated to the target resolution using bicubic interpolation, and the result is denoted as X. Then, the task of SRCNN is to make the image F(X) obtained after passing through the three-layer convolutional neural network as similar as possible to the real target image Y. That is, the mapping function F(X) is optimized by minimizing the objective function (5):

$$L(\Theta) = \frac{1}{n} \sum_{i=1}^{n} \left\| F(X_i; \Theta) - Y_i \right\|^2 \qquad (5)$$

where $\Theta$ denotes the parameters of the convolutional neural network and n is the number of training samples.
The mapping from X to F(X) includes three steps, each corresponding to a convolution operation.
Patch extraction and representation. Patches are sliced from X with a fixed stride (that is, overlapping patches are extracted), and each patch is passed through the first convolution layer, which represents it as a high-dimensional vector. The function is represented as follows:

$$F_1(X) = \max(0, W_1 * X + B_1) \qquad (6)$$

Here, a patch of size 33×33 is used, $W_1$ is 64 convolution kernels of 9×9×c, 9×9 is the size of each convolution kernel, and c is the number of channels. $B_1$ represents a set of 64-dimensional offsets. Using the ReLU activation function, each 33×33 patch obtains a high-dimensional (64-dimensional) representation.
Non-linear mapping. Each of the 64-dimensional vectors in function (6) is non-linearly mapped to a 32-dimensional vector, so the convolution kernel size used here is 1×1. The function is represented as follows:

$$F_2(X) = \max(0, W_2 * F_1(X) + B_2) \qquad (7)$$

Here, $W_2$ is the 32 convolution kernels of 1×1×64 in the second layer, and $B_2$ is a set of 32-dimensional offsets. This operation aggregates the 64-dimensional features in equation (6) so that each pixel is mapped nonlinearly to another high-dimensional (32-dimensional) vector.
Reconstruction. This is equivalent to a deconvolution process. The task of this layer is to aggregate the 32-dimensional features in equation (7) to form the high-resolution result, which is F(X):

$$F(X) = W_3 * F_2(X) + B_3 \qquad (8)$$

where $W_3$ is one convolution kernel of size 5×5×32, and $B_3$ is a 1-dimensional offset. The output of this layer is a result close to the target.
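To make the layer dimensions concrete, the sketch below counts the trainable parameters of the three layers and the output size for one 33×33 patch. It assumes unpadded ("valid") convolutions, which the text does not state, and the input channel counts used in the example are hypothetical; `LAYERS` and `srcnn_summary` are illustrative names.

```python
# (kernel size, output channels) of the three layers described in the text.
LAYERS = [
    (9, 64),  # patch extraction and representation
    (1, 32),  # non-linear mapping
    (5, 1),   # reconstruction
]


def srcnn_summary(in_channels, patch=33):
    """Return (parameter count, output size) assuming unpadded convolutions."""
    params, size, c_in = 0, patch, in_channels
    for kernel, c_out in LAYERS:
        params += kernel * kernel * c_in * c_out + c_out  # weights + biases
        size = size - kernel + 1                          # 'valid' convolution
        c_in = c_out
    return params, size


print(srcnn_summary(1))  # (8129, 21)
print(srcnn_summary(7))  # (39233, 21)
```

Only the first layer depends on the number of input channels c, so adding auxiliary factors as channels increases the parameter count by 9 × 9 × 64 per extra channel.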
To account for the effect of the influencing factors on the population distribution, the experiment feeds the grid population and the different influencing factors into the SRCNN as separate channels. The number of channels depends on the amount of auxiliary data. Through patch extraction and representation, nonlinear mapping and reconstruction, the output of the model is obtained. The task of population spatialization is to make the model output as similar as possible to the actual population data. Therefore, stochastic gradient descent with standard backpropagation is used to minimize the mean square error between the model output and the actual population distribution to obtain high-resolution population distribution results. Figure 4 shows the implementation steps of the SRCNN model.
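The channel stacking and overlapping patch extraction can be sketched as follows. Each input grid (the LR population and every auxiliary factor) is one channel, and square patches are cut from the stack with a fixed stride. The text does not give the stride, so the default of 14 here is only an assumption, and the function name and toy grids are hypothetical.

```python
def extract_patches(channels, size=33, stride=14):
    """Cut a [C][H][W] stack of grids into [C][size][size] patches."""
    height, width = len(channels[0]), len(channels[0][0])
    patches = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            patches.append([
                [row[left:left + size] for row in band[top:top + size]]
                for band in channels
            ])
    return patches


# Toy example: a 4 x 4 LR population grid plus one auxiliary factor channel.
lr = [[r * 4 + c for c in range(4)] for r in range(4)]
light = [[0] * 4 for _ in range(4)]
patches = extract_patches([lr, light], size=2, stride=2)
print(len(patches))   # 4
print(patches[0])     # [[[0, 1], [4, 5]], [[0, 0], [0, 0]]]
```

A stride smaller than the patch size produces overlapping patches; in the toy call the stride equals the patch size only to keep the printed output short.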

Model checking
In this paper, the output results of each coupling model are clipped to the scope of Chongqing, and the model errors are quantified from two perspectives, numerical and spatial. Numerically, this article uses the mean square error (MSE) and root mean square error (RMSE) to measure the overall error of the model output. Spatially, the residual between the label and the simulation result is used to visualize the spatial error. A negative value indicates that the simulation overestimates the population, and a positive value indicates that the simulation underestimates it.
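The three checks can be written compactly. The sketch below computes the MSE, the RMSE and the residual grid, taking the residual as label minus simulation so that negative values mark overestimation, in line with the sign convention above. The grids are made-up values and `evaluate` is a hypothetical name.

```python
from math import sqrt


def evaluate(label, simulated):
    """Return (MSE, RMSE, residual grid) with residual = label - simulated."""
    residual = [[y - f for y, f in zip(yrow, frow)]
                for yrow, frow in zip(label, simulated)]
    flat = [v for row in residual for v in row]
    mse = sum(v * v for v in flat) / len(flat)
    return mse, sqrt(mse), residual


label = [[10.0, 0.0],
         [4.0, 2.0]]
simulated = [[8.0, 1.0],
             [4.0, 5.0]]

mse, rmse, residual = evaluate(label, simulated)
print(mse)       # 3.5
print(residual)  # [[2.0, -1.0], [0.0, -3.0]]
```

In the example, the cell with residual 2.0 is underestimated and the cells with residuals -1.0 and -3.0 are overestimated.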

Experimental scheme
Based on the three population rasterization models, area weighted average (AWA), land weighted average (LWA), and partition multiple statistical regression (PMSR), this paper couples each with SRCNN and forms five combinations to achieve population spatialization. The five combinations are:
Scheme 1: AWA+SRCNN. The AWA rasterization distributes the population evenly by district area, reflecting the general characteristics of population distribution between districts and counties. The AWA+SRCNN model combines district-level global features with local features.
Scheme 2: LWA+SRCNN. The LWA rasterization distributes the urban and rural population evenly over the urban and rural land of each district and county, which reflects both the general characteristics of the population distribution between districts and counties and the distribution characteristics of the urban and rural population within them. Therefore, the LWA+SRCNN model integrates urban and rural global features in districts and counties with local features.
Scheme 3: PMSR+SRCNN. The PMSR population rasterization first partitions the study area, and then, under each partition, calculates the population distribution coefficient for the county-level population according to 7 land types. PMSR not only extracts demographic features at a more macro-regional level, but also reclassifies construction land and cultivated (grass) land at the county level, and so grasps the overall characteristics of population distribution more comprehensively from top to bottom.
Scheme 4: AWA+LWA+SRCNN. Following scheme 1 and scheme 2, AWA+LWA combines the characteristics of population distribution between counties with the distribution of urban and rural population within the county.
Scheme 5: AWA+LWA+PMSR+SRCNN. Schemes 1, 2 and 3 are combined, providing three different levels of global characteristics, from partition to county to subdivided land type.

Results analysis
The five schemes use the same impact factors, and after 100 training iterations the population spatialization results shown in figure 5 are obtained. The results show that, under the combined action of various natural and social factors, after the SRCNN "getting-accuracy" step, the five schemes all show good simulation results. Because the southeast and northeast of Chongqing are mountainous and the central and western regions are relatively low-lying, the population distribution is affected by the elevation gradient. Generally, the population is denser in the lower-lying central and western regions. Affected by the type of land use, densely populated areas are concentrated on urban land, followed by rural land, and little population is distributed on forest land.

Precision inspection
This paper uses the mean square error (MSE) and the root mean square error (RMSE) to test the model error numerically. The PMSR+SRCNN coupled model (scheme 3) has the lowest MSE and RMSE of the five schemes. In the experiment, LWA rasterization easily produces an excessively high population density value in a district or county with a large population but a small area of urban or rural land. Because SRCNN has requirements for data quality, scheme 2 obtained the worst result, and the overall accuracy of the schemes that include the land weighted average rasterized population was also lowered.
This paper uses the spatial residual distribution to test the spatial error of the model (figure 6). In the residual distribution diagram, a positive (red) residual value represents the model's underestimation of the population, and a negative (blue) residual value represents the model's overestimation of the population. In general, SRCNN has a good ability to learn downscaling, and the residual values of the five schemes are mostly less than one person. However, in areas with less population or in transition zones between sparsely and densely populated areas, an overestimation of (1, 10) people is present, and in areas with more concentrated population, an underestimation of (1, 10) people is present. Near the most densely populated central urban areas (such as Yuzhong District), errors of more than 50 people occur. Of all the schemes, scheme 2 (LWA+SRCNN) shows more overestimation in the less populated areas of southeast Chongqing. The residual range of the PMSR+SRCNN coupling model is (-100, 107). Compared with the other four schemes, the residual value of the PMSR+SRCNN coupling model is the smallest. However, because schemes 4 and 5 add different levels of global population distribution characteristics, their simulation results are closer to the real population distribution in spatial terms. This is mainly seen in the northeast, in the southeast, and near water bodies with less population, where the spatial residuals obtained by schemes 4 and 5 are lower.
In this study, the partition sampling method was used to examine the distribution of spatial residuals in the different population characteristics partitions (figure 6), and the MSE of the four sampling regions A, B, C and D (table 2) was calculated to evaluate the performance of each coupled model. Sampling area A is located in the first and second characteristics partitions (the most densely populated areas), area B belongs to the third characteristics partition (more populated area), area C belongs to the fourth characteristics partition (sparsely populated area), and area D belongs to the fifth characteristics partition (the most sparsely populated area). According to table 3, scheme 3 has the smallest error in the most densely populated area A, while schemes 4 and 5 have smaller errors in areas B, C, and D. This shows that the more refined global features extracted by the partitioned multivariate statistical regression rasterization model improve the simulation effect of SRCNN in densely populated areas. By using global population distribution characteristics extracted at different levels, SRCNN achieves a better simulation effect in areas of medium population density. However, in areas with low population density, the simulation effect of SRCNN is relatively poor. In all 5 schemes, the densely populated area A has the largest error of the four areas. The overestimation in the urban center may be caused by the excessively high nighttime light values in the prosperous urban area, while the underestimation occurs mainly near the river in the main urban area, where SRCNN failed to learn the sudden changes in population.

Conclusion and discussion
In this paper, the regional rasterization "removing-rough" and the SRCNN "getting-accuracy" are used to construct a population spatialization model that achieves integrated learning of the global and local characteristics of population distribution in the study area. Five experimental schemes are formed by coupling different levels of partition rasterization methods with SRCNN, and the results of each scheme are verified for accuracy and compared. The main conclusions are as follows:
(1) Under the combined effects of natural environmental factors and socio-economic factors, SRCNN has a good ability to learn downscaling. However, the results of schemes 1 and 2 show that the 9 × 9 convolution kernel used by SRCNN extracts very local features, which are affected by complex topography; this causes overestimation in some areas with a relatively small population and fails to grasp the overall characteristics of population distribution. Therefore, SRCNN cannot learn the global characteristics of population spatial distribution well.
(2) The mean square error shows that the PMSR+SRCNN coupled model has the highest simulation accuracy and the smallest spatial error. The maximum error values are concentrated in the areas with the highest population density, where PMSR+SRCNN reduces the simulation error.
(3) Schemes 4 and 5 show lower spatial errors in areas with relatively small populations. The mean square error of schemes 4 and 5 is higher than that of scheme 3, but as global features at different levels are added, schemes 4 and 5 perform better than scheme 3 in sparsely populated areas. Overall, however, scheme 3 is the best, because PMSR has already learned the features of AWA and LWA through its feature partitioning, integrating the common advantages of schemes 1 and 2. Although scheme 5 repeatedly adds global and local features, its learning effect is not better than that of the PMSR+SRCNN coupled model. The study shows that, by combining global and local features, a "removing-rough and getting-accuracy" population spatialization model can capture the regional, heterogeneous global characteristics of population spatial distribution in the "removing-rough" module and also learn the local characteristics of the population spatial distribution in the "getting-accuracy" module. The PMSR+SRCNN coupling makes up, to some extent, for a single model's lack of either global or local feature learning.
(4) Because land weighted averaging produces excessively high values in a small part of the rasterized population, and SRCNN places high accuracy requirements on its input data, the LWA+SRCNN model obtained poor simulation accuracy. The comparison of multiple schemes shows that the quality of the input data has a large impact on the simulation results of the SRCNN model.
The "removing-rough and getting-accuracy" coupling model shows SRCNN's ability to learn multi-level global population distribution characteristics, but the model error is higher in some regions of abrupt change (such as the main urban area of Chongqing where the river passes). This may be because the 9×9 convolution kernel in SRCNN is too small to fully learn the abrupt texture. The high error of the model in densely populated areas has a lot to do with the city's night lights. The night light values of Hongyadong, Jiefangbei and other scenic spots in Chongqing are very high, but they do not reflect the resident population, which causes the model to overestimate in densely populated areas. This also shows that although night light has advantages for population spatialization at a wide-area scale, it has this defect in fine-scale population spatialization. In the future, more sophisticated population spatialization research can be carried out with the help of other auxiliary data, such as urban functional zoning or mobile phone signaling data. How to reasonably partition the study area based on an analysis of the natural and economic conditions that affect the spatial distribution of the population, and how to construct the correlation between these partitions and the population so as to improve the quality of the SRCNN input data, are questions that need to be further explored in the future.