Identification of Urban Rainstorm Waterlogging Based on Multi-source Information Fusion: A Case Study in Futian District, Shenzhen

Flood disasters have become one of the most threatening natural disasters in the world, and waterlogging is their most common form in highly urbanized megacities. The formation of a flood disaster depends on many factors and involves information from multiple sources, which makes it difficult to predict. This paper integrates multi-source data, classifies the study area into different categories according to hydrological analysis results, and combines hydrodynamic theory with ArcGIS to quantitatively predict the range and depth of waterlogging under different rainfall inputs. The results provide the government with accurate and timely information on waterlogging risks and locations, improving the promptness of emergency management measures such as evacuation and traffic management.


Background
With the fast pace of urbanization, cities have been expanding rapidly, with rising population density and land development rates. Flood disasters have become one of the most threatening natural disasters in the world given their high frequency, wide coverage and significant destructiveness. As a typical type of flood disaster, urban waterlogging, commonly caused by heavy rain, has become an important factor that seriously threatens urban safety. In developed cities, rainfall accumulates in the large number of low-lying areas formed in the process of urbanization. Insufficient drainage capacity exacerbates the risk of waterlogging [1].
During extreme rainstorm or typhoon events, the rainfall intensity far exceeds the design drainage capacity and surface water cannot be discharged in time, resulting in surface runoff and stagnant water [2]. Identifying urban waterlogging and inundation risk can help predict the distribution range, depth and flow rate of accumulated water under different rainstorm intensities. In addition, visualization technologies can be applied to generate rainstorm risk maps that support decision-making on urban waterlogging under extreme rainstorms [3]. Early warning information should therefore be issued in advance to citizens in vulnerable areas, and it can help urban decision-makers formulate targeted traffic control measures and emergency control plans.
For megacities with large populations, numerous buildings, and complex municipal engineering systems, especially coastal ones that are directly exposed to typhoon disasters, the risk of urban waterlogging is much higher [4], yet little research has focused on such scenarios. The current research combines SWMM, MIKE FLOOD and other models to simulate an area with detailed drainage information and analyze the risk of urban waterlogging. The Futian District of Shenzhen, located in southern China, is selected as a typical highly urbanized area in a global megacity. Waterlogging risk analyses are conducted for it under multiple rainfall input scenarios, and practical policy implications are suggested.

Literature review
Statistical methods with advanced algorithms have been applied to identify vulnerable areas of floods. Chau

ICESD 2021
Several scholars have also used hydrodynamic models to simulate flood routing. Samela et al. applied the idea of river hydrology to flood routing simulation in 2015. Chen developed a two-dimensional coupled model based on SWMM and SWM to analyze rainstorm and waterlogging in 2019 [8]. However, hydrodynamic models require highly precise data, among which detailed urban underlying surface data are generally necessary. Although relatively fine simulation results can be obtained, extensive large-scale computation is usually required. Such models are therefore difficult to apply to large-scale urban flood risk identification.
In recent years, many scholars have used machine learning methods to identify inundation areas based on historical inundation data. Lamovec et al. used machine learning methods such as decision tree (DT) and random forest (RF) to detect flood-prone areas in 2013 [9]. Tehrany et al. tested the computational efficiency of DT in this application. Some scholars have also applied integrated algorithms such as the adaptive-network-based fuzzy inference system (ANFIS) and genetic-algorithm-based artificial neural networks to identify waterlogging risk areas [10]. Ke et al. applied machine learning approaches to simulate urban flooding in Shenzhen [2].
With improvements in the accuracy of remote sensing images and digital elevation models (DEM), submergence information extraction models based on remote sensing images have been used to identify submergence risk. This method compares remote sensing images taken before and after a flood disaster in a given area to directly extract the submerged range and the water depth. Combined with meteorological data, it can provide a reference for risk identification and early warning. However, because it does not simulate the flood process and remote sensing satellite resources are scarce, it is difficult to apply widely in the short term [11].
Because the formation of waterlogging is complicated, the above studies have considered many factors that may lead to urban waterlogging. The factors generally considered include DEM, aspect and slope, rainfall intensity, land cover type and drainage capacity. However, some of these studies remain at the stage of qualitative risk analysis and can only assess the waterlogging risk of certain areas with only some of the crucial factors considered [11]. A fusion of multi-source information that integrates meteorological, geographic, and municipal engineering information is therefore greatly needed. Combining statistics with hydrodynamic models makes it possible to quantify details of the process and to identify the coverage and depth of waterlogged areas more precisely, which helps the government judge the extent of disaster damage, classify risk levels and issue early warnings.

Study area and data
The study area is Futian District, the central district of Shenzhen in Guangdong Province, China. The total area of Futian District is 78.8 square kilometers. The terrain is mainly composed of plains, hills, mountains, and beaches, with mountains in the north and the sea in the south. Located south of the Tropic of Cancer, Futian District has a subtropical maritime climate with abundant rainfall, averaging 1,866 mm per year. The rainfall is mainly concentrated in June and August. In August, typhoons and extreme heavy rain are frequent. Futian District has a high level of urban development and high economic density. As of 2020, the permanent population exceeds 1.66 million. Although the district accounts for only 4% of the city's total area, its GDP (454.6 billion Yuan) accounts for 16.9% of the city's GDP (2692.7 billion Yuan). Table 1 shows the factors related to the formation of urban waterlogging disasters considered in this study, including the data sets corresponding to each factor and the scale and precision of each data set.

Influential factors
The formation of urban rainstorm waterlogging is related to many factors, which fall into three main categories: positive correlation factors, distribution influencing factors and negative correlation factors [12].

Positive correlation factors
Incremental factors, including rainfall intensity and rainfall duration, are positively correlated with the scale of waterlogging. The return period of a specific rainstorm intensity refers to the average interval between occurrences of a rainstorm intensity greater than or equal to this value, and it is inversely proportional to the frequency. As the rainfall intensity and rainfall duration increase, the total rainfall increases.
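The inverse relation between return period and frequency can be illustrated with a short sketch. It assumes that annual exceedances are independent events, which is a common simplification and not something stated in this paper; the function names are illustrative only.

```python
def annual_exceedance_probability(return_period_years: float) -> float:
    """Frequency of an event with the given return period: P = 1 / T."""
    return 1.0 / return_period_years


def prob_at_least_one_exceedance(return_period_years: float, horizon_years: int) -> float:
    """Chance of at least one exceedance within the horizon.

    Assumption (not from the paper): years are statistically independent.
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


# The three return periods used later in the rainfall model.
for T in (5, 30, 100):
    print(T, annual_exceedance_probability(T), prob_at_least_one_exceedance(T, 10))
```

A longer return period therefore means a rarer but more intense design storm.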

Distribution influencing factors
Such factors are mainly related to topography, landforms and surface structures. Rainfall causes surface runoff to keep increasing, and the overflowing runoff flows horizontally from high terrain to low terrain under the influence of gravity. When the drainage system cannot discharge the continuously increasing surface runoff in time, stagnant water forms in low-lying areas. Over the course of a rainfall event, the range and depth of stagnant water generally increase continuously.

Negative correlation factors
Certain factors are negatively correlated with the formation of waterlogging, including infiltration capacity, drainage capacity, and evaporation. This paper uses multi-source data fusion technology to divide the research area into 596 irregular sub-catchments based on geographic characteristics and confluence processes. Each sub-catchment contains all the attributes of the factors related to waterlogging risk. Taking the sub-catchment as the basic calculation unit, the depth of water accumulation and the flow velocity are calculated with the Manning equation from hydrodynamics, the Horton infiltration curve and other formulas, which yields more accurate results. The analysis and visualization functions of ArcGIS are then used to draw a waterlogging risk map that includes the range and depth of stagnant water.
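The Horton infiltration curve mentioned above is not written out in this section. A minimal sketch of its standard textbook form, f(t) = fc + (f0 - fc) exp(-k t), is given here; the parameter values in the usage lines are hypothetical placeholders and are not taken from the study.

```python
import math


def horton_infiltration_rate(t_min: float, f0: float, fc: float, k: float) -> float:
    """Infiltration capacity (mm/min) at time t_min after rainfall starts.

    f0: initial infiltration capacity (mm/min)
    fc: final, steady infiltration capacity (mm/min)
    k:  decay constant (1/min)
    """
    return fc + (f0 - fc) * math.exp(-k * t_min)


def horton_cumulative_infiltration(t_min: float, f0: float, fc: float, k: float) -> float:
    """Integral of the rate from 0 to t_min, in mm."""
    return fc * t_min + (f0 - fc) / k * (1.0 - math.exp(-k * t_min))


# Hypothetical soil parameters, for illustration only.
print(horton_infiltration_rate(30.0, f0=1.5, fc=0.2, k=0.07))
print(horton_cumulative_infiltration(30.0, f0=1.5, fc=0.2, k=0.07))
```

The capacity decays from f0 toward fc as the soil saturates, which is why infiltration offsets less and less of the rainfall during a long storm.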

Rainfall model
The Futian District of Shenzhen often suffers short-term heavy rainfall brought by typhoons in August. To simulate the variety of extreme rainstorms that the area may suffer, the return period is set to 5, 30, and 100 years and the rainfall peak coefficient to 0.4, based on the rainstorm intensity Equation (1) published by the Shenzhen Meteorological Bureau in 2015. Using the Chicago rainfall process model, three simulated rainfall inputs are obtained, as shown in Fig. 1.
where q is the rainfall intensity; a is the return period of heavy rain; t is the rainfall duration; A, C, b, n are the parameters of the rainstorm intensity formula.
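Equation (1) itself is not reproduced in this text. The sketch below assumes the commonly used form q = A (1 + C lg a) / (t + b)^n, which matches the symbols defined above, and builds a Chicago (Keifer-Chu) design storm from it. The value 0.4 is interpreted as the peak position ratio r, that is, the fraction of the duration that elapses before the peak. The parameter values are hypothetical placeholders, not the published Shenzhen values, and the 120-minute duration is likewise an assumption.

```python
import math


def chicago_hyetograph(duration_min, return_period, A, C, b, n, r=0.4, dt_min=1.0):
    """Return a list of (time_min, intensity) pairs for a Chicago design storm.

    Assumed intensity formula: q = A * (1 + C * log10(a)) / (t + b) ** n.
    r is the peak position ratio (0 < r < 1).
    """
    a_eff = A * (1.0 + C * math.log10(return_period))
    t_peak = r * duration_min
    steps = int(round(duration_min / dt_min))
    series = []
    for k in range(steps + 1):
        t = k * dt_min
        # Rescaled time distance from the peak (rising limb / falling limb).
        if t <= t_peak:
            tau = (t_peak - t) / r
        else:
            tau = (t - t_peak) / (1.0 - r)
        intensity = a_eff * ((1.0 - n) * tau + b) / (tau + b) ** (n + 1.0)
        series.append((t, intensity))
    return series


# Placeholder parameters for illustration only (NOT the Shenzhen values).
PLACEHOLDER = dict(A=10.0, C=0.6, b=11.0, n=0.6)
for a in (5, 30, 100):
    storm = chicago_hyetograph(120.0, a, **PLACEHOLDER)
    peak_time, peak_intensity = max(storm, key=lambda p: p[1])
    print(a, peak_time, round(peak_intensity, 3))
```

With r = 0.4 the peak falls at 40% of the storm duration, and a larger return period scales the whole hyetograph upward without moving the peak.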

DEM with building height
The original DEM records only ground surface elevation and does not include the buildings on the ground. During flood routing, however, water flow is blocked by buildings and changes direction, which eventually affects the spatial distribution of stagnant water. The Futian District has a high density of buildings, especially high-rise buildings. To simulate the blocking effect of buildings on water flow, the vector data containing building height information is superimposed on the original DEM using the ArcGIS raster calculator. Fig. 2 shows the new DEM coupled with building height information.
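The superposition amounts to a cell-wise addition of two aligned rasters. The following plain-Python sketch illustrates the idea only; it is not the ArcGIS workflow itself. It assumes the building footprints have already been rasterized onto the DEM grid, with a height of 0 where there is no building, and the sample values are made up.

```python
def add_building_heights(dem, building_height):
    """Return a new grid: ground elevation plus building height, cell by cell."""
    if len(dem) != len(building_height) or any(
        len(d_row) != len(b_row) for d_row, b_row in zip(dem, building_height)
    ):
        raise ValueError("DEM and building-height rasters must have the same shape")
    return [
        [z + h for z, h in zip(d_row, b_row)]
        for d_row, b_row in zip(dem, building_height)
    ]


# Made-up 3 x 3 example (meters): one 30 m building in the centre cell.
dem = [[5.0, 4.5, 4.0],
       [4.8, 4.2, 3.9],
       [4.6, 4.0, 3.5]]
buildings = [[0.0, 0.0, 0.0],
             [0.0, 30.0, 0.0],
             [0.0, 0.0, 0.0]]
dem_with_buildings = add_building_heights(dem, buildings)
print(dem_with_buildings[1][1])
```

The raised centre cell now acts as an obstacle, so any flow-direction analysis run on the combined grid routes water around it.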

Analysis of water flow process
The analysis of the water flow process is mainly based on the D8 algorithm. The algorithm assumes that the water in a grid cell can flow into only one of its 8 adjacent cells (Fig. 3), and it uses the steepest slope method to select that cell (Fig. 4). It is fast to compute and reflects well the effect of topography on the formation of surface runoff. Applied to DEM data that have been filled, the D8 algorithm yields a unique flow direction value for each cell. The hydrological analysis tool of ArcGIS can calculate the flow direction of every cell in the Futian area and from it the flow accumulation (Fig. 5, Fig. 6). The flow accumulation value is the number of upstream cells whose flow converges on the target cell. It does not represent the final flow result, but it provides theoretical support for runoff movement and sub-catchment division [13].
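A compact plain-Python sketch of D8 flow direction and flow accumulation is given below for illustration. It is not the ArcGIS implementation: it assumes a small, already filled DEM held as a list of rows, treats cells with no lower neighbour as outlets, and uses made-up elevations.

```python
import math

# Row/column offsets to the 8 neighbouring cells.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem, cell_size=1.0):
    """For each cell, return the (row, col) of the steepest downslope neighbour,
    or None if no neighbour is lower (an outlet or pit)."""
    rows, cols = len(dem), len(dem[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope, best = 0.0, None
            for dr, dc in NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # Diagonal neighbours are sqrt(2) cell sizes away.
                    distance = cell_size * math.hypot(dr, dc)
                    slope = (dem[r][c] - dem[rr][cc]) / distance
                    if slope > best_slope:
                        best_slope, best = slope, (rr, cc)
            direction[r][c] = best
    return direction


def flow_accumulation(dem, direction):
    """Number of upstream cells draining into each cell."""
    rows, cols = len(dem), len(dem[0])
    acc = [[0] * cols for _ in range(rows)]
    # Visit cells from highest to lowest so donors are done before receivers.
    order = sorted(
        ((dem[r][c], r, c) for r in range(rows) for c in range(cols)), reverse=True
    )
    for _, r, c in order:
        target = direction[r][c]
        if target is not None:
            rr, cc = target
            acc[rr][cc] += acc[r][c] + 1
    return acc


# Made-up 3 x 3 DEM sloping toward the lower-right corner.
dem = [[9, 8, 7],
       [8, 6, 4],
       [7, 4, 1]]
direction = d8_flow_direction(dem)
acc = flow_accumulation(dem, direction)
for row in acc:
    print(row)
```

In this example all eight other cells drain to the lowest corner, so its accumulation value is 8. Flat areas and pits would need the filling step described above before this sketch gives meaningful results.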

Sub-catchment division
In urban-scale waterlogging prediction, the research area can be divided into small sub-catchments based on the results of hydrological analysis in order to improve the prediction accuracy and calculation efficiency of the model. When dividing the sub-catchments, we define the lower limit of the catchment area that can generate surface runoff as 40,000 square meters, which is equivalent to a square area of 200 m × 200 m. Confluence areas smaller than this lower limit are not delineated separately but are merged into other confluence areas. The basis for this choice is twofold: in reality, large-scale stagnant water rarely forms in an area smaller than 200 m × 200 m, and the limit improves the calculation efficiency of the model. In this study, the catchment area of 40,000 square meters corresponds to a flow accumulation value of 1600. Fig. 7 further classifies and links the flow accumulation. Fig. 8 shows the 596 sub-catchments.
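The link between the area threshold and the accumulation threshold is a simple ratio. The grid resolution is not stated in this section, so the cell size below is an inference that assumes square grid cells:

```latex
\frac{40\,000\ \mathrm{m^2}}{1600\ \text{cells}} = 25\ \mathrm{m^2}\ \text{per cell}
\quad\Longrightarrow\quad
\text{cell size} = \sqrt{25\ \mathrm{m^2}} = 5\ \mathrm{m}
```

Under that assumption, a cell becomes the outlet of a separate sub-catchment once at least 1600 upstream cells of 5 m × 5 m drain into it.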

Surface runoff coefficient
The Futian District is the central district of Shenzhen, with 5909 hectares of construction land, accounting for 74.9% of its area. Urban construction land is mostly covered by asphalt, concrete, and cement surfaces, which have high surface runoff coefficients and weak rainfall infiltration capacity. The runoff coefficients of woodland, grassland and wetland are small, and their capacity to retain rainfall is strong. According to the land cover type data for the Futian District (Fig. 9), the surface runoff coefficient in this area is 0.95 for construction land, 0.65 for broadleaved forest and grassland, 0.65 for coniferous forest, 0.7 for water bodies, and 0.75 for wetland and irrigated fields [14]. In view of the actual land use in Futian District, this study extracts the proportion of each land type and obtains the average runoff coefficient of each sub-catchment through weighted calculation (Fig. 10).
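The weighted calculation can be sketched as an area-weighted mean. The coefficients below are the ones quoted above; the dictionary keys and the sample sub-catchment areas are hypothetical.

```python
# Runoff coefficients quoted in the text; the key names are illustrative.
RUNOFF_COEFFICIENT = {
    "construction_land": 0.95,
    "broadleaved_forest_grassland": 0.65,
    "coniferous_forest": 0.65,
    "water_body": 0.70,
    "wetland_irrigated_field": 0.75,
}


def weighted_runoff_coefficient(area_by_land_type):
    """Area-weighted mean runoff coefficient of one sub-catchment.

    area_by_land_type maps a land type to its area (any consistent unit).
    """
    total_area = sum(area_by_land_type.values())
    if total_area <= 0:
        raise ValueError("total area must be positive")
    weighted_sum = sum(
        RUNOFF_COEFFICIENT[land_type] * area
        for land_type, area in area_by_land_type.items()
    )
    return weighted_sum / total_area


# Hypothetical sub-catchment: 30,000 m2 built-up, 10,000 m2 broadleaved forest.
example = {"construction_land": 30000.0, "broadleaved_forest_grassland": 10000.0}
print(weighted_runoff_coefficient(example))
```

A sub-catchment dominated by construction land ends up close to 0.95, while green space pulls the average down toward 0.65.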

Evaluation of drainage capacity
By analyzing the distribution of 37,845 rainwater outlets in the Futian District, a generalized evaluation model of drainage capacity was established. Almost all rainwater outlets in the Futian District are flat rectangular outlets. According to the number, shape and size of the outlets in each sub-catchment, the drainage capacity per unit time can be expressed as Equation (2). Combined with the rainfall duration, the total drainage volume of each sub-catchment can be calculated [15].
where Qd is the drainage flow (m³/s); μ is the orifice flow coefficient; s is the drainage area (m²); h is the water depth (m).
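Equation (2) is not reproduced in this text. The symbols defined above match the standard orifice discharge formula Qd = μ s sqrt(2 g h), so the sketch below assumes that form. The outlet count, opening area, flow coefficient and water depth in the usage lines are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def outlet_drainage_flow(mu: float, opening_area_m2: float, water_depth_m: float) -> float:
    """Discharge of one outlet (m^3/s), assuming Qd = mu * s * sqrt(2 * g * h)."""
    if water_depth_m <= 0.0:
        return 0.0
    return mu * opening_area_m2 * math.sqrt(2.0 * G * water_depth_m)


def subcatchment_drainage_volume(n_outlets, mu, opening_area_m2, water_depth_m, duration_s):
    """Total volume (m^3) drained by identical outlets over the rainfall duration,
    assuming a constant water depth above the outlets (a simplification)."""
    return n_outlets * outlet_drainage_flow(mu, opening_area_m2, water_depth_m) * duration_s


# Hypothetical sub-catchment: 60 outlets of 0.15 m2 each, 5 cm of water, 2 hours.
print(subcatchment_drainage_volume(60, mu=0.6, opening_area_m2=0.15,
                                   water_depth_m=0.05, duration_s=7200.0))
```

Because the discharge grows only with the square root of the water depth, doubling the ponding depth increases the drainage flow by about 41%.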

Depth of waterlogging
Different types of data from multiple information sources are merged with sub-catchments as storage units. Each sub-catchment contains DEM with building height, average slope gradient, average surface runoff coefficient, catchment area, average roughness, rainfall, drainage intensity and other index attributes [12].
Although each sub-catchment is delineated according to an independent catchment process, flood routing is an integral process. When the Futian District is treated as a single research area, the water flow connections between sub-catchments need to be considered. Urban roads are generally low-lying, rainwater drains lie on both sides of the road, and the road network is strongly connected. During flood routing, floods therefore flow and spread along urban roads in the form of surface runoff. An analysis of the 685 main roads in Shenzhen shows that the city includes 390 arterial roads, 264 first-class highways, 17 expressways, and 14 highways. Because expressways generally have high roadbeds or an elevated form, waterlogging does not occur on them easily. The average width of the 671 arterial roads, first-class highways and expressways in the studied area is 23.3 meters. Assuming that the water flow between the upstream and downstream sub-catchments is mainly connected by the lowest road and that the road is simulated as a U-shaped channel, Equations (3) to (7) can be used to determine the flow, velocity, and depth of water accumulation [16].
The surface runoff flow is calculated as:
$$Q_r = a\,q\,\varphi\,A \quad (3)$$
where Qr is the surface runoff flow (m³/s); q is the average rainfall intensity (mm/min); φ is the average runoff coefficient of the sub-catchment; A is the area of the sub-catchment (m²); a is the unit conversion factor, which is equal to 1/60000 in this case.
The surface runoff velocity, v, can be calculated as:
$$v = \frac{1}{n} R^{2/3} i^{1/2} \quad (4)$$
where v is the surface runoff velocity (m/s); R is the hydraulic radius; i is the average slope gradient of the sub-catchment area (%); n is the surface roughness.
The flow through the water cross section is:
$$Q_r = S\,C\sqrt{R\,i} \quad (5)$$
where S is the cross-sectional area of the water (m²), C is the Chézy coefficient.
The Chézy coefficient can be expressed as:
$$C = \frac{1}{n} R^{1/6} \quad (6)$$
in which the hydraulic radius, R, is commonly expressed as:
$$R = \frac{S}{X} = \frac{w\,d}{w + 2d} \quad (7)$$
where w is the width of the water cross section (m), that is, the average width of the road; X is the wetted perimeter of the water cross section (m); d is the depth of waterlogging (m). Waterlogging generally appears at the boundary of the sub-catchment area, because these areas are the lowest points of the terrain and have large flow accumulation values.
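The chain from rainfall to waterlogging depth can be sketched numerically. The code below assumes the rational form Qr = a q φ A with a = 1/60000, the Manning velocity v = R^(2/3) i^(1/2) / n, and a rectangular (U-shaped) road cross section with R = w d / (w + 2d). It solves for the depth d by bisection, which is a choice made for this sketch and not necessarily the authors' procedure. The slope is entered as a dimensionless fraction rather than a percentage, and all input values except the 23.3 m road width are hypothetical.

```python
import math


def rational_runoff(q_mm_per_min: float, phi: float, area_m2: float) -> float:
    """Surface runoff flow (m^3/s): Qr = a * q * phi * A with a = 1/60000."""
    return q_mm_per_min * phi * area_m2 / 60000.0


def channel_flow(depth_m: float, width_m: float, slope: float, roughness: float) -> float:
    """Manning flow (m^3/s) through a rectangular channel at the given depth."""
    area = width_m * depth_m                     # S = w * d
    wetted_perimeter = width_m + 2.0 * depth_m   # X = w + 2d
    radius = area / wetted_perimeter             # R = S / X
    velocity = radius ** (2.0 / 3.0) * math.sqrt(slope) / roughness
    return area * velocity


def waterlogging_depth(flow_m3s, width_m, slope, roughness, d_max=10.0, tol=1e-6):
    """Depth d (m) at which the channel carries flow_m3s, found by bisection."""
    if flow_m3s <= 0.0:
        return 0.0
    if channel_flow(d_max, width_m, slope, roughness) < flow_m3s:
        raise ValueError("flow exceeds channel capacity at d_max")
    lo, hi = 0.0, d_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # channel_flow increases with depth, so bisection converges.
        if channel_flow(mid, width_m, slope, roughness) < flow_m3s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical sub-catchment; only the 23.3 m road width comes from the text.
Q = rational_runoff(q_mm_per_min=1.5, phi=0.9, area_m2=200000.0)
depth = waterlogging_depth(Q, width_m=23.3, slope=0.01, roughness=0.015)
print(round(Q, 3), round(depth, 3))
```

Bisection is used because the depth appears on both sides of the Manning relation through the hydraulic radius, so there is no closed-form solution for d.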

Results and discussions
After the range and depth of waterlogging under each rainfall input scenario are calculated, the results can be superimposed on the satellite image for display through ArcGIS. Fig. 11 shows the spatial distribution of waterlogging and its depth; darker colors represent deeper waterlogging. The depth of waterlogging (Table 2) ranges from 0 to 0.189 meters for the 5-year return period, from 0 to 0.225 meters for the 30-year return period, and from 0 to 0.246 meters for the 100-year return period. Comparing the three rainfall models, the area and depth of waterlogging, and hence the risk of waterlogging, increase with rainfall intensity. There are 159, 175 and 186 sub-catchments with waterlogging for the 5-, 30- and 100-year return periods, with average waterlogging depths of 6.94 cm, 7.78 cm and 8.35 cm, respectively. In Fig. 11, waterlogging is mainly concentrated in low-lying areas, with almost no water in high-altitude areas. Because drainage facilities in the mountainous area are insufficient, rainfall flows down the slopes and gathers in the low-lying areas at the foot of the mountains. The drainage facilities in the urban area are very complete, so accumulated water there can be discharged in time. The waterlogging shown in the southwest and southeast arises because these two areas are new land formed by reclamation, for which drainage data are lacking. The municipal department should strengthen the construction of drainage facilities in waterlogged areas, especially those neighboring the mountains, and pump water out of the waterlogged areas during heavy rain.
Fig. 11(a). Waterlogging for a = 5. Fig. 11(b). Waterlogging for a = 30. Fig. 11(c). Waterlogging for a = 100.

Conclusions
This paper uses multi-source information fusion technology to deeply integrate meteorological information, geographic information, and municipal engineering information. ArcGIS is used to carry out the hydrological analysis of horizontal flow after rainfall is transformed into surface runoff. Based on a tradeoff between the efficiency and accuracy of the calculations, 596 irregular sub-catchments were defined, giving rise to a macroscopic analysis of urban rainstorm waterlogging. Rainfall intensity, rainfall duration, DEM, building height, slope, aspect, current land use, and regional drainage capacity are considered together to perform urban flooding prediction and risk identification. By reasonably identifying risky areas, this research provides emergency management authorities with timely and quantitative decision-making information for releasing early warnings and reducing the casualties and property losses caused by urban waterlogging. Traffic planning departments can predict stagnant water areas in advance and release real-time information to citizens on the roads affected by disasters. Navigation service providers can improve their navigation algorithms to avoid stagnant water sections in advance and create more reasonable and efficient route plans. The results can also provide disaster assessment references for emergency dispatching departments, which can use them to assess the accessibility of rescue services in important areas and to optimize the layout of rescue stations and the rescue force dispatch plan.