The Analysis of Potential Residential Area/Center Based on Residents’ Travel Behaviour and Planned Metro Stations

The metro is of vital importance to the public transportation system. Recent studies have examined the influence of metro systems using various methodologies. However, few of them have focused on stations that are planned or still under construction. Therefore, this study evaluates future metro stations and maps the potential urban residential centers by analyzing metro card data from the existing metro system. Based on a case study in Shenzhen, China, we identified 21 residential hot stations and 13 working hot stations. The results also indicate that most passengers travel between 5 and 14 stops, and that each residential center has its specific working center. Moreover, when the housing price decreases by 1598.3 RMB per square meter, residents may be willing to move to a place that adds one more stop to their commute. Finally, based on two criteria derived from riding behavior, 67 new stations are found to have the potential to become new residential centers in the city. The strategy proposed in this study can help urban planners understand the possible influences of new metro stations and plan accordingly.


Introduction
The metro system is an important means of public transportation. In big cities in particular, where citizens usually suffer from serious traffic congestion, metro service is preferred because of its shorter delays and low cost. The transportation convenience brought by metro stations can encourage residents to live far from their workplaces in pursuit of lower housing costs. Therefore, the construction of new metro stations may change the spatial distribution of the urban population and help the city form new residential centers.
To scrutinize the possible influence of metro stations on the city, many previous studies have examined the characteristics of existing metro systems. Several studies have classified stations as residential, commercial, or transportation hubs, using indicators such as passenger volume and the land use type around the station [1]. Moreover, metro travel behaviour has been examined either through theories such as maximum utility theory and prospect theory [2,3] or through questionnaire surveys [4].
However, these traditional methodologies have several problems. First, only the functions of existing metro stations are studied, while stations that are planned or still under construction are rarely considered. Second, data collected by questionnaire or extracted from urban land use maps may be biased and subject to time lags.
Recently, with the widespread use of metro cards, the behaviour of metro passengers has been recorded. The information obtained from metro cards has many advantages. For instance, it contains temporal information, has wide coverage, and can be obtained at low cost [5]. Therefore, various studies have used such data sets to describe the spatio-temporal characteristics of metro systems [6][7][8], examine the spatial structure of the city [9,10], or devise strategies to optimize metro schedules [11].
In this study, we evaluate future metro stations and map the potential residential centers of the city by analyzing metro card data from the existing metro system. The strategy proposed in this study can help urban planners understand the possible influences of new metro stations and plan accordingly.

Study Area
Shenzhen, a city on the southern coast of China, had a population of 12.5 million in 2017 according to the "Shenzhen Statistical Yearbook 2018". This huge population has put great pressure on public transportation.
According to the "Transportation Construct Planning" published by the government, 167 metro stations were already in operation as of 2018, while another 141 stations were still under construction or planned. Fig.1 delineates the distribution of the city's residential population. A very high density of urban residents can be observed around the existing metro stations, mainly because of the transportation convenience.

Data Set
The data used in this study were collected from October 1 to October 31, 2017, for all passengers who used a metro card to commute. The data record the total number of people entering each station from 7 am to 11 pm, as well as the corresponding number of people exiting each station.

Recognizing hot stations
As discussed in Section 2.1, the hotspots of the city are all located around metro stations. Therefore, the hot stations can represent the hotspots of the city. Recognizing these hot stations takes two steps. First, the rush hours for metro transportation are identified. Second, the busiest stations during the rush hours are selected. Fig.2 shows the total number of passengers over a day for all stations. From the change in passenger volume, it can be seen that the morning rush period is around 7 am to 10 am, and the afternoon rush period is around 4 pm to 9 pm.
We then calculate statistics for the stations during the rush hours to find the hot stations. In the morning rush hours, the average number of people entering each station is 5141, and the standard deviation is 5261. Therefore, stations whose volume of entering passengers during the morning rush hours exceeds the mean by more than one standard deviation (5141 + 5261 = 10402) are defined as residential hot stations.

Figure 2. Amount of Passengers in A Day
Similarly, the number of people exiting each station during rush hours is calculated: the average is 6240, and the standard deviation is 4932. Working hot stations are identified as those whose volume of exiting passengers exceeds the mean by more than one standard deviation (11173). In total, 21 residential hot stations and 13 working hot stations are identified.
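The one-standard-deviation rule can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name `hot_stations` and the toy station counts are hypothetical, and the population standard deviation is assumed because the text does not say which estimator was applied.

```python
from statistics import mean, pstdev

def hot_stations(volumes, k=1.0):
    """Return (threshold, stations whose volume exceeds mean + k * std)."""
    values = list(volumes.values())
    # Assumption: population standard deviation; the paper does not specify.
    threshold = mean(values) + k * pstdev(values)
    hot = sorted(name for name, v in volumes.items() if v > threshold)
    return threshold, hot

# Hypothetical morning rush-hour entry counts per station (not real data).
morning_entries = {"A": 1200, "B": 2500, "C": 3100, "D": 4000, "E": 21000}
threshold, residential_hot = hot_stations(morning_entries)
print(round(threshold), residential_hot)
```

The same function could be applied to exit counts to obtain working hot stations; only the input dictionary changes.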

The Commuting Pattern
The metro is favoured by urban residents because it is fast and less affected by traffic jams during rush hours. However, people who live near their workplace have many other ways to get to work, such as taking a bus or cycling. On the other hand, people would not choose to live too far from their workplace, even if metro service is available.
By analysing the shortest path between each pair of stations using ArcGIS 10.5, the total number of passengers travelling through different numbers of stations is delineated in Fig.3. It can be seen that the number of passengers begins to increase at 5 stops, and few passengers ride the metro to work for more than 14 stops. Furthermore, to obtain more detail about the commuting pattern, the start-end station pairs with more than 1000 passengers during rush hours are mapped in Fig.4. The results indicate that, in a polycentric city like Shenzhen, each working center is associated with a specific residential center that is 10-20 miles away from it.
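The study computes shortest paths in ArcGIS 10.5. As a rough stand-in, the number of stops between two stations can be sketched with a breadth-first search on an unweighted station graph, where each edge counts as one stop. The function `stops_between` and the toy network are hypothetical and are not taken from the Shenzhen data.

```python
from collections import deque

def stops_between(adjacency, origin, destination):
    """Minimum number of stops (edges) from origin to destination, or None."""
    if origin == destination:
        return 0
    seen = {origin}
    queue = deque([(origin, 0)])
    while queue:
        station, stops = queue.popleft()
        for neighbour in adjacency.get(station, ()):
            if neighbour == destination:
                return stops + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, stops + 1))
    return None  # destination is not reachable from origin

# Hypothetical toy network: one line runs A-B-C-D, another runs C-E-F.
network = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D", "E"],
    "D": ["C"], "E": ["C", "F"], "F": ["E"],
}
print(stops_between(network, "A", "F"))  # path A-B-C-E-F
```

Counting passengers per stop count then reduces to grouping each origin-destination record by the value this function returns.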

Relationship between Distance and Housing Price
In a city, people tend to prefer living near their workplace. However, the living cost near workplaces is extremely high. The housing price near the working hot stations is over 65000 RMB (9000 US dollars) per square meter. Therefore, urban residents choose to live in more remote regions to reduce their housing costs. Fig.5 shows that residents travel through more stations to commute when they live in a region with cheaper housing. Moreover, the average number of stations a passenger travels through is strongly correlated with the housing price, with an r value of 0.715. Furthermore, we calculate the price difference between the starting station and the ending station. As shown in Fig.6, this difference increases as the number of stations increases, and begins to decrease when the number reaches 30.
Because most passengers commute by metro over 5-14 stops, a linear regression model is established between the housing price difference and the number of stations passengers travel through. The model has an $r^2$ of 0.53 and a significance level of 0.001. The model can be described as:
$y = 1598.3x + 6942.7, \quad 5 \le x < 15$
where $x$ represents the number of stations and $y$ represents the price difference between the starting and ending stations.
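For readers who want to reproduce this kind of fit, the sketch below shows an ordinary least squares line and its coefficient of determination in plain Python. The observations are hypothetical placeholders chosen only to make the example run; they are not the Shenzhen data, and the fitted values will not match the coefficients reported in the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept; also returns r^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical observations: stops travelled (5-14) and price gap in RMB per m^2.
stops = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
price_gap = [9000, 12500, 12000, 15500, 15000, 18500, 18000, 21500, 21000, 24500]
slope, intercept, r_squared = fit_line(stops, price_gap)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, r2={r_squared:.3f}")
```

The slope is the quantity of interest here: it is the change in price difference associated with one additional stop, which is how the figure of 1598.3 RMB per square meter per stop is interpreted in this study.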

The Identification of Potential Residential Center
Based on the analysis results in Sections 3.2 and 3.3, two criteria have been established to determine whether a future station has the potential to become a new residential center.
1. The station is within 5-14 stops of an existing working hot station.
2. The housing price within a 2 km buffer of the station is lower than that of at least one working hot station, with a difference larger than the price difference given by the regression model in Section 3.3.
When a new station satisfies both criteria, many urban residents are likely to move there to enjoy a lower living cost.
By applying these criteria to the stations that are planned to be built, we found that 67 of them have the potential to become new residential centers, as shown in Fig.7.
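The two criteria can be expressed as a simple filter, sketched below under stated assumptions. The function name, the input dictionaries, and the flat 10000 RMB per square meter threshold are all hypothetical. The required price gap is passed in as a function of the stop count because its exact form depends on the regression model, and the sketch assumes that both criteria must hold for the same working hot station, which the text does not state explicitly.

```python
def is_potential_residential_center(stops_to_work, station_price, work_prices,
                                    min_gap, min_stops=5, max_stops=14):
    """Check both criteria against each working hot station.

    stops_to_work: {working station: stops from the candidate station}
    work_prices:   {working station: housing price, RMB per square meter}
    min_gap:       function mapping a stop count to the required price gap
    """
    for work, stops in stops_to_work.items():
        # Criterion 1: the candidate is within the commuting range.
        in_range = min_stops <= stops <= max_stops
        # Criterion 2: housing near the candidate is sufficiently cheaper.
        cheaper = work_prices[work] - station_price > min_gap(stops)
        if in_range and cheaper:
            return True
    return False

# Hypothetical inputs; the flat required gap is a placeholder, not a paper value.
work_prices = {"W1": 65000, "W2": 70000}

def flat_gap(stops):
    return 10000

near = is_potential_residential_center({"W1": 8, "W2": 20}, 40000, work_prices, flat_gap)
far = is_potential_residential_center({"W1": 3, "W2": 20}, 40000, work_prices, flat_gap)
print(near, far)
```

In the first call the candidate is 8 stops from W1 and 25000 RMB per square meter cheaper, so it qualifies; in the second call no working hot station lies within 5-14 stops, so it does not.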

Conclusion
In this study, metro card data have been used to analyse the metro riding behaviour of urban residents. Through a case study in Shenzhen, China, 21 residential hot stations and 13 working hot stations have been identified. We also observed that most passengers travel between 5 and 14 stops, and that each residential center has its specific working center. Moreover, when the housing price decreases by 1598.3 RMB per square meter, residents may be willing to move to the cheaper place at the cost of one more stop of commuting.
Based on the riding behavior discovered from the metro card data, it is inferred that 67 new stations may have the potential to become new residential centers in the city. The results of this study can be used to forecast the future spatial structure of a city based on its public transportation system. However, the research has several limitations. First, because of data limitations, only the number of stops is used to represent the travel cost, and the actual travel time has not been considered. In addition, the crowdedness of stations, which may influence residents' choice to commute by metro, has not been discussed. Second, only the housing price is used to describe the living cost, while the rents in each region and the distribution of affordable housing provided by the government are not included. Third, this research only indicates which stations may be potential residential centers, without modelling how many people would settle in these regions or discussing the future distribution of population in the city. In future studies, survey data can be combined with metro card data to fill these gaps.