Identifying urban form typology of residential areas in major cities in South Korea using clustering

In South Korea, cities have experienced rapid development, resulting in diverse urban form patterns. While the typology approach has emerged as a way to identify different patterns and better understand urban development, typology studies are still lacking for Korean cities. This study identifies and compares urban form typologies for residential blocks in major Korean cities using clustering. Two cities that represent distinct regional characteristics and planning themes in Korea are analysed: Seoul and Jeju. In each city, physical form data are collected in Geographic Information Systems (GIS) format to calculate and analyse residential blocks. Urban form variables and principal components are analysed and used for K-means clustering. The resulting clusters are then interpreted as urban form typologies. The typologies identified in the two cities show similarities and differences between the cities, providing insights into the influence of regional characteristics, such as natural environment and culture, and of planning patterns on urban form development. The findings provide a better understanding of diverse urban forms in the two cities and their different local identities. The typologies can be used as references by urban planners and policymakers for sustainable planning and design.


Background
As the world experiences rapid urbanization, over half (54.4 percent) of the world's population lives in cities. The term "megacity" is used to describe urban agglomerations with over ten million inhabitants [1]. Major cities in South Korea have undergone significant urbanization within a short period, driven by rapid economic growth and industrialization. This radical development has transformed housing lifestyles. The substantial influx of people from rural areas created an unprecedented demand for housing. As a result, apartment complexes multiplied rapidly and became the predominant housing culture. However, the proliferation of such indiscriminate residential development has given rise to social phenomena, including educational fervor and soaring housing prices in specific areas. These phenomena have in turn exacerbated socioeconomic disparities between affluent and underprivileged neighborhoods, thereby shaping distinct residential typologies across the country.
This study analyzes two cities, Seoul and Jeju, based on their regional characteristics, encompassing environmental and social aspects. These cities were selected because their urban features stand out among cities in South Korea. Seoul, as the nation's capital and the leader of urbanization trends in Korea, is an indispensable subject for analysis in urban planning and development. Jeju is an island city with a unique urban character shaped by its volcanic topography, natural environment, and tourism industry. The study aims to establish a "K-city" typology based on these characteristics. The results show how the residential areas of the capital city and the island city differ in type by region.

Typology-driven approaches for urban studies
Urban form and human settlements have been transformed as cities developed, and through this process cities have built their own characteristics and patterns from their urban components. Urban form can only be understood historically, since the elements of which it is composed undergo continuous transformation and replacement [2]. The features of each city settle into specific types, which are helpful for understanding cities and their evolution.
Urban form typology is an approach that classifies urban form elements into types based on their characteristics. It is close to urban morphology, which is a form-based classification focusing on the physical form of the city and the morphological patterns of its urban components. Morphological structure describes urban elements as physical features of cities [3]. The typological approach, however, focuses on defining types by identifying and grouping urban elements according to their similarities and differences. Urban form typology is an important concept in urban morphology studies for understanding spatial structures and the evolution of urban development. The typology concept is helpful for classifying distinctive urban areas and identifying their different characteristics [4]. Despite the limitations that large study areas and complicated urban forms impose at the urban scale, typology-driven approaches are recognized as important because context-sensitive planning depends on geographical regions [5]. This study addresses several questions: Do residential areas in South Korea exhibit distinct typologies at the urban scale? Do these typologies vary by region? If so, how can we visualize and define them? How do these typologies relate to urban planning?
Each city was analyzed using the variables collected in ArcGIS format. The block is the basic unit of the study and provides average values for the variables. Every variable was quantified, calculated, and aggregated within each block. The resulting values, representing each block, were then analyzed using the K-means clustering method. The resulting clusters were interpreted, using domain knowledge, as urban form typologies. These typologies are further compared across the two cities to understand their urban form patterns. The findings of this study are expected to have significant implications for future development, policymaking, and urban analysis.

Study areas
Seoul, the capital city of South Korea, occupies 605.2 km2 and stands out as one of the world's remarkable megacities. It is characterized by its high population density and advanced infrastructure. The city is divided geographically by the Han River, and its southern part was strategically planned to distribute the population and foster balanced development. In particular, the transformation of the Gangnam area had a ripple effect on the surrounding boroughs, producing what are recognized today as the three major boroughs in Seoul: Gangnam-gu, Seocho-gu, and Songpa-gu. These boroughs are known for their high-cost apartments, urban density, and well-equipped infrastructure.
Jeju, a special self-governing province, is located on Jeju Island. Encompassing an area of 1,849 km2, it owes its formation to the volcanic activity of Halla Mountain, which resulted in a gentle five-degree incline near the mountainous regions. Over the past decade, significant changes have occurred in the housing landscape of Jeju. Notably, Jeju has the lowest proportion of apartments among Korean cities, accounting for only 32 percent of the housing stock. In contrast, single houses make up a substantial 42.8 percent, with common apartments comprising merely 13.1 percent, the lowest among all cities. As of 2017, the housing landscape of Jeju residents indicated that 50 percent resided in single houses, with apartments experiencing a 25 percent increase, signifying that one out of four households now resides in an apartment. The availability of single houses continues to increase, maintaining a consistent proportion of over 30 percent. However, the number of unoccupied houses stands at approximately 280,000, accounting for 12.9 percent of the entire housing stock.

Data collection
This study was conducted with the aid of a GIS program for spatial analysis. Shapefiles of the urban form variables were collected from the National Spatial Data Infrastructure Portal, the National Geographic Information Institute, and the city councils. Because some of the data were out of date, the scope of the study area had to be limited. Nevertheless, the secondary GIS sources were useful for the clustering analysis. This study focused on residential areas and aimed to analyze urban types at the scale of block units. Each unit consists of building, green area, and open space components, and carries variables related to residential buildings and natural environments that encompass various urban form elements. Every block is described by five key elements: building area, total floor area, building height, green area, and block area. In ArcGIS, each element was calculated with the Summary Statistics function to obtain sum and average values. These elements were then converted into cover ratio, floor area ratio, and green area ratio for further analysis. Water bodies usually do not overlap with block boundaries, so water area was difficult to calculate at the block level.
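The conversion from block-level sums to ratios can be sketched as follows. This is a minimal illustration, not the authors' ArcGIS workflow: the column names, the toy values, and the formulas (each ratio expressed as a percentage of block area) are assumptions based on the conventional definitions of these ratios, since the text does not state them explicitly.

```python
import pandas as pd


def block_ratios(blocks: pd.DataFrame) -> pd.DataFrame:
    """Convert per-block area sums into percentage ratios.

    Assumed (hypothetical) input columns, all in square metres:
    building_area, total_floor_area, green_area, block_area.
    """
    area = blocks["block_area"]
    return pd.DataFrame(
        {
            # Share of the block covered by building footprints.
            "cover_ratio": 100 * blocks["building_area"] / area,
            # Total floor area relative to block area.
            "floor_area_ratio": 100 * blocks["total_floor_area"] / area,
            # Share of the block covered by green area.
            "green_area_ratio": 100 * blocks["green_area"] / area,
            "block_area": area,
        },
        index=blocks.index,
    )


# Toy values for two blocks (illustrative only, not data from the study).
blocks = pd.DataFrame(
    {
        "building_area": [500.0, 1200.0],
        "total_floor_area": [1500.0, 2400.0],
        "green_area": [250.0, 0.0],
        "block_area": [2000.0, 4000.0],
    }
)
ratios = block_ratios(blocks)
```

In practice the input table would be the per-block output exported from the Summary Statistics step rather than hand-typed values.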

Clustering method
The four calculated variables, Cover Ratio, Floor Area Ratio, Green Area Ratio, and Block Area, were used for the clustering analysis. Because clustering is sensitive to outliers, the value ranges had to be restricted. The ratios were treated as percentages that cannot exceed 100, so every ratio variable was limited to 100. The block area variable was limited to 60,000 m2 to discount residential buildings located in very large natural areas. After the maximum values of the variables were limited, StandardScaler was used to standardize the features by removing the mean and scaling to unit variance, so that differences in scale among the variables would not distort the clustering. This data cleaning step improved the clustering. To find a suitable number of clusters K, the Silhouette Score from sklearn.metrics was used. For each candidate number of clusters, the Silhouette Coefficient is calculated from the mean intra-cluster distance and the mean nearest-cluster distance of each sample. The best value is 1 and the worst value is -1. This study tested K from 2 to 31 and obtained scores with a maximum of about 0.67223984 and a minimum of about 0.29456778 (the range was similar for both cities). The best K, the one with the highest average Silhouette Score, was 4 for Jeju and 5 for Seoul. This number of clusters, 4 for Jeju and 5 for Seoul, was then used as K for the clustering. KMeans from sklearn.cluster was used to cluster the data. The KMeans algorithm clusters the dataset by trying to separate the samples into 4 groups (5 for Seoul) of equal variance. For visualization, the cluster centroids were used to summarize and describe each cluster.
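The steps described above (capping, standardizing, choosing K by silhouette score, and fitting K-means) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' script: the column names, the synthetic data, the use of clipping rather than row removal for the caps, and the `n_init` and `random_state` settings are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical column names for the four clustering variables.
COLUMNS = ["cover_ratio", "floor_area_ratio", "green_area_ratio", "block_area"]


def cap_and_scale(features: pd.DataFrame) -> np.ndarray:
    """Cap ratios at 100 and block area at 60,000 m2, then standardize."""
    capped = features[COLUMNS].copy()
    for col in COLUMNS[:3]:
        capped[col] = capped[col].clip(upper=100)
    # Assumption: the limit is applied by clipping, not by dropping blocks.
    capped["block_area"] = capped["block_area"].clip(upper=60_000)
    return StandardScaler().fit_transform(capped)


def best_k_by_silhouette(X, k_values=range(2, 32), seed=0):
    """Return the K with the highest mean silhouette score, plus all scores."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores


# Synthetic stand-in for the block table (illustrative only).
rng = np.random.default_rng(0)
centers = np.array(
    [
        [20.0, 40.0, 10.0, 3000.0],
        [60.0, 85.0, 5.0, 8000.0],
        [10.0, 15.0, 70.0, 50000.0],
        [40.0, 70.0, 30.0, 20000.0],
    ]
)
noise = np.array([3.0, 5.0, 3.0, 800.0])
data = np.vstack([c + rng.normal(scale=noise, size=(50, 4)) for c in centers])
features = pd.DataFrame(data, columns=COLUMNS)

X = cap_and_scale(features)
best_k, scores = best_k_by_silhouette(X)

# Fit the final model with the selected K and summarize it by its centroids.
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
labels = model.labels_
centroids = pd.DataFrame(model.cluster_centers_, columns=COLUMNS)
```

The centroids here are in standardized units; to describe clusters in the original units, one option is to group the capped table by `labels` and take column means.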

Fig. 2. Number of clusters by silhouette score
The aggregated data from ArcGIS were processed in Spyder to find a suitable number of clusters K. The scaled data were evaluated with the silhouette score for K from 2 to 31, which gave 4 as the best number of K-means clusters for Jeju and 5 for Seoul. The Cover Ratio, Floor Area Ratio, Green Area Ratio, and Block Area variables were plotted by cluster (Figs. 4 and 5, from left to right). The results show that in Jeju the cover ratio and floor area ratio are concentrated around lower values, which indicates that Jeju has lower-rise buildings on small areas, whereas block sizes are diverse and include a wide range of natural areas. Residential buildings in Seoul, on the other hand, show more varied cover ratios and floor area ratios than those in Jeju, but less variation in block size. Cluster 0, the leftmost image in deep blue, has high cover ratio and floor area ratio values. Cluster 1, shown in navy, has the lowest mean values, followed by clusters 2, 3, and 0. This indicates that the largest blocks, which contain few buildings, have the lowest mean values; that is, large blocks tend to contain fewer buildings.

Conclusion
This study uses clustering to identify urban residential form typologies in two remarkable cities in Korea that have distinctive regional characteristics. The study contributes to urban studies in Korea for the following reasons. First, little research on urban typology in Korea has focused on residential areas. Second, it analyzes urban forms with a quantitative method and derives the types using Python. Because cities in South Korea are at different levels of development, other cities should be analyzed in further studies. The findings provide a better understanding of diverse urban forms in the two cities and their different local identities. Typology-driven research can be used as a reference by urban planners and policymakers for sustainable planning and design.

Fig. 3. Plot of clustered groups of Jeju by PCA. PCA (Principal Component Analysis) was needed for dimensionality reduction, in order to present the analysis of the four variables in two-dimensional graphs. PCA identifies the axes in the dataset that maximize variance. It helped reduce noise and allowed the linear relationships among the variables to be visualized effectively.
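The projection described in this caption can be sketched as follows. This is a minimal illustration assuming scikit-learn's PCA on standardized data; the random placeholder matrix stands in for the four block variables and is not data from the study.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder matrix: 100 blocks x 4 variables (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))

# Standardize first so that no single variable dominates the components.
X_scaled = StandardScaler().fit_transform(X)

# Project the four variables onto the two directions of greatest variance.
pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each of the two components.
explained = pca.explained_variance_ratio_
```

The two columns of `coords` can then be drawn as a scatter plot with points colored by cluster label, and `explained` indicates how much of the original variation the two-dimensional plot retains.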

Fig. 6. Typical blocks of clustering in Jeju City, from cluster 0 to 3, left to right.

Fig. 7. Typical blocks of clustering in Seoul City. Seoul has 5 clusters, but less variation among the variables. Cluster 0 has a large area and high values because it includes the large blocks. Clusters 4, 3, 1, and 2 follow with well-balanced proportions. Compared to Jeju, Seoul is more evenly developed across its residential blocks.

Table 1. Description of urban form variables.