Spatial association characteristics of facilities around scenic spots considering distance and orientation: A case study of 3A and above scenic spots in Beijing

In research on urban tourism and the service industry, point-of-interest (POI) data from the major map platforms, which carry coordinate and attribute information, are an important data source. In this paper, spatial transaction databases at four distances were built from buffers around the gates of scenic spots rated 3A and above in Beijing. The Apriori algorithm was used to calculate the lift and thereby identify the best distance for mining the association characteristics of 3A, 4A and 5A scenic spots. The association characteristics of the three levels of scenic spots in different directions were then analysed.


Introduction
With the acceleration of modernization and the development of portable mobile devices, location-based services (LBS) on mobile devices generate massive amounts of data, which provide a strong basis and data support for mining the spatial knowledge and spatial relations hidden in spatial data. Spatial association rules (SAR) are an important part of spatial data mining. Many researchers have carried out urban computing work based on big data, using POI data, floating car trajectory data and microblog check-in data to study the division of urban functional areas [1], the travel characteristics of residents [2], urban vitality [3] and other topics. After the concept and algorithm of association rules [4] were proposed, scholars conducted a large body of research, focused mainly on application fields and mining algorithms. In terms of application fields, studies concentrate on emergency events [5], the environment [6] and urban functions [7]. In terms of mining algorithms, research concentrates on improving algorithm efficiency through a vertical data format [8], binary constraints [9], removable windows [10] and other techniques. However, there are few examples of spatial association analysis of urban facilities, and existing work studies the association characteristics of urban services mainly from a fuzzy perspective. In this paper, POI data of urban public facilities are used to analyse the spatial association characteristics of facilities around urban scenic spots, taking the influence of distance and orientation into account. The main research contents are as follows:
(1) Calculate the average lift and determine the optimal distance for mining association characteristics at each level of scenic spot.
(2) Compare the similarities and differences of the association characteristics of scenic spots of different levels in different directions at the corresponding distances.

Overview of the study area and data acquisition
In this paper, the spatial association characteristics of scenic spots rated 3A and above in Beijing were analysed. As shown in Figure 1, as of June 26, 2020, Beijing has a total of 149 scenic spots, including 72 3A-level scenic spots, 56 4A-level scenic spots and 8 5A-level scenic spots. POI data from different providers differ in richness and use different classification systems. Because Amap (Gaode Map) is one of the most widely used map services, this paper obtained the 2018 Amap POI data for Beijing from the Peking University open data platform, about 850,000 records in total, and selected the 29 POI classes that cover public service facilities. About 420,000 taxi origin-destination (OD) records from a rest day were also used for the experimental analysis.

Methods
Association analysis is one of the core technologies of data mining. The association rule model and its mining algorithm were first proposed by Agrawal in 1993 [11]. The goal is to find interesting associations or interrelationships between items in large amounts of data. The classic case of association rule mining is the "Beer and Diaper" story at Wal-Mart: by mining the associations among the goods in customers' shopping baskets, customers' shopping habits can be analysed to help merchants devise better sales strategies. In basket mining, each product is an item, a collection of items is called an item set, and each purchase record is called a transaction. Association rules are generally written in the form X => Y, where the left item set X is the antecedent and the right item set Y is the consequent, and the rule represents an association implied by the data. For example, {scenic spot} => {parking lot} indicates that the two spatial entities, scenic spot and parking lot, are associated within the distance threshold. The strength of an association rule, for example whether scenic spots and parking lots co-occur more strongly than social security agencies and bus stations, is evaluated mainly with the concepts of support, confidence and lift [12]. Support refers to the probability that the item set {X, Y} appears among all transactions, namely the probability that a transaction contains both X and Y, as shown in Equation (1).
$$\mathrm{Support}(X \rightarrow Y) = P(X, Y) \tag{1}$$
where Support(X→Y) is the support and P(X, Y) is the probability that a transaction contains both X and Y. Support is the first requirement for mining strong association rules. By setting a minimum threshold, it eliminates meaningless low-probability rules and retains the item sets that occur relatively frequently as spatial frequent item sets. As an example, consider a database of 1,000 transactions and a minimum support of 0.05. Then Support({parking lot, Chinese restaurant}) = 80/1000 = 0.08 and Support({parking lot, convenience store}) = 10/1000 = 0.01. Since {parking lot, Chinese restaurant} meets the minimum support requirement, it is retained as a frequent item set and the spatial association rule parking lot => Chinese restaurant is retained, while the rule corresponding to {parking lot, convenience store} is eliminated.
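The support screening in the example above can be sketched in a few lines of Python. This is only an illustration built on the counts quoted in the text; the function and variable names are assumptions introduced here, not part of the original study.

```python
def support(itemset_count: int, total_transactions: int) -> float:
    """Support: share of all transactions that contain every item in the item set."""
    return itemset_count / total_transactions


TOTAL_TRANSACTIONS = 1000  # size of the example database
MIN_SUPPORT = 0.05

# Co-occurrence counts taken from the worked example in the text.
pair_counts = {
    ("parking lot", "Chinese restaurant"): 80,
    ("parking lot", "convenience store"): 10,
}

# Keep only the item sets whose support reaches the minimum threshold.
frequent_item_sets = {
    pair: support(count, TOTAL_TRANSACTIONS)
    for pair, count in pair_counts.items()
    if support(count, TOTAL_TRANSACTIONS) >= MIN_SUPPORT
}

for pair, value in frequent_item_sets.items():
    print(pair, round(value, 3))
```

Only {parking lot, Chinese restaurant} survives the threshold, in line with the example.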
Confidence refers to the probability that the consequent Y occurs given that the antecedent X of the association rule occurs, that is, the probability that a transaction containing X also contains Y, as shown in Equation (2).
$$\mathrm{Confidence}(X \rightarrow Y) = P(Y \mid X) = \frac{P(X, Y)}{P(X)} \tag{2}$$
where Confidence(X→Y) is the confidence, P(Y│X) is the probability that the consequent Y occurs given that the antecedent X occurs, and P(X) is the probability of X. Confidence is the second requirement for generating strong association rules and measures the reliability of a rule. A minimum confidence must also be set to continue filtering. In the example above, 100 transactions contain a parking lot. With the minimum confidence set to 0.6, the confidence of parking lot => Chinese restaurant is 80/100 = 0.8, so this rule is retained, whereas the confidence of parking lot => convenience store is 10/100 = 0.1, so that rule is eliminated.
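The confidence filter can be illustrated in the same way. The sketch below reuses the counts from the example (100 transactions containing a parking lot); the names are illustrative assumptions rather than the authors' code.

```python
def confidence(count_xy: int, count_x: int) -> float:
    """Confidence of X => Y: share of transactions containing X that also contain Y."""
    return count_xy / count_x


COUNT_PARKING_LOT = 100  # transactions containing the antecedent "parking lot"
MIN_CONFIDENCE = 0.6

# Transactions containing both the antecedent and each candidate consequent.
joint_counts = {"Chinese restaurant": 80, "convenience store": 10}

rule_confidence = {
    consequent: confidence(count, COUNT_PARKING_LOT)
    for consequent, count in joint_counts.items()
}

# Rules that pass the minimum confidence threshold.
retained = [c for c, value in rule_confidence.items() if value >= MIN_CONFIDENCE]
print(rule_confidence, retained)
```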
Lift refers to the ratio of the probability that a transaction contains Y given that it contains X to the probability that a transaction contains Y without this condition. In other words, it measures how much the appearance of X raises the probability of Y above its baseline probability P(Y), as shown in Equation (3).
$$\mathrm{Lift}(X \rightarrow Y) = \frac{P(Y \mid X)}{P(Y)} = \frac{P(X, Y)}{P(X)\,P(Y)} \tag{3}$$
where Lift(X→Y) is the lift. Lift compensates for the shortcomings of confidence and can be regarded as a complementary index to it. When the lift equals 1, X and Y are independent, and the occurrence of X has no effect on the probability of Y. The further the lift exceeds 1, the more the occurrence of X raises the probability of Y, that is, the stronger the spatial association is.
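A short numerical sketch may help to show how lift behaves. The text gives the counts for the antecedent and the joint item set but not for the consequent alone, so the two counts for Chinese restaurants used below (400 and 800) are purely hypothetical values chosen to contrast a positive association with independence.

```python
def lift(count_xy: int, count_x: int, count_y: int, total: int) -> float:
    """Lift of X => Y: P(Y|X) / P(Y), written with raw transaction counts."""
    return (count_xy * total) / (count_x * count_y)


TOTAL = 1000         # transactions in the example database
COUNT_PARKING = 100  # transactions containing "parking lot" (X)
COUNT_BOTH = 80      # transactions containing X and "Chinese restaurant" (Y)

# ASSUMPTION: the count of Y alone is not given in the text, so two
# hypothetical values are tried to show the two cases discussed above.
lift_associated = lift(COUNT_BOTH, COUNT_PARKING, 400, TOTAL)   # Y in 40% of transactions
lift_independent = lift(COUNT_BOTH, COUNT_PARKING, 800, TOTAL)  # Y in 80% of transactions
print(lift_associated, lift_independent)
```

With the first hypothetical count the confidence of 0.8 is twice the baseline of 0.4, giving a lift above 1. With the second, the confidence equals the baseline and the lift is 1, so the rule carries no information despite its high confidence.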
The association analysis of tourist attractions and public service facilities can be treated analogously to shopping basket analysis. Each facility type is equivalent to a product in the shopping basket, and the set of facility types found in each spatial unit is equivalent to one shopping record.
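The basket analogy can be made concrete with a minimal sketch that turns POIs into transactions: for each spatial unit, the set of facility types found within a distance threshold becomes one record. The coordinates, the use of a single reference point per unit and the great-circle distance are assumptions made for illustration; the paper does not specify how its gate buffers were constructed.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def build_transactions(units, pois, radius_m):
    """One transaction per unit: the set of POI types within radius_m of it.

    units: list of (name, lat, lon); pois: list of (facility_type, lat, lon).
    """
    transactions = []
    for _name, ulat, ulon in units:
        types = {
            ptype
            for ptype, plat, plon in pois
            if haversine_m(ulat, ulon, plat, plon) <= radius_m
        }
        transactions.append(frozenset(types))
    return transactions


# Hypothetical data: one gate and three POIs roughly 56 m, 111 m and 334 m north of it.
gates = [("example gate", 39.9000, 116.4000)]
example_pois = [
    ("parking lot", 39.9005, 116.4000),
    ("Chinese restaurant", 39.9010, 116.4000),
    ("bus station", 39.9030, 116.4000),
]

for radius in (100, 150, 250, 500):
    print(radius, sorted(build_transactions(gates, example_pois, radius)[0]))
```

Running the function once per distance yields one transaction database per buffer size, which is the input that the rule mining step needs.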

Results & Discussion
In this paper, the Apriori algorithm is used to analyse the tourist attractions in Beijing at four distances: 100, 150, 250 and 500 meters. The minimum support is set to 0.2 to calculate the average lift of the association rules of the three levels of scenic spots at each distance, as shown in Figure 1. The 3A and 4A scenic spots have the highest average lift at a distance of 150 meters, and the 5A scenic spots have the highest average lift at a distance of 250 meters. Therefore, 150 meters is the best distance for mining the association characteristics of the 3A and 4A scenic spots, and 250 meters is the best distance for the 5A scenic spots. On the basis of these distances, the spatial association characteristics of the three levels of scenic spots were analysed separately in four directions: east, south, west and north. After repeated experiments, for the 3A-level scenic spots at a distance of 150 meters, 0.1 was selected as the minimum support for the east and south directions and 0.2 for the west and north directions, with a minimum confidence of 0.6. Due to space constraints, only the first five rules in each direction are listed in the table. At a distance of 150 meters, the overall support of the association rules is low. Strong association rules related to parking lots, with confidence above 0.9, are found for the 3A-level scenic spots in all four directions. Take the rule {social security agencies, 3A-level scenic spots} => {parking lot} in the east direction as an example: the probability of finding a parking lot within 150 meters of a 3A-level scenic spot and a social security agency in Beijing is about 22%.
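The distance comparison described above can be sketched as follows: mine frequent item sets with a level-wise Apriori search, derive rules, and average their lift for each distance. This is a compact pure-Python illustration, not the authors' implementation. The two transaction databases are toy data invented for the example, and the function names are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for every frequent item set (level-wise search)."""
    n = len(transactions)
    current = [frozenset([item]) for item in {i for t in transactions for i in t}]
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those reaching the minimum support.
        level = {}
        for cand in current:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join frequent k-item sets into (k+1)-candidates; prune any candidate
        # that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return frequent


def mine_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence, lift) for each rule passing the filter."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for lhs_items in combinations(itemset, size):
                lhs = frozenset(lhs_items)
                rhs = itemset - lhs
                conf = sup / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((lhs, rhs, sup, conf, conf / frequent[rhs]))
    return rules


def average_lift(rules):
    return sum(rule[4] for rule in rules) / len(rules) if rules else 0.0


P, R, B = "parking lot", "Chinese restaurant", "bus station"
# Toy transaction databases keyed by buffer distance in meters (hypothetical data).
databases = {
    100: [{P, R}, {P, R}, {P, R}, {P}],
    150: [{P, R}, {P, R}, {P}, {B}],
}
avg_by_distance = {
    d: average_lift(mine_rules(apriori(db, 0.2), 0.6)) for d, db in databases.items()
}
best_distance = max(avg_by_distance, key=avg_by_distance.get)
print(avg_by_distance, best_distance)
```

In the toy data the item sets at 100 meters co-occur exactly as often as independence would predict, so their average lift is 1, while the 150-meter database shows a real association. The same comparison, applied to the real transaction databases, is what selects the mining distance for each level of scenic spot.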
It can be seen from Table 1 that, on the east side of the 3A-level scenic spots, the associated features are mainly social security institutions, parking lots and medical facilities, and social security institutions are relatively well provided there. In the west, the associated features are mainly medical facilities, parking lots, Chinese restaurants and car services. The appearance of car rental and maintenance services suggests, to some extent, that such services are provided for self-driving tourists near the west gates of the 3A-level scenic spots. In the south, the associated features are mainly Chinese restaurants, parking lots and medical facilities. Unlike the other directions, the north shows associations with taxi pick-up and drop-off points. Of the four directions around the 3A-level scenic spots, the north is where tourists are most likely to take taxis.
As shown in Table 2, among the four directions at a distance of 150 meters, the north has the highest overall support, followed by the south, the west and the east. Direction by direction, on the east side of the 4A-level scenic spots the associated features are mainly public toilets, social security agencies, parking lots and scenic spots, which indicates that on this side the scenic spot itself attracts more facilities. In the west, {hostel} => {Chinese restaurant} is the highest-ranked association rule and shows strong association characteristics. In the south, bus stations, parking lots, public toilets and scenic spots are the main associated features, which again shows that the scenic spots attract other facilities. Similar to the 3A-level scenic spots, the 4A-level scenic spots are also more strongly associated with taxi pick-up points in the north.
As shown in Table 3, the support of the association rules generated in the four directions is at a high level and the gap between directions is small. The support of the south direction is the highest, followed by the west and north directions, and finally the east direction. The 5A-level scenic spots show strong association characteristics with Chinese restaurants, bus stations and parking lots in the east, west and south directions. Unlike at the other levels, the highest-ranked association rules reflect less the attraction of the 5A-level scenic spots themselves and more the mutual attraction among public facilities, which results in aggregation.
The 3A-level scenic spots are taken as an example to illustrate the differences in spatial association characteristics between directions. The association rules were plotted with the arulesViz visualization package. The horizontal axis (LHS, left-hand side) shows the antecedents of the rules, with each group labelled by its two most frequent facility types, and the vertical axis (RHS, right-hand side) shows the facilities associated with them. The size of each circle indicates the support, and the shade of its colour indicates the lift.
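The four-direction split used throughout this section requires each facility to be assigned to east, south, west or north. The paper does not state how this assignment was made, so the sketch below is only one plausible approach and an explicit assumption: it computes the initial bearing from a reference point of the scenic spot to the POI and bins it into 90-degree sectors centred on the cardinal directions.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def direction(bearing):
    """Bin a bearing into 90-degree sectors centred on N, E, S and W (assumed scheme)."""
    if bearing >= 315.0 or bearing < 45.0:
        return "north"
    if bearing < 135.0:
        return "east"
    if bearing < 225.0:
        return "south"
    return "west"


# Hypothetical reference point and four POIs offset by 0.001 degrees.
spot = (39.9000, 116.4000)
offsets = {
    "poi_a": (39.9010, 116.4000),
    "poi_b": (39.9000, 116.4010),
    "poi_c": (39.8990, 116.4000),
    "poi_d": (39.9000, 116.3990),
}
sectors = {name: direction(bearing_deg(*spot, *pos)) for name, pos in offsets.items()}
print(sectors)
```

Transactions can then be built per direction by keeping only the POIs that fall in the corresponding sector before mining rules.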

Conclusions
Given the large volume of location-based data produced in the era of big data, the mining of spatial association rules is an important part of urban research. This paper uses POI data and taxi data to calculate the association characteristics among facilities and to study the spatial distribution among industries. In every city, facility entities generally exhibit symbiosis, connection and other topological relations. The Apriori algorithm, as a classic association rule mining method, can effectively mine the spatial association characteristics of urban facilities and reveal which facilities frequently occur together. The practical significance of the results in this paper is as follows: (1) For tourists, the association characteristics of taxi trips in the results can provide a reference when taking a taxi, and the characteristics related to illegal parking signs can help self-driving tourists choose appropriate places to park. (2) For scenic spots, the experimental results also reflect the unbalanced distribution of public service resources around scenic spots. From downtown to suburban areas, the allocation of public service resources around scenic spots gradually declines. The experimental results provide a reference for urban management departments to allocate scenic resources reasonably and meet the needs of tourists.