Research on Differences in Digital Inclusive Finance: Based on Multi-index Panel Data Clustering

Objective: to understand the development level of digital inclusive finance in China's 31 provinces in recent years, so that regions lagging behind can accelerate their development. Methods: the Peking University Digital Inclusive Finance Index data from 2012 to 2018 were collected and the optimal number of clusters was determined; the cross-sectional data and the multi-index panel data were then clustered separately. Conclusion: the level of digital inclusive finance in China shows an overall upward trend, but development differs greatly among the 31 provinces. East China performs best, South China and Central China have a relatively good overall level, and North China, Northwest China, Southwest China and Northeast China have a poor overall level, with differences among the provinces within these regions.


Introduction
The Fourteenth Five-Year Plan, proposed in 2020, calls for speeding up digital planning and the construction of a digital China. Digital inclusive finance digitizes inclusive finance and provides more convenient financial services to the public. The digital inclusive finance index is a composite index that reflects the development level of digital inclusive finance. From 2012 to 2018, the level of digital inclusive finance in China rose overall. To help accelerate this development, this paper focuses on regional differences in development level within China, in the hope that the government can take further measures to address weak points, narrow regional differences, and raise the overall level of digital inclusive finance [2].
However, existing research on digital inclusive finance concentrates on its development process, the measurement of indicators, and its impact on the economy, innovation, enterprises and so on. Few scholars have studied regional differences in digital inclusive finance, even though understanding them can contribute to its development. Most existing studies of provincial differences in China run a cluster analysis on the data of a single year. Such studies reflect the differences among provinces in that year, but they lack a time dimension and therefore do not capture differences in the overall development of digital inclusive finance over recent years. Multi-index panel data has three dimensions (time, sample and index) and can reflect the development of digital inclusive finance in China's provinces in recent years more fully. This paper therefore applies multi-index panel data clustering to digital inclusive finance.

Multi-index panel data structure
Panel data is a data type that combines cross-sectional data and time series data. It has two dimensions, time and cross-section, and is divided into single-index panel data and multi-index panel data. Single-index panel data is similar to cross-sectional data and can be represented by a two-dimensional table. Multi-index panel data has a more complex, three-dimensional structure, which can be represented on the plane as several two-dimensional tables. The representation of multi-index panel data is shown in Table 1.

Table 1. Multi-index panel data

Each observation is written $x_{i,j,n}$, for $i = 1, 2, 3, \dots, t$; $j = 1, 2, 3, \dots, k, \dots, m$; $n = 1, 2, 3, \dots, s$, where $i$, $j$ and $n$ represent time, sample and index respectively.
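The three-dimensional structure described above can be sketched as a NumPy array. This is only an illustration: the array name `panel`, the axis order (time, sample, index) and the random values are assumptions made for the example, not the real index data.

```python
import numpy as np

# Hypothetical dimensions: t time periods, m samples (provinces), s indices.
t, m, s = 7, 31, 3

# Synthetic placeholder values standing in for the real observations.
rng = np.random.default_rng(0)
panel = rng.uniform(0.0, 400.0, size=(t, m, s))  # panel[i, j, n]

# One two-dimensional table per period: rows are samples, columns are indices.
cross_section_first_period = panel[0]            # shape (m, s)

# One two-dimensional table per index: rows are periods, columns are samples.
# This is the single-index panel case.
single_index_panel = panel[:, :, 0]              # shape (t, m)

# The full trajectory of one sample: rows are periods, columns are indices.
one_sample = panel[:, 4, :]                      # shape (t, s)

print(cross_section_first_period.shape, single_index_panel.shape, one_sample.shape)
```

Fixing any one of the three axes gives one of the two-dimensional tables into which the panel can be unfolded.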

Digital characteristics of multi-index panel data
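(1) The mean of sample $k$ in time $i$ is

$$\bar{x}_{i,k} = \frac{1}{s} \sum_{n=1}^{s} x_{i,k,n}$$

The variance is

$$S_{i,k}^{2} = \frac{1}{s-1} \sum_{n=1}^{s} \left( x_{i,k,n} - \bar{x}_{i,k} \right)^{2}$$

where $x_{i,k,n}$ is the value of index $n$ for sample $k$ in time $i$ and $s$ is the number of indices.

Clustering method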

Determination of clustering number K value
Clustering is an unsupervised learning method, and the number of clusters has to be chosen by the analyst. Too many or too few clusters both degrade the clustering result. The optimal value of K can be determined by the elbow method. While the number of clusters is below the optimal number, increasing K reduces the SSE substantially; once it exceeds the optimal number, additional clusters still tighten the grouping, but the decrease in SSE drops off sharply and the curve gradually flattens. The sum of squared errors is

$$\mathrm{SSE} = \sum_{i=1}^{K} \sum_{p \in C_i} \lVert p - m_i \rVert^{2}$$

where $C_i$ is the $i$-th cluster, $p$ is a sample in $C_i$ and $m_i$ is the centroid of $C_i$. Figure 1 shows that the regional differences in digital inclusive finance are best divided into four categories.
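The elbow method can be illustrated with a short Python sketch. It assumes that the SSE curve comes from a basic k-means run, which the text does not state; the function name `kmeans_sse` and the synthetic four-group data are hypothetical and are not the index data behind Figure 1.

```python
import numpy as np

def kmeans_sse(points, k, n_iter=50, seed=0):
    """Run a basic Lloyd's k-means and return the final sum of squared errors."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its points; keep it if the cluster is empty.
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())

# Synthetic data: four well-separated groups of 10 points in 3 dimensions.
rng = np.random.default_rng(1)
blob_centers = np.array([[0, 0, 0], [10, 10, 10], [0, 10, 0], [10, 0, 10]], dtype=float)
data = np.vstack([c + rng.normal(scale=0.5, size=(10, 3)) for c in blob_centers])

# Best SSE over a few random restarts for each candidate K.
sse_by_k = {k: min(kmeans_sse(data, k, seed=s) for s in range(5)) for k in range(1, 9)}
for k, sse in sse_by_k.items():
    print(k, round(sse, 2))
```

Plotting `sse_by_k` against K and looking for the point where the curve stops falling steeply gives the elbow. On real data the bend is usually less clear-cut than on synthetic groups like these.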

Systematic clustering method
Clustering methods include K-means clustering, fuzzy clustering, dynamic clustering and clustering combined with intelligent algorithms [5]. Systematic (hierarchical) clustering is the most widely used cluster analysis method. It relies on two kinds of distance: the distance between points and the distance between classes. Distances between points include the Euclidean distance, the Chebyshev distance, the Minkowski distance and the absolute distance. Distances between classes include the shortest distance method, the longest distance method, the centroid method and the sum of squared deviations (Ward) method. Considering the characteristics and distribution of the data, the Euclidean distance is chosen to describe the distance between samples, and the Ward method is used to measure the distance between classes.
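(1) Euclidean distance:

$$d(x_i, x_j) = \sqrt{\sum_{k=1}^{p} \left( x_{ik} - x_{jk} \right)^{2}}$$

where $x_{ik}$ is the value of the $k$-th of $p$ variables for sample $i$.

(2) Ward method: suppose the $n$ samples are divided into $k$ classes $G_1, G_2, \dots, G_k$, where $x_{it}$ represents the $i$-th sample in class $G_t$, $n_t$ represents the number of samples in class $G_t$, and $\bar{x}_t$ is the center of gravity of $G_t$. The sum of squared deviations of $G_t$ is

$$S_t = \sum_{i=1}^{n_t} \left( x_{it} - \bar{x}_t \right)^{\mathsf{T}} \left( x_{it} - \bar{x}_t \right)$$

The total within-class sum of squared deviations is

$$S = \sum_{t=1}^{k} S_t = \sum_{t=1}^{k} \sum_{i=1}^{n_t} \left( x_{it} - \bar{x}_t \right)^{\mathsf{T}} \left( x_{it} - \bar{x}_t \right)$$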
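The two definitions above can be sketched in Python. This is a naive illustration, not the implementation used in this paper: at each step it merges the two classes whose union causes the smallest increase in the total within-class sum of squared deviations. The function names and the five toy points are hypothetical.

```python
import numpy as np

def euclidean(a, b):
    """Euclidean distance between two sample vectors."""
    return float(np.sqrt(((a - b) ** 2).sum()))

def within_ss(cluster):
    """Sum of squared deviations of a class's rows from its center of gravity."""
    return float(((cluster - cluster.mean(axis=0)) ** 2).sum())

def ward_clusters(points, n_clusters):
    """Agglomerative clustering that always makes the merge with the smallest
    increase in the total within-class sum of squared deviations."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                merged = clusters[a] + clusters[b]
                increase = (within_ss(points[merged])
                            - within_ss(points[clusters[a]])
                            - within_ss(points[clusters[b]]))
                if best is None or increase < best[0]:
                    best = (increase, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy data: two tight pairs and one isolated point.
points = np.array([[0, 0], [0, 1], [10, 10], [10, 11], [20, 0]], dtype=float)
result = ward_clusters(points, 3)
print(result)
```

The exhaustive pair search is only practical for small samples such as 31 provinces. In practice a library routine such as `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be preferred.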

Multi-index panel data clustering
With the addition of the time dimension, the distance between points becomes the distance between cross sections. The Euclidean distance is extended accordingly: the distance between two cross sections is the sum of the distances between their corresponding points.
VCED between sample and is , ,
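One possible reading of the extended distance described in this section is sketched below: for two samples, the Euclidean distance between their index vectors is computed in each period and then summed over all periods. This is an assumption made for illustration and is not claimed to reproduce the VCED formula. The function names and the array layout (time, sample, index) are hypothetical.

```python
import numpy as np

def panel_distance(panel, a, b):
    """Sum over time of the Euclidean distance between samples a and b."""
    diff = panel[:, a, :] - panel[:, b, :]           # shape (t, s)
    per_period = np.sqrt((diff ** 2).sum(axis=1))    # one distance per period
    return float(per_period.sum())

def panel_distance_matrix(panel):
    """Symmetric matrix of pairwise panel distances between all samples."""
    m = panel.shape[1]
    dist = np.zeros((m, m))
    for a in range(m):
        for b in range(a + 1, m):
            dist[a, b] = dist[b, a] = panel_distance(panel, a, b)
    return dist

# Toy panel: 2 periods, 3 samples, 2 indices.
panel = np.array([
    [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]],   # period 1
    [[0.0, 0.0], [6.0, 8.0], [1.0, 0.0]],   # period 2
])
dist = panel_distance_matrix(panel)
print(dist)
```

A distance matrix of this kind can be passed to a hierarchical clustering routine in place of the single-period distances.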

Clustering analysis
This paper uses the Peking University Digital Inclusive Finance Index data. The selected indicators are "breadth of digital financial coverage", "depth of use of digital finance" and "degree of digitalization of inclusive finance" [3]. These are the first-level indices calculated by the Institute of Digital Finance, Peking University, from 24 specific indicators. Based on the digital inclusive finance index data of each province from 2012 to 2018, the annual digital inclusive finance level of the 31 provinces is first clustered by the systematic clustering method. The seven years of data for each province are then considered together and clustered with the multi-index panel data clustering method. Python is used for the clustering of the cross-sectional data, and R is used for the multi-index panel data clustering. The clustering results are shown in Table 2 and Figure 2 below.

As shown in Figure 2, the digital inclusive finance of the 31 provinces can be divided into four categories. The first category, with a poor development level, consists of Tibet, Qinghai, Gansu, Guizhou, Xinjiang, Hebei, Jilin, Yunnan, Henan, Heilongjiang, Guangxi, Inner Mongolia, Ningxia, Shanxi and Shaanxi. The second category, with an average development level, consists of Hainan, Chongqing, Liaoning, Shandong, Anhui, Jiangxi, Hunan and Sichuan. The third category, with a good development level, consists of Tianjin, Hubei, Jiangsu, Fujian and Guangdong. The fourth category, with the best development level, consists of Beijing, Shanghai and Zhejiang. As can be seen from

Conclusion and suggestion
In this paper, the development level of digital inclusive finance in the 31 provinces is divided into four categories according to the breadth of digital financial coverage, the depth of digital financial use and the degree of digitalization of inclusive finance. The regional differences in digital inclusive finance in China are clearly large: few provinces are well developed, and most provinces, especially in the western region, have a low level of development. How to raise the overall level of digital inclusive finance and narrow regional differences, taking into account influencing factors such as the level of economic development, population density and Internet coverage in these four types of regions, is an important question for government, enterprises and academia. This paper hopes that the cluster analysis of development levels across the country will give each region a clear understanding of its own position, make the connections and shared characteristics of similar regions clearer, and draw wider attention to the less developed regions. At the same time, the government should work to address these weak points so as to speed up the development of regions where the level of digital inclusive finance is poor.