The Application of RBF Neural Network Optimized by K-means and Genetic-backpropagation in Fault Diagnosis of Power Transformer

Abstract. Dissolved gas analysis (DGA) of transformer oil can be used to diagnose power transformer faults. However, conventional DGA methods have low accuracy because they cannot fully capture the nonlinear relationship between the characteristic gases and the fault types. A radial basis function neural network (RBFNN) handles complex nonlinear problems well, so it can be applied to DGA-based transformer fault diagnosis. The centers, widths, and weights have an important effect on the performance of the RBFNN, but it is difficult to find the globally optimal values of these parameters during RBFNN training. This paper designs a method to improve these parameters: first, the K-means algorithm is used to optimize the centers and widths of the RBFNN; then the genetic algorithm-backpropagation (GA-BP) algorithm is used to optimize the weights. Finally, the K-means RBF-genetic backpropagation (KRBF-GBP) algorithm model is established from a large amount of training data. The test results show that the fault diagnosis accuracy of the KRBF-GBP algorithm is 96.4%, higher than the 71.43% of the unoptimized RBFNN.


Introduction
Power transformers are an essential part of the power system [1]. Dissolved gas analysis (DGA) diagnoses transformer faults by analyzing the composition and content of the gases dissolved in transformer oil. Compared with conventional outage maintenance of power equipment, the DGA method does not damage the insulation of the equipment, and it is simple and convenient [2]. It is therefore one of the most effective methods for judging faults in oil-immersed power transformers [3].
The DGA method judges the fault type from the known dissolved gas content. However, the relationship between dissolved gas content and fault type is complex and nonlinear, and the DGA method does not fully reflect it [4]. Therefore, the accuracy of fault diagnosis using the DGA method is low. For example, the diagnostic accuracy of the key gas method, one of the DGA methods, is only 43.34% [5]. The radial basis function neural network (RBFNN) is an excellent feed-forward network [6] with strong advantages in processing complex nonlinear mapping problems: the hidden layer transforms the input vector from a low-dimensional space into a high-dimensional space, so that a problem that is linearly inseparable in the low-dimensional space becomes linearly separable in the high-dimensional space [7]. It can therefore classify faults well and can improve accuracy when applied to DGA-based transformer fault diagnosis.
The centers, widths, and weights are the main parameters of the RBFNN, and they have an important effect on its performance. However, it is difficult to find their globally optimal values [7]. Therefore, how to find the optimal values of these parameters, or at least improve them, is a worthwhile research topic.
This paper proposes a strategy for optimizing the parameters of the RBFNN based on DGA data: the K-means algorithm and the genetic algorithm-backpropagation (GA-BP) algorithm are used to optimize the parameters of the RBFNN, and the K-means RBF-genetic backpropagation (KRBF-GBP) algorithm model is then established. The test results show that this model achieves a high fault diagnosis rate for transformers and has great application value.

Normal value of characteristic gases
Comparing the measured concentrations with the normal values of the dissolved gases in oil can determine the presence or absence of a fault. If any of the hydrocarbon gases exceeds the content listed in Table 1, the transformer is considered to be running abnormally [8].

Duval triangle method
The Duval triangle method was proposed by Michel Duval [9], and it is recommended in the latest IEEE and China National Standard guides [10]. The method is based on three gases, CH4, C2H4, and C2H2, and their location in a triangular map [11]. To plot a point in the triangle, the gas concentrations are transformed into triangular coordinates [12]. The seven detectable fault types are thermal fault (t < 300℃) (T1), thermal fault (300℃ < t < 700℃) (T2), thermal fault (t > 700℃) (T3), low energy discharge (D1), high energy discharge (D2), partial discharge (PD), and a mixed region (D+T).
The first step in applying the Duval triangle method is to calculate the percentage of each of the three gases and then find the corresponding fault region in Fig. 1.
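The percentage calculation in the first step can be sketched in a few lines of Python. The function name and the sample concentrations are hypothetical; the fault-region boundaries of Fig. 1 are not reproduced here, so the sketch stops at the triangular coordinates.

```python
def duval_percentages(ch4_ppm, c2h4_ppm, c2h2_ppm):
    """Return (%CH4, %C2H4, %C2H2), the relative shares used as triangular coordinates."""
    total = ch4_ppm + c2h4_ppm + c2h2_ppm
    if total <= 0:
        raise ValueError("at least one of the three gases must be present")
    return tuple(100.0 * gas / total for gas in (ch4_ppm, c2h4_ppm, c2h2_ppm))


# Hypothetical sample: 50 ppm CH4, 30 ppm C2H4, 20 ppm C2H2.
pct_ch4, pct_c2h4, pct_c2h2 = duval_percentages(50.0, 30.0, 20.0)
print(pct_ch4, pct_c2h4, pct_c2h2)
```

The three percentages always sum to 100, which is what allows a sample to be placed as a single point inside the triangle.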

Rogers ratios method
This method uses three gas ratios involving five gas concentrations: C2H2/C2H4, CH4/H2, and C2H4/C2H6. The range of each gas ratio suggesting a particular fault type is given in Table 3 [14].
Table 3. Rogers ratios method [13]

RBF neural network and K-means

RBF neural network
The RBFNN is an efficient feed-forward neural network with best-approximation and global-optimum properties, and it is simple in structure and fast to train. Therefore, the RBFNN model is widely used in pattern recognition, nonlinear function approximation, and other fields [15]. The RBFNN consists of an input layer, a hidden layer, and an output layer. Its structure is shown in Fig. 2.
Fig. 2. Structure of the RBFNN [16]
In this paper, the Gaussian function is selected as the function of the hidden layer; it is calculated as shown in formula (1).
Here, $\varphi_i$ is the output of the hidden layer, $x$ is the input value of the RBFNN, $x_c$ are the centers of the Gaussian function, and $\sigma$ are the widths of the Gaussian function.
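For reference, a common form of the Gaussian basis function that is consistent with these symbols is sketched below. The factor of 2 in the denominator and the per-node index $i$ on the centers and widths are assumptions, since conventions vary between authors.

```latex
\varphi_i(x) = \exp\!\left( -\frac{\lVert x - x_{c,i} \rVert^{2}}{2\,\sigma_i^{2}} \right)
```

Under this form, $\varphi_i$ equals 1 when the input coincides with the center $x_{c,i}$ and decays toward 0 as the input moves away, at a rate set by the width $\sigma_i$.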
The output of the RBFNN is given by formula (2):
$$y = \sum_{i} w_i \varphi_i \qquad (2)$$
where $y$ is the output value of the RBFNN and $w_i$ are the weights between the hidden layer and the output layer. That is, the output value of the RBFNN is the sum of the products of $\varphi_i$ and $w_i$. The parameters $x_c$, $\sigma$, and $w$ are important factors in the performance of the RBFNN.
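A minimal Python sketch of the two layers described above may make the roles of the centers, widths, and weights concrete. It assumes the common $2\sigma^2$ normalization of the Gaussian function, and all numeric values are hypothetical.

```python
import math


def rbf_hidden(x, centers, widths):
    """Gaussian hidden-layer outputs phi_i for one input vector x."""
    outputs = []
    for center, sigma in zip(centers, widths):
        sq_dist = sum((xj - cj) ** 2 for xj, cj in zip(x, center))
        # Assumed convention: exp(-||x - c||^2 / (2 * sigma^2)).
        outputs.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return outputs


def rbf_output(x, centers, widths, weights):
    """Network output: the sum over hidden nodes of w_i * phi_i."""
    phi = rbf_hidden(x, centers, widths)
    return sum(w * p for w, p in zip(weights, phi))


# Hypothetical two-node network with a two-dimensional input.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [0.5, -0.5]
phi = rbf_hidden([0.0, 0.0], centers, widths)
y = rbf_output([0.0, 0.0], centers, widths, weights)
```

Because the input sits exactly on the first center, the first hidden node responds with 1 and the second with a smaller value that depends only on its distance and width.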

K-means clustering algorithm (KMCA)
The K-means algorithm is an unsupervised learning algorithm that is often used in cluster analysis. Because of its simple steps and high efficiency, it is widely used in data mining and pattern recognition tasks [17]. Its principle is as follows. For the input sample set, data points are randomly generated as the initial centers. The Euclidean distance between each sample point and each center is calculated, and each sample point is assigned to the cluster of its nearest center. The centers are then recalculated, and the assignment of the sample points is updated iteratively until the centers no longer change.
In the selection of the RBFNN centers, the K-means algorithm can be used to adjust the cluster centers so that the network centers are chosen more accurately [18].
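The iteration described above can be sketched as follows. This is an illustrative implementation rather than the code used in the paper: it assumes that the initial centers are drawn from the samples themselves, and the sample points are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance; it ranks neighbours the same way as the distance itself."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(samples, k, max_iter=100, seed=0):
    """Return (centers, labels) after iterating until the centers stop changing."""
    rng = random.Random(seed)
    # Assumption: k distinct samples serve as the random initial centers.
    centers = [list(p) for p in rng.sample(samples, k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each sample to its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                  for p in samples]
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(samples, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # centers no longer change
            break
        centers = new_centers
    return centers, labels


# Hypothetical data: two well-separated groups of three points each.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, labels = kmeans(samples, k=2)
```

In the RBFNN setting, the returned centers would play the role of the hidden-layer centers, with one hidden node per cluster.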

BP neural network and genetic algorithm

BP neural network
The BPNN is a multi-layer network that generally consists of an input layer, one or more hidden layers, and an output layer. The BPNN is trained by propagating the error between the expected output and the actual output backward to adjust the weights and thresholds [19]. Forward propagation of signals followed by back-propagation of errors constitutes one complete learning process of the BPNN [20]. Fig. 3 shows a three-layer BPNN structure with one hidden layer.
In Fig. 3, there are $n$ neurons in the input layer, $h$ neurons in the hidden layer, and $m$ neurons in the output layer. $w_{ij}$ is the connection weight between the $j$th neuron of the input layer, $x_j$, and the $i$th neuron of the hidden layer, and $b_i$ is the threshold of the $i$th neuron of the hidden layer. $w_{ki}$ is the connection weight between the $i$th neuron of the hidden layer and the $k$th neuron of the output layer, $y_k$, and $a_k$ is the threshold of $y_k$. In addition, the hidden layer and the output layer each have their own activation function [21].
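The forward propagation through such a three-layer network can be sketched as below. The sketch assumes that the thresholds enter as additive biases and uses a hyperbolic tangent in the hidden layer and an identity function in the output layer purely as placeholder activations; all numeric values are hypothetical.

```python
import math


def bpnn_forward(x, w_hidden, b_hidden, w_out, a_out,
                 hidden_activation=math.tanh, output_activation=lambda v: v):
    """Forward pass of a three-layer BPNN.

    w_hidden[i][j] connects input j to hidden neuron i (threshold b_hidden[i]);
    w_out[k][i] connects hidden neuron i to output k (threshold a_out[k]).
    """
    hidden = [hidden_activation(sum(w * xj for w, xj in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [output_activation(sum(w * hi for w, hi in zip(row, hidden)) + a)
            for row, a in zip(w_out, a_out)]


# Hypothetical network with n = 2 inputs, h = 2 hidden neurons, m = 1 output.
outputs = bpnn_forward(
    x=[1.0, 0.0],
    w_hidden=[[1.0, 0.0], [0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[[2.0, 3.0]],
    a_out=[0.5],
)
```

Back-propagation then compares these outputs with the expected outputs and adjusts the weights and thresholds in the direction that reduces the error.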

Genetic algorithm (GA)
The genetic algorithm (GA) is a random optimization search algorithm that draws on the principles of natural selection and genetics in biological populations. Its main characteristics are a population-based search strategy and the exchange of information among the individuals in the population; the search does not depend on gradient information [22].
Unlike traditional search algorithms, the GA is based on a fitness function and searches iteratively by applying genetic operations to all individuals in the population, thereby recombining the structure of the individuals. Selection, crossover, and mutation are the three main genetic operations of the GA. Parameter coding, initial population setting, fitness function design, genetic operation design, and control parameter setting constitute the core content of the GA [22].
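The three genetic operations can be illustrated with a small binary-coded GA. The specific operator choices here (binary tournament selection, single-point crossover, bit-flip mutation, and keeping the best individual of each generation) are assumptions made for the sketch, as are all parameter values; the toy fitness function simply counts the ones in a bit string.

```python
import random


def select(population, fitness, rng):
    """Binary tournament: the fitter of two randomly drawn individuals is chosen."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent joined to the tail of the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(individual, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [1 - bit if rng.random() < p_mut else bit for bit in individual]


def evolve(fitness, n_bits, pop_size, generations, p_cross, p_mut, seed=0):
    """Maximise `fitness` over bit strings and return the best individual found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(population, key=fitness)]  # keep the current best unchanged
        while len(children) < pop_size:
            child = select(population, fitness, rng)
            if rng.random() < p_cross:
                child = crossover(child, select(population, fitness, rng), rng)
            children.append(mutate(child, p_mut, rng))
        population = children
    return max(population, key=fitness)


# Toy problem: maximise the number of ones in a 16-bit string.
best = evolve(fitness=sum, n_bits=16, pop_size=20, generations=30,
              p_cross=0.6, p_mut=0.05)
```

Note that the loop only ever evaluates the fitness function; no gradient is needed, which is the property the text above highlights.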

GA-BP algorithm
The BP algorithm is very sensitive to the initial weights and thresholds: different initial values may lead to different results, and training easily falls into local minima [24]. The weak global search ability of the BPNN can therefore be compensated by the superior global search ability of the GA [25]. The algorithm that optimizes a BPNN by GA is called the GA-BP algorithm. The initial weights and thresholds of the BPNN are optimized by the GA to improve the convergence speed of learning, so that the optimized BPNN can better predict the output of the function [26].
The GA-BP algorithm proceeds as follows.
Step 1. Input the sample data and determine the structure of the BPNN (the numbers of nodes in the input layer, the hidden layer, and the output layer).
Step 2. Use the GA (fitness evaluation, selection, crossover, mutation) to optimize the weights and thresholds of the BPNN and obtain the optimal individual.
Step 3. Take the optimal individual as the initial weights and thresholds of the BPNN, and then train the network with the BP algorithm.
Step 4. Compute the error between the actual outputs and the expected outputs of the BP network, and determine whether the error meets the accuracy requirements. If it does, the algorithm ends; if not, BP training continues.
Fig. 4. The operation process of the GA-BP algorithm [27]

Methodology

Research design
This paper uses both quantitative and qualitative research methods.
The quantitative method is used in three aspects: the DGA data analysis; the parameter setting of the RBFNN, BPNN, and GA; and the calculation of the accuracy of the improved RBFNN algorithm.
The qualitative method is used in the classification of transformer fault types (T1, T2, T3, D1, D2, PD). Several classification methods are used in this paper: the Duval triangle method, the Rogers ratios method, and the newly established KRBF-GBP algorithm.

Data collection
The research data are collected from journal papers and dissertations related to the research. A total of 364 sets of DGA data (31 sets for T1, 54 sets for T2, 125 sets for T3, 54 sets for D1, 80 sets for D2, 20 sets for PD) are used as the training set, and a total of 110 sets (15 sets for T1, 18 sets for T2, 31 sets for T3, 14 sets for D1, 20 sets for D2, 12 sets for PD) are used as the testing set.

Data analysis
Step 1. Before network training and testing, the collected data must be verified. With reference to Table 1, if all gas concentrations in a data set are below the normal limits, the set is deleted. If the concentration of any gas is higher than the normal limit, the set can be used for Step 2.
Step 2. This research uses the Duval triangle method [13], the Rogers ratios method [13], and the actual faults reported in the journals to verify the DGA data. This gives three diagnostic results; if two or three of them are the same, the transformer is considered to have that type of fault.
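The two-out-of-three rule in Step 2 amounts to a majority vote, which can be sketched as below. The function name is hypothetical, and returning `None` when all three results differ is an assumption, since the text does not say how such samples are handled.

```python
from collections import Counter


def verified_fault_type(duval_result, rogers_result, reported_fault):
    """Return the fault type shared by at least two of the three diagnoses, else None."""
    label, votes = Counter([duval_result, rogers_result, reported_fault]).most_common(1)[0]
    return label if votes >= 2 else None


agreed = verified_fault_type("T3", "T3", "T2")       # two of three agree on T3
unresolved = verified_fault_type("T1", "D1", "PD")   # no two results agree
```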

The design of the improved RBFNN structure
In terms of network structure, the hidden layer and the output layer of the RBFNN are connected by weights, and the BPNN also uses weighted connections [23]. Therefore, the problem of optimizing the weights between the hidden layer and the output layer of the RBFNN can be solved by optimizing the parameters (weights and thresholds) of a BPNN model.
The transfer process is shown by the dotted lines in Fig. 5: the network between the hidden layer and the output layer of the RBFNN is changed into a three-layer BPNN. The structure drawn in Fig. 5 is chosen arbitrarily and serves only to make the explanation easier.
In Fig. 5, the hidden layer and the output layer of the RBFNN are enclosed by the first dotted line, and the BPNN is enclosed by the second dotted line.
As described in the sub-section 'GA-BP algorithm', the GA optimizes the initial weights and thresholds of the BPNN. Therefore, the weights of the RBFNN are optimized by the GA-BP algorithm.

Training and testing of the RBF algorithm
The process of RBFNN training and testing is shown in Fig. 6.
Fig. 6. The process of RBFNN training and testing
Step 1. Set the parameters of the RBFNN (inputs, outputs, and nodes of the RBFNN).
Step 2. Normalize the input sets.
Step 3. Run the K-means clustering algorithm (KMCA) to get the centers of the RBFNN, and calculate the widths of the RBFNN.
Step 4. Use the GA-BP algorithm to optimize the weights and thresholds of the BPNN (choice of parameters of BPNN; setting parameters of GA; choice of epochs of BPNN).
Step 5. Adjust the KRBF-GBP algorithm model.
Step 6. Input the test set, test the KRBF-GBP algorithm, and calculate the accuracy. Finally, compare the accuracy of the KRBF-GBP algorithm with that of the unimproved RBFNN.
The following sub-sections describe the process of RBFNN training and testing in detail.

Setting inputs, outputs, and nodes of RBFNN
Table. Fault types and codes of expected output
The numbers of nodes in the input layer, the hidden layer, and the output layer of the three-layer forward neural network are $n_i$, $n_{h1}$, and $n_o$, respectively. In the design of hidden layer nodes, many authors adopt the following formula [28].

Normalization of the input sets
The DGA data for fault diagnosis come from transformers with different capacities and voltage levels, so the gas volumes for the same fault type differ between samples. To keep the input vectors comparable, the input gas components need to be normalized, mapping each input vector into the interval from 0 to 1 [29].
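One way to carry out such a mapping is min-max scaling of each gas (each column of the data) across all samples. The exact normalization formula is not given here, so this choice is an assumption, and the concentrations in the example are hypothetical.

```python
def min_max_normalize(samples):
    """Scale every feature (column) of `samples` to the interval [0, 1]."""
    columns = list(zip(*samples))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        # A constant column carries no information and is mapped to 0.
        [(value - lo) / (hi - lo) if hi > lo else 0.0
         for value, lo, hi in zip(row, lows, highs)]
        for row in samples
    ]


# Hypothetical samples with two gas concentrations each (ppm).
raw = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
normalized = min_max_normalize(raw)
```

After scaling, a gas that is abundant in absolute terms no longer dominates the distance calculations used by the K-means step.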

Running the K-means clustering algorithm (KMCA)
The algorithm is implemented in code. The K-means algorithm is run in MATLAB 2016a to obtain the centers of the hidden layer of the RBFNN and to calculate the widths of the hidden layer of the RBFNN. The outputs of the hidden layer of the RBFNN are then calculated using the Gaussian function.

Choice of parameters of BPNN
The inputs of the BPNN in this research are the outputs of the hidden layer of the RBFNN. The tan-sigmoid transfer function is selected for the hidden layer of the BPNN, and the pure linear transfer function is selected for the output layer of the BPNN. The Levenberg-Marquardt algorithm is selected as the training algorithm for the weights because it combines the advantages of the gradient method and the Newton method.
As stated in the sub-section 'Setting inputs, outputs, and nodes of RBFNN', the hidden layer of the RBFNN has $n_{h1}$ output nodes, so the BPNN has $n_{h1}$ input nodes. The BPNN has 6 output nodes (T1, T2, T3, D1, D2, PD) and $n_{h2}$ hidden nodes. According to formulas (3) and (4), $n_{h2}$ is assumed to be 9.

Setting parameters of GA
At present, GA parameters are selected on the basis of experience and simulation tests on the specific problem [30]. The GA parameters in this paper are likewise chosen on the basis of previous simulation tests. The population size is set to 20, the number of generations is set to 15, the crossover probability is set to 0.6, and the mutation probability is set to 0.05.

Choice of epochs of BPNN
Generally speaking, the smaller the allowable error of the neural network, the higher the fitting degree ($R$) and the higher the prediction accuracy (measured in this paper by the coefficient of determination, $R^2$). However, practical application shows that the prediction error first decreases as the fitting error (measured in this paper by the mean square error, MSE) decreases, but once the fitting error falls to a certain value, the prediction error increases. This indicates that the generalization ability of the network is reduced and over-fitting occurs [31]. Under-fitting refers to a low fitting degree of the model; its performance is usually poor on both the training set and the test set. Adjusting the number of epochs of the BPNN can effectively prevent both over-fitting and under-fitting.
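The two error measures mentioned above can be computed as in the following sketch. It uses the standard textbook definitions of MSE and $R^2$, which is an assumption because the exact formulas are not spelled out here; the target and prediction values are hypothetical.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def r_squared(targets, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and network predictions.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]
mse_value = mean_squared_error(targets, predictions)
r2_value = r_squared(targets, predictions)
```

Recording both values on the training data and on held-out data after each candidate number of epochs is what makes the over-fitting pattern described above visible.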
The GA-BP code is run in MATLAB 2016a for computer simulation, the variation of MSE, $R$, and $R^2$ with the number of epochs is recorded, and the best number of epochs for the network is found. Table 5 shows that when the number of epochs is 35, the prediction accuracy is 94.06% and the fitting degree is 93.06%. This indicates that the network is neither over-fitted nor under-fitted and that the accuracy meets the requirements. Meanwhile, the MSE is 0.0186. The number of epochs of the BPNN is therefore set to 35.
Running the GA-BP algorithm with all of these parameters set yields the KRBF-GBP algorithm model.
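The overall GA-BP procedure with the settings stated above (population 20, 15 generations, crossover probability 0.6, mutation probability 0.05, 35 epochs, tan-sigmoid hidden layer, linear output layer) can be sketched in Python as follows. This is not the MATLAB code used in the paper. Plain gradient descent is swapped in for the Levenberg-Marquardt algorithm to keep the sketch short, and the tournament selection, single-point crossover, Gaussian mutation, learning rate, weight range, best-epoch checkpoint, network size, and toy data are all assumptions.

```python
import math
import random


def unpack(genes, shape):
    """Split a flat chromosome into (w1, b1, w2, b2) for an n_in-n_hid-n_out network."""
    n_in, n_hid, n_out = shape
    it = iter(genes)
    w1 = [[next(it) for _ in range(n_in)] for _ in range(n_hid)]
    b1 = [next(it) for _ in range(n_hid)]
    w2 = [[next(it) for _ in range(n_hid)] for _ in range(n_out)]
    b2 = [next(it) for _ in range(n_out)]
    return w1, b1, w2, b2


def flatten(w1, b1, w2, b2):
    return [w for row in w1 for w in row] + list(b1) + [w for row in w2 for w in row] + list(b2)


def forward(params, x):
    w1, b1, w2, b2 = params
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(w2, b2)]  # linear output
    return h, y


def mse(genes, data, shape):
    params = unpack(genes, shape)
    errors = [(yk - tk) ** 2 for x, t in data for yk, tk in zip(forward(params, x)[1], t)]
    return sum(errors) / len(errors)


def ga_search(data, shape, pop_size=20, generations=15, p_cross=0.6, p_mut=0.05, seed=0):
    """GA stage: search for good initial weights and thresholds (lower MSE is fitter)."""
    rng = random.Random(seed)
    n_in, n_hid, n_out = shape
    n_genes = n_hid * (n_in + 1) + n_out * (n_hid + 1)
    cost = lambda genes: mse(genes, data, shape)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(pop, key=cost)]  # keep the best individual
        while len(children) < pop_size:
            p1 = min(rng.sample(pop, 2), key=cost)  # tournament selection
            p2 = min(rng.sample(pop, 2), key=cost)
            cut = rng.randrange(1, n_genes)
            child = p1[:cut] + p2[cut:] if rng.random() < p_cross else p1[:]
            children.append([g + rng.gauss(0.0, 0.3) if rng.random() < p_mut else g
                             for g in child])
        pop = children
    return min(pop, key=cost)


def bp_refine(genes, data, shape, epochs=35, lr=0.05):
    """BP stage: gradient descent from the GA result; returns the best epoch's parameters."""
    n_in, n_hid, n_out = shape
    w1, b1, w2, b2 = unpack(genes, shape)
    best, best_err = list(genes), mse(genes, data, shape)
    for _ in range(epochs):
        for x, t in data:
            h, y = forward((w1, b1, w2, b2), x)
            d_out = [yk - tk for yk, tk in zip(y, t)]
            d_hid = [(1.0 - h[i] ** 2) * sum(d_out[k] * w2[k][i] for k in range(n_out))
                     for i in range(n_hid)]
            for k in range(n_out):
                b2[k] -= lr * d_out[k]
                for i in range(n_hid):
                    w2[k][i] -= lr * d_out[k] * h[i]
            for i in range(n_hid):
                b1[i] -= lr * d_hid[i]
                for j in range(n_in):
                    w1[i][j] -= lr * d_hid[i] * x[j]
        candidate = flatten(w1, b1, w2, b2)
        err = mse(candidate, data, shape)
        if err < best_err:
            best, best_err = candidate, err
    return best


# Hypothetical toy data: two inputs, two one-hot coded classes.
data = [([0.1, 0.9], [1.0, 0.0]), ([0.2, 0.8], [1.0, 0.0]),
        ([0.9, 0.1], [0.0, 1.0]), ([0.8, 0.3], [0.0, 1.0])]
shape = (2, 3, 2)
seed_genes = ga_search(data, shape)
trained = bp_refine(seed_genes, data, shape)
```

In the KRBF-GBP setting, the inputs `x` would be the Gaussian hidden-layer outputs of the RBFNN rather than raw gas data, and the network would have 9 hidden nodes and 6 outputs instead of the toy shape used here.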

The adjustment of the KRBF-GBP algorithm
The numbers of hidden-layer nodes in the RBFNN and in the BPNN, and the relationship between them, have a great influence on the performance of the KRBF-GBP model.
Each case is run three times and the results are averaged, as shown in Table 6.

The testing of KRBF-GBP algorithm
All data in the test set are tested, and the results are shown in Table 7. From Table 7, it can be seen that all 15 sets of fault T1 data are diagnosed correctly, so the accuracy is 100%. For fault T2, 16 sets are correct and 2 sets are misdiagnosed as T1 and PD, so the accuracy is 88.89%. For fault T3, 30 sets are correct and 1 set is misdiagnosed as T1, so the accuracy is 96.77%. For fault D1, 13 sets are correct and 1 set is misdiagnosed as D2, so the accuracy is 92.86%. All 20 sets of fault D2 data are correct, so the accuracy is 100%. All 12 sets of fault PD data are correct, so the accuracy is 100%. Finally, the accuracy of the KRBF-GBP algorithm for transformer fault diagnosis is 96.4%, calculated as the mean of the per-class accuracies, (100% + 88.89% + 96.77% + 92.86% + 100% + 100%)/6. By contrast, the prediction accuracy for power transformer faults of an RBFNN without parameter optimization is only 71.43% [32]. The KRBF-GBP algorithm therefore improves the accuracy of power transformer fault diagnosis.
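The arithmetic behind these figures can be checked with a few lines of Python using the counts reported above. The sketch also computes the overall fraction of correctly diagnosed samples, which is a different measure from the per-class mean but rounds to the same 96.4% here.

```python
# Correctly diagnosed sets and total test sets per fault type, as reported above.
correct = {"T1": 15, "T2": 16, "T3": 30, "D1": 13, "D2": 20, "PD": 12}
total = {"T1": 15, "T2": 18, "T3": 31, "D1": 14, "D2": 20, "PD": 12}

per_class = {fault: 100.0 * correct[fault] / total[fault] for fault in total}
mean_class_accuracy = sum(per_class.values()) / len(per_class)
overall_accuracy = 100.0 * sum(correct.values()) / sum(total.values())

for fault, accuracy in per_class.items():
    print(f"{fault}: {accuracy:.2f}%")
print(f"mean of per-class accuracies: {mean_class_accuracy:.2f}%")
print(f"overall accuracy: {overall_accuracy:.2f}%")
```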

Conclusion
An optimization scheme for the RBFNN parameters is proposed, and the KRBF-GBP algorithm model is established. The test results show that the KRBF-GBP algorithm improves the accuracy of transformer fault diagnosis. This indicates that optimizing the parameters of the RBFNN with the K-means and GA-BP algorithms is feasible, and that the KRBF-GBP algorithm is suitable for transformer fault diagnosis and has good application prospects.

Future scope
This research has limitations. Many adjustable variables are involved in training the neural network model, such as the number of hidden-layer nodes of the RBFNN, the number of iterations of the BPNN, and the population size of the GA. There is no well-established method for choosing them, and the most widely used approach is trial and error. In this research, many adjustable variables are first set by assumption and then refined by trial and error. Theoretically, therefore, the RBFNN trained by K-means clustering and the GA-BP algorithm is unlikely to achieve the best possible training effect, and a better model may still exist. It is hoped that this problem can be solved in future research.
Facilities and support provided by SEGi University are gratefully acknowledged.