Research on Recognition of Electricity Fittings in High Voltage Transmission Lines

Based on an analysis of the structure and image features of common power fittings on high-voltage transmission lines, and using a deep neural network (DNN), we propose a model that allows high-voltage transmission line inspection robots to identify the types of power fittings on the lines. We also design a fast ROI generation method suited to recognizing fittings on power transmission lines, and we verify the feasibility and rationality of the fitting identification model.


Introduction
High-voltage transmission line inspection robots usually move back and forth along the wires by means of a mechanical structure. With machine vision and fault diagnosis technologies, they can inspect the transmission lines independently. During inspection, a robot is bound to encounter many electric fittings and must pass over them. For different types of fittings, the robot needs to adopt different crossing strategies. Research on techniques for identifying transmission line fittings therefore has important engineering value.
At present, recognizing transmission line fittings with a camera and a machine learning model identifies complex fittings much more accurately than earlier methods based on geometric constraints or multi-sensor perception [1]. Many scholars have carried out related studies. Zhan C [2] proposed an obstacle recognition method based on the Adaboost algorithm, which improved the credibility of classification by training strong classifiers and classified multiple obstacle categories by combining strong and weak classifiers. Li Y [3] combined Histogram of Oriented Gradients (HOG) features with an SVM classifier to recognize and locate insulators. Liu H [4] proposed a two-step classification model: Haar-like features combined with an Adaboost cascade classifier were first used for pre-classification, and HOG features combined with an SVM were then used for a secondary classification to recognize transmission line fittings. Zhang F et al. [5] recognized transmission line fittings through monocular vision combined with the fusion of multiple sensors.
Earlier researchers often used binary classifiers, which only distinguish the target from the background.
Qi Y [6] proposed a detection method for line fittings in aerial inspection images of transmission lines based on an improved SSD model, which solved the problem of frequent missed detections caused by fittings occupying too small a proportion of the image. Li H [7] proposed a model based on an optimized ResNet and recognized line fittings by training on a data set collected by UAV. Fan H et al. [8] evaluated the Faster R-CNN and YOLOv3 models for corrosion detection on electric power equipment through data and experiments, and concluded that YOLOv3 offered better real-time performance than Faster R-CNN. Among recent recognition methods, neural networks perform relatively well. Compared with conventional neural networks, Faster R-CNN and YOLO are faster in recognition. However, these two models usually need a large amount of training data to achieve good results. Thus, according to the specific needs of a particular patrol robot, we chose to recognize fittings using HOG features with a trained DNN, which can recognize fittings quickly without an extremely large amount of image data.

Features of fittings
There are many kinds of power fittings on high-voltage transmission lines. The common ones include insulator chains, strain clamps, dampers, suspension clamps and spacers, as shown in Fig. 1. The structures of these fittings differ markedly. Most power fittings are silver-white metal, and a few are made of non-metallic materials. In insulator chains, for example, synthetic insulators are red rubber and porcelain insulators are milky-white ceramic.
Fittings are recognized in the transmission line environment, where the background is usually the sky and part of the natural surroundings (mountains, rivers, etc.). The sky usually takes up more than half of the image (Figure 2). The robot hangs on the transmission wires by both arms during inspection. Placing the camera at the front of the patrol robot with an elevation angle of 45° increases the proportion of sky in the captured images, which benefits the detection and recognition of the power fittings to a certain extent.

Technical route
In image recognition, an image is input to the recognition model and segmented by a preset algorithm to locate the object to be recognized in the image. After preprocessing, the image region containing the target is passed to the classifier, which gives a prediction for the target. The collected images are preprocessed into a sample set and imported into the designed classifier for training to obtain a classifier model. An ROI (Region of Interest) generation algorithm is then applied to the input image, giving a complete model from image input to classification.

Image feature operator
HOG (Histogram of Oriented Gradients) is an image feature operator commonly used in computer vision. HOG features combined with an SVM classifier have been widely used in pedestrian recognition. The method of pedestrian detection using HOG combined with SVM was proposed by Dalal and Triggs at CVPR 2005 [9]. Because pedestrians usually appear in images as an upright shape resembling the Arabic digit 1, it is easier to find them by looking for this specific gradient feature.
A comparison of the shapes of several common fittings shows that most of them have obvious edge contours, especially straight-line and corner contours. The HOG feature operator is therefore an appropriate choice. For the various straight-line contours in power fittings, gradients along multiple directions can be detected and calculated by dividing the gradient orientations into several HOG bins. A specific combination of these gradients can then represent the characteristic features of a given fitting.
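The binning of gradient orientations described above can be illustrated with a minimal sketch. It assumes central-difference gradients, signed orientations over 360°, and magnitude-weighted votes; the function name `cell_histogram` and the toy cells are illustrative only and are not taken from the paper's implementation.

```python
import math

def cell_histogram(cell, n_bins=10):
    """Magnitude-weighted orientation histogram of one grayscale cell.

    `cell` is a list of rows of gray values. Border pixels are skipped
    because central differences need both neighbours (an assumption).
    """
    hist = [0.0] * n_bins
    bin_width = 360.0 / n_bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[int(angle // bin_width) % n_bins] += mag
    return hist

# Toy 5x5 cells: a vertical edge and a horizontal edge.
vertical_edge = [[0, 0, 255, 255, 255] for _ in range(5)]
horizontal_edge = [[0] * 5, [0] * 5, [255] * 5, [255] * 5, [255] * 5]

h_vert = cell_histogram(vertical_edge)
h_horiz = cell_histogram(horizontal_edge)
print(h_vert.index(max(h_vert)), h_horiz.index(max(h_horiz)))
```

Straight edges in different directions vote into different bins, which is why a combination of bins can characterize the straight-line contours of a fitting.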

Machine learning model
Common machine learning models include the traditional SVM (Support Vector Machine) and random forests. In recent years, convolutional neural networks (CNNs) have performed excellently at generating adaptive features to recognize complex targets, and they are now the mainstream image recognition method.
CNN recognition requires a lot of computing power. In the recognition of transmission line fittings, its real-time efficiency is lower than that of traditional image features combined with an SVM. A CNN also requires a large amount of image data for effective training, and is usually trained by transfer learning to reduce overfitting.
On high-voltage transmission lines, the fittings are often of the same specification, so we do not need to recognize a wide range of different targets. The number of photos of specific fittings on a given section of the transmission line is generally small. Therefore, the HOG feature operator combined with a deep neural network (DNN) can be used to establish the image recognition model for power fittings.
Compared with a traditional SVM, a DNN [10] performs multi-class classification directly through its Softmax layer and does not need a cascade of multiple SVM classifiers. In a DNN, overfitting can also be reduced by randomly dropping nodes with the dropout algorithm. Compared with the hyperplane in an SVM, the weights in a neural network can describe the relationship between multiple target categories more accurately.
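As a small illustration of how a Softmax layer yields a multi-class decision in one step, the sketch below converts three raw output scores into class probabilities. The scores are made-up values, and the three-class setup is only an assumption chosen to mirror the three fitting categories used later in the experiment.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (max-shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output scores for three classes.
probs = softmax([2.0, 0.5, -1.0])
predicted = probs.index(max(probs))
print(predicted, [round(p, 3) for p in probs])
```

A single forward pass gives a probability for every class, whereas a set of binary SVMs would have to be combined to reach the same decision.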

ROI generation solution
Image recognition involves both object classification and target detection. When a whole picture is input to the recognition model, the classification model can only be used correctly if the position of the object in the image has been marked.
The traditional ROI generation method is the sliding window: one or more windows of fixed size are set on the input image and slid along its horizontal and vertical axes, and the pixels inside each window position are read and passed to the classifier one by one. Because of the huge amount of data, this search method has low recognition efficiency. Methods based on region proposals improve the situation, and selective search is the most representative of them. The image is first divided into many small areas by a simple region partitioning algorithm, and the number of candidate boxes is then reduced by repeatedly merging adjacent small areas according to their similarity and size.
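To give a sense of the cost of the sliding-window approach, the sketch below enumerates window positions for one window size. The 1280×720 image size and the stride of 20 pixels are assumptions chosen for illustration; only the 400×100 window matches a sample size mentioned in this paper.

```python
def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield (x, y, w, h) for every window position fully inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)

# Assumed image size and stride; one fixed window size only.
n_windows = sum(1 for _ in sliding_windows(1280, 720, 400, 100, 20))
print(n_windows)  # 1440
```

Every one of these positions would have to be classified, and the count grows further when several window sizes are used, which is the inefficiency that region-proposal methods aim to avoid.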
Considering the size of the sample set and the actual engineering background, the selective search method is more suitable for recognizing high-voltage transmission line fittings.
The classifier is built from the DNN and the HOG feature operator and trained on the sample set. ROIs generated from the input image by selective search are passed to the classifier for classification, and together these components constitute the recognition model for high-voltage transmission line fittings.

HOG feature extraction
Taking a vibration damper image of a certain type (400×100) as an example, the HOG features are extracted after conversion to grayscale. The sliding Block of the HOG feature is set to 20×20 (the image dimensions should preferably be integer multiples of the block size). The Cell size is 5×5, and each Cell represents one gradient vector. The number of HOG bins is set to 10, so 360° is divided into 10 groups. With these parameters, a HOG feature vector of size 4000 is obtained for the sample, which can represent the features of the detected target while keeping the amount of data within a reasonable range. HOG features represent the target contour by a combination of multi-dimensional vectors and are relatively insensitive to changes in scale and illumination. The HOG feature of each sample is flattened into one dimension, and the total feature matrix is formed column by column and saved as a binary file in pickle format. After loading, the binary file can be imported directly into the DNN for training. When the amount of HOG feature data is too large, the HOG feature matrix can be reduced in dimension with Principal Component Analysis to improve computational efficiency.
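The descriptor length depends on the block stride and on how cells are grouped into blocks, neither of which is fully specified in the text. The sketch below counts features under the common layout of blocks × cells per block × bins; the helper `hog_length` and the non-overlapping stride of 20 are assumptions, not the paper's implementation.

```python
def hog_length(win, block, stride, cell, bins):
    """Length of a HOG descriptor for square blocks and cells.

    Assumes the usual layout: every block position contributes
    (block // cell) ** 2 cell histograms of `bins` values each.
    """
    win_w, win_h = win
    blocks_x = (win_w - block) // stride + 1
    blocks_y = (win_h - block) // stride + 1
    cells_per_block = (block // cell) ** 2
    return blocks_x * blocks_y * cells_per_block * bins

# 400x100 image, 20x20 blocks, assumed non-overlapping stride of 20, 10 bins.
len_cell5 = hog_length((400, 100), block=20, stride=20, cell=5, bins=10)
len_cell10 = hog_length((400, 100), block=20, stride=20, cell=10, bins=10)
print(len_cell5, len_cell10)  # 16000 4000
```

Under these assumptions, 5×5 cells give 16000 values and 10×10 cells give 4000, so the reported length of 4000 may correspond to a cell layout or pooling step other than the one assumed here.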

Structure of DNN network
The DNN built for this work is shown in Fig. 5. It has a total of 4 layers, including 3 parameter layers. The specific parameters of the parameter layers are shown in Table 1. The number of neurons in the fully connected layers determines the number of network weights: the more neurons there are, the more space the model occupies and the more computational resources training requires. Referring to the parameters of LeNet-5 [11] and the size of the input, we found after several experiments that with an input of size 4000, 512 neurons achieve a high recognition accuracy. The batch size is set to 30, training runs for 80 iterations, and the learning rate is 0.01, which gives a reasonable weight-update speed.
In a machine learning model, training is prone to overfitting if there are too many parameters and too few training samples. Overfitting can be alleviated by the dropout algorithm [12]. The sub-network generated by the dropout mechanism may differ each time, so weight updates no longer depend on the joint action of hidden nodes with fixed relations, and the network has to learn more robust features.
In this paper, dropout rates of 0.25, 0.25 and 0.5 were adopted in the three parameter layers of the DNN, respectively. During forward propagation through each layer, the corresponding proportion of hidden-layer neurons is randomly dropped. The input features propagate forward through the sub-network generated by the dropout mechanism, and the resulting loss propagates back through the same sub-network. This process is performed for each mini-batch of training samples, and the weights are updated only in the sub-network of neurons that were not dropped, which enhances the robustness of the neural network.
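The random dropping of neurons can be sketched as follows. This is a minimal illustration that assumes the widely used "inverted dropout" scaling, in which kept activations are divided by the keep probability so that no rescaling is needed at inference; the paper does not state which variant was used, and the function name and toy activations are hypothetical.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate`; rescale the survivors.

    Inverted-dropout scaling is an assumption, not confirmed by the paper.
    At inference (training=False) the activations pass through unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0] * 512                      # toy activations of a 512-neuron layer
dropped = dropout(hidden, 0.25, rng)      # rate used in the first two layers
kept = sum(1 for a in dropped if a != 0.0)
unchanged = dropout(hidden, 0.25, rng, training=False)
print(kept, "of", len(hidden), "neurons kept")
```

Because a different random mask is drawn for every mini-batch, each update trains a different sub-network, which is the source of the robustness described above.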

ROI generation
The Selective Search method [13] adopted in this experiment merges small regions through clustering and combines them into larger regions to obtain multiple candidate boxes. By setting boundary conditions that limit the length-width ratio to 4:1 or less, unreasonable boxes are removed in advance, so that between 10 and 30 boxes remain, which reduces the workload of subsequent operations. The image inside each remaining box is scaled and imported into the classification model, which reduces the computation needed for target detection and recognition.
For the recognition of high-voltage transmission line fittings, we optimized ROI generation in this paper. Partitioning and filtering the input image reduces the amount of computation in ROI generation. In a typical input image of transmission line fittings, the background is relatively monotonous, and the lines and pylons occupy only part of the image. The input image is divided into a 3×3 grid (the number of partitions can be adjusted as needed), a gray-level calculation is carried out on each grid cell, and each cell is scored according to the result. Grid cells whose scores fall below a set threshold are excluded, so selective search only needs to generate ROIs in the remaining cells, which saves considerable time. As shown in Figure 7, regions that do not contain objects to be detected can be excluded by judging regional similarity, reducing the time spent on ROI generation. When the background is more complex, the threshold can be adjusted to prevent fittings from being missed. After ROI generation, several ROI regions may overlap (Fig. 6).
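The grid filtering and aspect-ratio check can be sketched as below. The paper does not specify how each grid cell is scored, so this sketch assumes the gray-level standard deviation as the score (uniform sky scores near zero); the function names, the threshold of 10 and the toy image are all hypothetical.

```python
def grid_scores(gray, rows=3, cols=3):
    """Score each grid cell by the standard deviation of its gray values."""
    h, w = len(gray), len(gray[0])
    scores = {}
    for r in range(rows):
        for c in range(cols):
            ys = range(r * h // rows, (r + 1) * h // rows)
            xs = range(c * w // cols, (c + 1) * w // cols)
            vals = [gray[y][x] for y in ys for x in xs]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            scores[(r, c)] = var ** 0.5
    return scores

def keep_cells(scores, threshold):
    """Return the grid cells whose score reaches the threshold."""
    return sorted(cell for cell, s in scores.items() if s >= threshold)

def aspect_ok(w, h, max_ratio=4.0):
    """Keep boxes whose length-width ratio is at most max_ratio (4:1 in the text)."""
    return max(w, h) / min(w, h) <= max_ratio

# Toy 9x9 image: uniform "sky" with a textured patch in the centre cell.
gray = [[200] * 9 for _ in range(9)]
for y in range(3, 6):
    for x in range(3, 6):
        gray[y][x] = 0 if (x + y) % 2 == 0 else 255

kept = keep_cells(grid_scores(gray), threshold=10.0)
print(kept)  # [(1, 1)]
```

Only the cells returned by `keep_cells` would then be handed to selective search, and `aspect_ok` would discard candidate boxes that are too elongated.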
Because the generated ROI regions can overlap, Non-Maximum Suppression (NMS) [14] can be used for optimization. NMS sorts the candidate boxes by classifier confidence, takes the box with the highest confidence, and calculates its IoU (intersection over union, the overlap area ratio) with each remaining box. Boxes whose IoU exceeds a preset threshold are eliminated, and the highest-confidence box is retained. These steps are repeated on the remaining candidate boxes until all boxes that should be kept have been found.
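The NMS steps above can be written as a short greedy loop. This is a generic sketch rather than the code used in the experiment; boxes are assumed to be (x1, y1, x2, y2) tuples, and the IoU threshold of 0.5 and the example boxes are illustrative values.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept by greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest remaining confidence
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

# Two heavily overlapping boxes and one separate box (toy values).
boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68 and is suppressed, while the distant third box survives.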

Making the sample set
The experimental data set comes from two sources. The majority of the pictures were taken directly in the laboratory environment. The rest are images collected from the Internet.
The pictures collected in the laboratory are mainly characterized by monotonous backgrounds and similar illumination conditions. Pictures collected from the Internet are characterized by varying backgrounds, shooting angles and fitting sizes. The targets also differ in pixel size, so the images need to be scaled.
There are 310 pictures of transmission line fittings collected in the laboratory, including 91 pictures of insulator chains, 129 pictures of dampers and 90 pictures of strain clamps. A total of 270 pictures of fittings were collected from the Internet, including 89 pictures of dampers, 101 pictures of strain clamps and 80 pictures of insulator chains. In this experiment, data augmentation was used to increase the number of samples. By flipping the original sample images vertically and horizontally, the original data set of 580 images was expanded to three times its size, namely 1740 images.
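The flip-based augmentation and the resulting sample count can be checked with a few lines. The sketch represents an image as a list of pixel rows purely for illustration; the actual experiment presumably used an image library, which is not specified here.

```python
def flip_up_down(img):
    """Reverse the row order (vertical flip)."""
    return img[::-1]

def flip_left_right(img):
    """Reverse each row (horizontal flip)."""
    return [row[::-1] for row in img]

def augment(images):
    """Return each image followed by its vertical and horizontal flips."""
    out = []
    for img in images:
        out.extend([img, flip_up_down(img), flip_left_right(img)])
    return out

lab_count = 91 + 129 + 90        # laboratory pictures
web_count = 89 + 101 + 80        # Internet pictures
total = 3 * (lab_count + web_count)

tiny = [[1, 2], [3, 4]]          # toy 2x2 "image"
augmented = augment([tiny])
print(lab_count, web_count, total)  # 310 270 1740
```

Each original image yields itself plus two flipped copies, which is why the set grows to exactly three times its original size.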
The two sets of images were pre-processed and mixed into one sample set, 30% of which was selected by stratified random sampling as the validation set, with the rest used as the training set. From the background of the samples, namely the parts of the images that are not targets to be recognized, 200 patches were cut out as negative samples and added to the training set to enhance the generalization ability of the model. Considering the specific working environment of the robot and the image pre-processing, random noise and contrast modification were not used for data augmentation. Expanding the samples alleviates the overfitting caused by having too few samples. The sample pictures collected in the laboratory are relatively uniform in size, and the types of fittings collected are roughly strip-shaped. Through batch processing, the targets in the laboratory sample pictures were boxed and cropped one by one, and the pictures were uniformly scaled to 400×100. Pictures of the same type were stored in the same folder. The normalized sample data were saved as a binary file in pickle format using the pickle library in Python, and a category label can be appended to the end of each image's entry in the sample data matrix.
The main experimental evaluation indexes are accuracy and recognition speed. The experimental group was the HOG + DNN model. A pure DNN model and a CNN model were used as controls, and all models were trained on the same training set. After training, all models were validated on the same test set.
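A stratified 30% split keeps the class proportions the same in both subsets. The sketch below shows one way to do this with the standard library; the function name, the seed and the toy label list are assumptions, and the paper does not state which tool was actually used for the split.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.3, seed=0):
    """Split sample indices so each class contributes val_fraction to validation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)                        # random choice within the class
        n_val = round(len(idxs) * val_fraction)
        val.extend(idxs[:n_val])
        train.extend(idxs[n_val:])
    return sorted(train), sorted(val)

# Toy labels: 10, 20 and 30 samples of classes 0, 1 and 2.
labels = [0] * 10 + [1] * 20 + [2] * 30
train_idx, val_idx = stratified_split(labels)
val_counts = [sum(1 for i in val_idx if labels[i] == c) for c in (0, 1, 2)]
print(len(train_idx), len(val_idx), val_counts)  # 42 18 [3, 6, 9]
```

Splitting per class prevents a small category from being left out of the validation set by chance.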

Experimental results
The experiments were run on an AMD Ryzen 4800U CPU (2.9 GHz) with an NVIDIA RTX2060 GPU. The software environment was Spyder 4.0, Python 3.7.2 and OpenCV 3.3.
We repeated the experiment 10 times for each model and recorded the indexes of each model in Table 2.

Analysis of results
A confusion matrix was used to describe the experimental results of the HOG+DNN model, as shown in Figure 9. The matrix shows that category 0 (strain clamp) and category 2 (damper) are classified well, with all pictures of these categories in the validation set classified correctly. In category 1 (insulator chain), some pictures are wrongly classified into category 2.
After several experiments with random sample grouping, combined with an analysis of the sample images, we conclude that the classification of insulator chains may be influenced by the background colour and by the presence of two types of insulator in the sample library. Most of the insulator chains in the sample library are made of synthetic materials, and very few are ceramic. The synthetic insulator chain differs greatly in appearance from the other two kinds of metal fittings, and it is recognized well. The few ceramic insulator chain samples affect the weights used for insulator chain recognition. After the ceramic insulator chain images are removed from the sample library, the result of insulator chain recognition improves further.
On the fittings image data set in this experiment, the combination of HOG features with the DNN achieves a higher recognition accuracy than either the CNN, which extracts features adaptively, or the DNN that operates directly on pixels, and its time consumption is within the acceptable range, which meets practical engineering requirements.
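For readers unfamiliar with the confusion matrix, the sketch below shows how it is built from true and predicted labels and how per-class recall follows from it. The label lists are made-up toy values chosen only to mimic the pattern described above (some category 1 samples assigned to category 2); they are not the experimental results of this paper.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def per_class_recall(m):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]

# Toy labels only: 0 = strain clamp, 1 = insulator chain, 2 = damper.
y_true = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]

matrix = confusion_matrix(y_true, y_pred, 3)
recall = per_class_recall(matrix)
print(matrix)  # [[3, 0, 0], [0, 3, 1], [0, 0, 3]]
print(recall)  # [1.0, 0.75, 1.0]
```

Off-diagonal entries show which categories are confused with each other, which is how misclassification of one category into another becomes visible.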

Conclusions
Aiming at the recognition of line fittings by high-voltage transmission line patrol robots, this paper studied a recognition model that combines the HOG feature operator with a DNN. The pure DNN model and the CNN model were also compared on the recognition of power fittings in the same environment. The following conclusions are drawn:
(i). Experimental verification shows that the recognition model combining HOG features with a DNN is feasible for recognizing transmission line fittings and can meet engineering needs.
(ii). The recognition model combining HOG features with a DNN consumes less time and achieves high accuracy in the recognition of transmission line fittings, and it can handle multi-class classification and recognition tasks.
(iii). For the recognition of power fittings with small training sets, the HOG feature combined with the DNN model meets the engineering requirements better.
In view of the partial misclassification in insulator chain recognition, subsequent work should take into account the different types and materials of the same kind of power fitting, and should design a model that can correctly recognize and classify the same fittings with different shapes and material characteristics, so as to make the recognition model more practical for engineering.