Classification of diagnostic features of transient signals in the electric power industry

Problems in the practical implementation of traveling wave fault location that arise from the registration of signals of different nature are considered. Analysis of experimentally recorded traveling waves made it possible to divide them into four groups according to their cause: partial discharges, lightning overvoltages, scheduled switching and fault commutations. Network dispatchers need only the fault commutation information. Traveling waves recorded near their place of origin differ in the values of their diagnostic features. The magnitude of the pre-fault noise, the number of pulses in the signal and the duration of the signal are used as diagnostic features. These three features make it possible to recognize each of the four causes of traveling waves.


Introduction
The intensive practical development of traveling wave methods for fault location over the last decade is due to advances in electronics and communications. Despite a large number of theoretical and experimental studies [1][2][3][4] of traveling wave fault location (TWFL), its large-scale implementation is complicated by many problems. These show up both in the difficulty of achieving the potential accuracy of TWFL and in the difficulty of attributing transient signals (TS) to their causes. The latter currently requires a visual assessment of the TS, which makes it difficult to automate TWFL. The first problem is closely related to the dispersion mechanism, whose effect is especially severe in heterogeneous cable and overhead lines. The second complicates deployment, because information about non-emergency TS distracts grid dispatchers from operational work.

Problems of introducing wave determination of the fault location
Registration of TS at a high sampling rate of up to 10 MHz produces a large amount of data. Intelligent processing of these data makes it possible to determine not only the location of the fault but also the cause of the TS. Possible causes include partial discharges, short circuits, and scheduled switching. The development of intelligent data processing methods will make it possible to deliver to the dispatcher, quickly, only the information about emergency events and, after appropriate processing, reports on the state of the insulation in different parts of the network. Work in this direction is still in its infancy because of the complexity of developing both the hardware and the algorithmic support for TWFL. Continuing this work is an urgent task, as it can significantly improve the quality of information about the state of the most important objects of the electric power industry: cable and overhead lines.
The TS is a time-limited radio signal, that is, a high-frequency carrier on which amplitude, phase, and frequency modulations are superimposed. All of these modulation types carry useful information about the cause and source of the signal and about the parameters of the discontinuities along its propagation path. Finding the relationship between the parameters of the network in which the TS propagates and the parameters of the TS itself is a difficult task.
The first approach is to find this relationship by modeling, for example in the PSCAD software package. The second approach is based on the analysis of experimental TS, supplemented with information from the network operating services about the causes of their occurrence and with meteorological and thunderstorm data. TWFL uses only one parameter of the TS, the most important one: the time of its registration at a specific point in the network on the global time scale. To answer the question of what generated the TS, the other TS parameters must be used. Intelligent analysis of TS waveforms involves finding a connection between the structure of the TS and information about the location and cause of its origin.
An example of the first approach is [5]. Based on modeling in the PSCAD software package, it is shown that the TS arriving at the observation point consists of a set of single video pulses produced by multiple re-reflections from the ends of the line and from places of heterogeneity. The time intervals between pulses contain information about the location of the TS origin. It is proposed to determine this location by finding the maximum correlation coefficient between the recorded pulse sequence and the reference sequences for all possible points of origin of the TS. Unfortunately, in experimental measurements it is impossible to distinguish the elementary pulses, because they all merge with each other. This is probably due to the much larger number of inhomogeneities in a real line, for example in the form of the cross-arms of power transmission poles.
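The correlation matching idea attributed to [5] can be illustrated with a minimal sketch. It is not the implementation from [5]: the function names, the toy reference sequences and the candidate locations are all hypothetical, and the reference sequences are assumed to be sampled on the same grid as the recording.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence carries no shape information
    return cov / math.sqrt(var_a * var_b)

def best_matching_location(recorded, references):
    """Return the candidate location whose reference correlates best.

    `references` maps a candidate location (hypothetical, e.g. km from
    the sensor) to a simulated pulse sequence of the same length as
    `recorded`.
    """
    return max(references, key=lambda loc: pearson(recorded, references[loc]))

# Toy data (assumed): the pulse positions differ per candidate location.
references = {
    2.0: [0, 1, 0, 0, 1, 0, 0, 0],
    5.0: [0, 1, 0, 0, 0, 0, 1, 0],
    8.0: [0, 1, 1, 0, 0, 0, 0, 0],
}
recorded = [0.1, 0.9, 0.0, 0.1, 0.0, 0.1, 0.8, 0.0]
location = best_matching_location(recorded, references)
```

The sketch only works when the individual pulses are resolvable, which is exactly the condition that fails in the experimental records.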
The disadvantage of the approach to signal feature detection described above is that it relies on understanding the nature of the signal, which is not always available to the researcher. The problem of classifying signals by analyzing their formal features, expressed in terms of basis functions, is considered in [6]. This approach is attractive because the features obtained are not tied to a physical model of the signal, i.e. it is universal. The criteria for feature quality are closely related to the methods of feature filtering, of which a large number are already known. An overview of feature filtering methods can be found in [7].
Many publications are devoted to the classification of electrocardiogram signals. There is a close analogy between the mechanism of TS formation in power lines and the pressure changes in the arteries of a living organism caused by the opening (switching) of the heart valve. A significant difference between the two signals is that the TS is a radio signal, whereas the QRS signal of the electrocardiogram is a video signal. This is due to the absence of reflections from end inhomogeneities in the arteries, which pass smoothly into the capillaries. Reference [7] provides an overview of various numerical methods for processing electrocardiograms to detect the QRS signal.
TS generated by emergency switching have a large amplitude and are easily detected, but as they propagate, their amplitude decreases significantly, and the structure is greatly distorted. Of course, this makes it difficult to apply mathematical methods for detecting TS [8] and forces us to use algorithms that take into account the physics of their occurrence [9].
Neural networks allow one to work with unstructured information, and they themselves extract the necessary features for classification, saving the researcher from the need to determine the features [10,11]. The disadvantage of neural networks is their training procedure, which makes it difficult to use them in the analysis of TS.
The present work looks for the connection between the cause of the TS and its parameters using the second approach, relying on visualization of experimental data and the development of a heuristic algorithm. The aim of feature selection is to determine a feature subset that is as small as possible. It is an essential preprocessing step before data mining tasks are applied. It selects a subset of the original features without loss of useful information, removing irrelevant and redundant features to reduce the dimensionality of the data.

Basic types of transient signals
In April 2019, a hardware and software complex for traveling wave fault location was installed on a 110 kV cable line 11.058 km long in the Kazan electrical grids. It comprises sensor No. 23, installed at the Vostochnaya substation (SS), and sensor No. 29, installed at the Centralnaya substation (Fig. 1). Both sensors register transient signals generated by commutations of any nature, using split current transformers connected to the secondary circuits of the standard measuring current transformers of two phases of the cable line, "A" and "C" [4]. Reference [4] also considers the algorithm of operation with a larger number of sensors.
During operation, seven synchronous events were recorded simultaneously at both substations, along with many single events recorded on only one side of the line. Acquiring waveforms with a short sampling interval of 1.075 microseconds on a global time scale makes it possible to determine the location of the TS origin.
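How a synchronous event yields a location can be illustrated with the standard double-ended relation x = (L + v(tA - tB)) / 2, where x is the distance from sensor A, L the line length, and tA, tB the arrival times at the two ends. The sketch below is only an illustration: the propagation velocity and the arrival times are assumed values, not taken from the measurements, and the function name is hypothetical.

```python
def locate_from_a(line_length_km, t_a_us, t_b_us, velocity_km_per_us):
    """Distance from sensor A to the TS origin, in km."""
    x = 0.5 * (line_length_km + velocity_km_per_us * (t_a_us - t_b_us))
    if not 0.0 <= x <= line_length_km:
        raise ValueError("arrival times are inconsistent with the line length")
    return x

LINE_LENGTH_KM = 11.058       # cable line length given in the text
SAMPLE_INTERVAL_US = 1.075    # sampling interval given in the text
VELOCITY_KM_PER_US = 0.16     # ASSUMED propagation velocity, not from the source

# Hypothetical event: the wave reaches sensor A 20 us before sensor B.
x_km = locate_from_a(LINE_LENGTH_KM, 100.0, 120.0, VELOCITY_KM_PER_US)

# Position change corresponding to a one-sample timing error.
resolution_km = 0.5 * VELOCITY_KM_PER_US * SAMPLE_INTERVAL_US
```

Under the assumed velocity, a timing error of one sample shifts the estimated location by roughly 86 m, which shows why the short sampling interval matters.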

Algorithms for determining the essential features of the transient signals
Consider the sequence in which the significant parameters are determined. The oscillogram, registered on the global time scale and transmitted to the upper processing level, consists of 5652 samples, which corresponds to a total duration of more than 6 ms. The oscillogram is registered in blocks, which makes it possible to analyze the pre-fault part of the TS waveform. This part contains important information about the constant signal level and about the spread of the instantaneous sample amplitudes around that level. The transient signal begins in the interval of sample numbers from 1500 to 2500. For the analysis of the pre-fault information, samples 1 to 1000 were selected, with a total duration of more than 1 ms. Over this interval, the average of the instantaneous sample amplitudes (AVG) is calculated. All instantaneous amplitudes of the waveform are then shifted by this value.
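The baseline step described above can be sketched as follows. This is a minimal illustration, not the code of the complex: the function name is hypothetical, and the use of the population standard deviation as the measure of spread is an assumption.

```python
from statistics import mean, pstdev

PRE_EVENT_SAMPLES = 1000  # samples 1..1000, as in the text

def remove_baseline(waveform, n_pre=PRE_EVENT_SAMPLES):
    """Subtract the pre-event mean (AVG) from every sample.

    Returns the shifted waveform, AVG, and the spread of the pre-event
    samples around AVG (population standard deviation; the choice of
    spread measure is an assumption).
    """
    if len(waveform) < n_pre:
        raise ValueError("waveform shorter than the pre-event window")
    pre = waveform[:n_pre]
    avg = mean(pre)
    spread = pstdev(pre)
    shifted = [s - avg for s in waveform]
    return shifted, avg, spread

# Synthetic record of 5652 samples: constant offset of 5, then a burst.
record = [5.0] * 1000 + [5.0, 25.0, -15.0, 15.0, -5.0] + [5.0] * 4647
shifted, avg, spread = remove_baseline(record)
```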
To calculate the half-period of free oscillations (FO) of the TS, a series Ti is formed, each term of which is the result (-1 or 1) of comparing the instantaneous amplitude of the corresponding sample of the shifted TS with zero. From the Ti series, the series Di = Ti - T(i-1) is formed. From the series Di, a running count of the samples in each half-period is formed. The mean half-period of the first TS pulse is calculated as the arithmetic mean of the nonzero values obtained from the Di series in the time interval from Tn1 to Tk1. During scheduled switching, the TS consists of several pulses, each lasting more than 100 microseconds, caused by the contact bounce of the high-voltage switch. As these pulses propagate along the line, the dispersion mechanism merges them, and they are recorded as a single pulse with a free oscillation frequency of less than 10 kHz (Fig. 2).
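One possible reading of the half-period calculation is sketched below. The sign series and its first difference follow the description above, while averaging the spacing between consecutive sign changes is an interpretation of the averaging step. The function name and the synthetic pulse are hypothetical.

```python
SAMPLE_INTERVAL_US = 1.075  # sampling interval given in the text

def half_period_us(shifted, start, stop, dt_us=SAMPLE_INTERVAL_US):
    """Mean half-period of free oscillations within samples [start, stop).

    `shifted` is the waveform after the pre-event mean has been removed.
    """
    t = [1 if s >= 0 else -1 for s in shifted[start:stop]]   # series T
    d = [t[i] - t[i - 1] for i in range(1, len(t))]          # series D
    crossings = [i for i, v in enumerate(d) if v != 0]       # sign changes
    if len(crossings) < 2:
        return None  # fewer than two zero crossings: no half-period
    gaps = [b - a for a, b in zip(crossings, crossings[1:])]
    return dt_us * sum(gaps) / len(gaps)

# Synthetic square-like oscillation: the sign flips every 4 samples.
pulse = ([1.0] * 4 + [-1.0] * 4) * 5
hp = half_period_us(pulse, 0, len(pulse))
freq_khz = 1000.0 / (2.0 * hp)  # FO frequency in kHz from the half-period in us
```

For this synthetic pulse the half-period is four samples, or about 4.3 microseconds, which corresponds to a free oscillation frequency of roughly 116 kHz.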

Algorithms for determining the causes of the transient signals
In the case of lightning overvoltages, the TS is preceded by increasing noise due to the mechanism of streamer formation, and its main body is created by the main lightning discharge. When such a TS has propagated a long distance along the line, only its main body is registered (Fig. 3).
A TS caused by partial discharges (PD) is characterized by a short duration (less than 50 microseconds) and a high frequency of free oscillations (more than 100 kHz) (Fig. 4).
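The numerical thresholds quoted in the text can be collected into a minimal rule-based sketch. It is deliberately incomplete: only the two thresholds stated above are used, the function name is hypothetical, and lightning and fault commutations are left unclassified because the text gives no numerical criteria for them.

```python
def classify_ts(duration_us, fo_freq_khz):
    """Heuristic label for a transient signal (TS).

    Only the thresholds stated in the text are used. Lightning and fault
    commutations need further features (for example pre-event noise and
    amplitude) and are reported as 'unclassified' in this sketch.
    """
    if duration_us < 50.0 and fo_freq_khz > 100.0:
        return "partial discharge"
    if fo_freq_khz < 10.0:
        return "scheduled switching"
    return "unclassified"

# Hypothetical feature values for three recorded signals.
labels = [
    classify_ts(duration_us=30.0, fo_freq_khz=250.0),
    classify_ts(duration_us=400.0, fo_freq_khz=6.0),
    classify_ts(duration_us=200.0, fo_freq_khz=40.0),
]
```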

Statistics of synchronous and single events
Three years of operation of the hardware and software complex for traveling wave fault location in the Naberezhno-Chelny RES (Fig. 9) yielded a large volume of oscillograms of transient signals. Many synchronous operations of the sensors of the complex were recorded (Fig. 10, 11, 12), which make it possible to determine the place of origin of the transient signals. The distribution by month (Fig. 10) shows many synchronous events registered in the spring, summer and autumn seasons. They are caused by increased electrical discharge activity (Fig. 4) in the high-voltage line insulation due to high humidity, and by induced lightning overvoltages (Fig. 3). These events cluster on specific weather-related days and in specific sections of the power line characterized by weakened high-voltage insulation or induced lightning surges. Most of these events have no effect on line performance. Some, however, lead to faults that cause relay protection to shut the line down. Such an event occurred on April 10, 2018 (Fig. 13): six short-term single-phase earth faults (SPEF), ending in a phase-to-phase short circuit, occurred within 55 minutes. All of them were registered by the sensors of the complex. Subsequently, two events of energizing the line (scheduled switching) in JAKNO5 were recorded during the search for the fault location. The stable but different time delays (Fig. 13 and Table 1) of the emergency and scheduled commutations indicate different places of TS origin.

Analytics of diagnostic features
The main task of the complex is to promptly inform the grid dispatcher about the location of a fault event. An auxiliary task is to prepare final reports on the dynamics of the place of origin and the intensity of electric discharge activity. Both tasks require classifying the type of each registered TS according to the features in Fig. 2-7. The classification problem is solved by analyzing the set of diagnostic features [12]. Diagnostic features include parameters of the TS such as the start time, the maximum amplitude, the number of pulses in the signal, the maximum amplitude of the noise, etc. To study the information content of the various diagnostic features, a program was written that calculates these features for the recorded TSs. The calculated diagnostic features form a database that can be analyzed visually by standard analytical methods, such as correlation dependences of pairs of features. The diagnostic features of all TSs (single and synchronous) were processed (Table 2). The increased number of PDs at SS with a voltage of 110 kV is explained by the large number of PDs on the higher-voltage line. The increased number of PDs at SS-730 and SS-777 indicates high-voltage insulation problems at these SSs. The number of registered TSs varies throughout the year (Fig. 10). The largest number of singly recorded events are generated by PDs of different intensities. They appear as different forms of TS: single short pulses, a sequence of pulse signals, or a continuous noise sequence. Three algorithms are used to determine the beginning of any TS. The differential algorithm "SensorFlag" is triggered when the difference of two instantaneous signal amplitudes shifted by 100 µs exceeds a programmable threshold within an interval of three consecutive samples.
This algorithm is implemented in the sensors of the complex and is used to store the pre-fault instantaneous amplitudes in the interval of the first 1000 μs of the oscillogram. The "FLAG" algorithm, implemented on the server of the complex, searches for the time position of the first instantaneous amplitude, earlier than the time of the maximum signal value, that exceeds a threshold equal to three standard deviations of the instantaneous amplitudes of the first 1000 μs of the oscillogram. The "mcf2StartPointA" algorithm, also implemented on the server of the complex, searches for the time position of the first instantaneous amplitude that exceeds a threshold equal to the maximum instantaneous amplitude in the first 1000 μs of the oscillogram. Figure 14 shows the correlation dependence of the FLAG and SensorFlag features. Events located below the linear dependence correspond to PD with a duration of less than 3 μs, not recorded by the "SensorFlag" algorithm. Events located above the linear dependence correspond to a sequence of PD pulses with increasing amplitude. In Fig. 15, the events located below the linear dependence also correspond to PD with a duration of less than 3 μs, not recorded by the "SensorFlag" algorithm. Events located above the linear dependence correspond to a sequence of large-amplitude PD pulses with a duration of less than 3 μs recorded in the interval of the first 1000 μs of the oscillogram. The highlighted diagnostic features carry important information about the parameters of the TS and the reason for its occurrence, and about the state of the high-voltage insulation at different points of the network. Let us analyze the database of diagnostic features for 11.16.2020, obtained for the TSs on the section of the 10 kV hybrid overhead-cable network of the Naberezhno-Chelninskie electric grid shown in Fig. 9. The onset of electrical discharge activity was recorded in three phases of SS-777 (Fig. 16).
After 8 ms, the increased intensity of the electric discharge activity was recorded synchronously at SS-651 and SS-730 (Fig. 17, 18). The duration of the TS and the duration of its leading edge make it possible to estimate how far the origin of the TS is from the place of its registration.
According to the oscillograms of the TS on the SS-777 buses, the PD intensity in phase B is significantly higher. The phase B signal sequence started about 800 μs earlier. The period of free oscillations of the TS in all phases varies in the range from 3 to 6 μs, which indicates that the PD source is close. As can be seen from Fig. 18, the beginning of the TS at SS-651 was recorded 2 μs later than at SS-730, which is consistent with the different lengths of the hybrid line from the tap-off feeder node at SS-777 (Fig. 9). A characteristic feature is the change in the shape of the TS as the distance traveled increases: both the time from the beginning of the TS to its maximum amplitude and the period of free oscillations increase. The diagnostic features in Fig. 14 and Fig. 15 make it possible to select the pulse sequences of which the above synchronous events consist. This allows the event to be classified as PD.
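The two server-side start-detection algorithms described in this section, "FLAG" and "mcf2StartPointA", can be sketched as follows. These are illustrative reconstructions from the verbal descriptions, not the code of the complex. The function names are hypothetical, the input is assumed to have its pre-event mean already removed, and comparing absolute values with the thresholds is an assumption. "SensorFlag" is omitted because its description leaves the sample lag open to interpretation.

```python
from statistics import pstdev

SAMPLE_INTERVAL_US = 1.075                    # sampling interval given in the text
PRE_WINDOW = int(1000 / SAMPLE_INTERVAL_US)   # first 1000 us of the oscillogram

def flag_start(x, n_pre=PRE_WINDOW):
    """Sketch of "FLAG": first sample before the global maximum whose
    magnitude exceeds three standard deviations of the pre-event window."""
    mags = [abs(v) for v in x]
    threshold = 3.0 * pstdev(x[:n_pre])
    i_max = mags.index(max(mags))
    for i in range(i_max):
        if mags[i] > threshold:
            return i
    return i_max  # nothing earlier crossed the threshold

def max_noise_start(x, n_pre=PRE_WINDOW):
    """Sketch of "mcf2StartPointA": first sample whose magnitude exceeds
    the largest magnitude found in the pre-event window."""
    mags = [abs(v) for v in x]
    threshold = max(mags[:n_pre])
    for i in range(n_pre, len(mags)):
        if mags[i] > threshold:
            return i
    return None  # the signal never leaves the noise band

# Synthetic record: unit noise, then a pulse that grows over three samples.
x = [1.0, -1.0] * 1000 + [2.0, 4.0, 8.0, 6.0, -5.0] + [1.0, -1.0] * 100
start_flag = flag_start(x)
start_max_noise = max_noise_start(x)
```

In this synthetic record the maximum-noise threshold triggers one sample earlier than the three-sigma threshold. Differences of this kind between start estimates are what the correlation plots of pairs of features make visible.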

Conclusions
The practical implementation of traveling wave fault location makes it necessary to look for a connection between the cause of a transient signal and its features. Based on the analysis of experimentally recorded waveforms, numerical values of the transient signal features were derived. They made it possible to determine the cause of the signals: partial discharges, lightning overvoltages, scheduled switching and fault commutations. The essential features of transient signals studied here have a clear physical meaning and are easily determined, which eliminates the need for more complex methods of artificial intelligence. The resulting diagrams illustrate the desired relationship visually, and the relationship is easy to implement as an algorithm. The results obtained can be used in the operation of fault location complexes. The results are preliminary. Subsequent studies will refine the reliability of the transient signal classification.