The use of the empirical mode decomposition method to clean and restavration acoustic emission signal

. The results of the study of the possibility of using the empirical mode decomposition method for cleaning geoacoustic emission signals from various types of noise are presented. It is shown that the application of the method allows to increase the ratio of the signal noise 3-6 dB depending on the ratio of signal dispersion and noise in the input signal. The examples demonstrate the ability to remove trends and harmonic interference, as well as the ability to highlight a useful signal when masking its powerful noise. A comparative evaluation of the method in relation to the low-frequency filtration is carried out. The limitation of the method applicability in the case of processing of pulse signals asymmetric with respect to its average value is indicated.

The issues of search for stable markers of seismic events based on seismic signal analysis, in spite of its quite a long investigation history, are constantly under the researchers focus. In this field of knowledge, the most significant models for earthquakes were constructed [1,2]. Mainly, the acoustic oscillation extremely low frequency range of less than 100 Hz was considered. The papers on the investigation of higher range oscillation signals (up to 10-16 kHz), which are observed in the Earth's near-surface layer and are associated with plastic deformation physical processes, called the attention. Such acoustic signals are known as geoacoustic emission (GAE) signals [3,4]. Owing to physical laws determining the acoustic wave attenuation rate in «soft» mediums, GAE do not propagate at large distances and are from hundreds of kilometers to several hundreds of meters. It seems that this fact should limit significantly the possibility of applying them to obtain useful information. Moreover, the GAE statistical analysis applying the know criteria allowing us to check a statistical hypothesis on VR stationarity, the AFD-test by Dickey-Fuller, KPSS, shows that the hypothesis on the stationarity is not confirmed in cases of short samplings (units of seconds) not long samplings (from tens of minutes to several days) of such signals. However, the optimism of obtaining significant results of analytical study of GAE characteristic change behavior and their relation with seismic events is explained by two reasons. Firstly, it is explained by a new treatment of physical-mathematical models explaining the indirect effect of helio-and geophysical fields causing global and local motions of water and solid earth masses in the form of s and p waves causing, in their turn, deformation processes in the Earth's mantle upper layers that generates GAE [5]. Secondly, investigation of information technologies gave rise to the implementation of empirical methods adapted for the processing of nonstationary signals which show better results in practice compared to classical methods of detection of important information [6,7,8] and, consequently, more stable indexes for earthquake prediction can be obtained based on them. Analysis of the observations carried out by IKIR FEB RAS specialists showed that GAE pulse occurrences are associated with deformation process dynamics in the Earth's nearsurface layer. In the conditions of absence of clear seismic events at a definite local observation area, we can state an index, background activity relative frequency, for GAE pulse occurrences over a fixed time interval associated with relaxation ground processes under the effect of weather factors such as atmospheric pressure, soil humidity change, precipitation, near-ground air temperature changes. Significant increase of this index shows high probability of a seismic event occurrence. Just for this reason, GAE is of great interest in the investigation of plastic processes associated with landscape stability and earthquake precursor generation. It was shown that based on the amplitude-frequency content of a GAE signal, it is possible to determine the scale of a source generating the signal and to estimate the distance from it to the registration site as well as to determine the direction of signal arrival [6,8].
Owing to its physical origin, GAE is a chaotic sequence of pulses of different amplitude, inner frequency and the possibility of their occurrence during an observation epoch. Thus, GAE refer to non-stationary signal class, i.e. such that the statistical characteristics such as mathematical expectation, dispersion and correlation characteristics significantly change in time. In practice, the observed GAE is recorded at the background of noise generated by weather factors and other atmospheric phenomena which also have non-stationary character. In the result, the process of useful signal detection becomes a challenging task as long as the traditional methods of statistical processing and analysis of stationary signals such as correlation and spectral analysis are capable of providing stable detection of GAE pulses only when the signal to noise ratios are significantly higher than theoretical values based on the estimations of statistical moments of the first and the second orders for stationary signals. We demonstrate it by the following example (Fig. 1).

3008
The figure shows a fragment of a recorded acoustic signal where the callout (a) shows a background noise part, and the callouts (b) and (c) are the pulses with large and small amplitudes of GAE. We can see that GAE amplitude spread is quite wide and during weak pulse signals it is masked by the background noise.
Fundamental investigations have been carried out in the works of the researchers from the Laboratory of Acoustic Researches of IKIR since 1998. They aim at investigating the GAE characteristics [6,8]. In the course of measurements, we face a number of intractable problems of detection and recognition of useful signals of acoustic emission pulse activity at the background of anomalous noises localized in time and space, significant distortion of GAE frequency-time structure and spatial weakening of the signals energy as well as detected diversity of morphological forms preventing optimization of consistent units of the receiving instrumentation.
However, the problems associated with background noise effect of instrumentation origin and the noises generated by external natural and artificial factors complicate significantly the efficiency of GAE recording and analysis. In connection with that, we consider a method which proved itself in the practice of non-stationary signal preprocessing and which remained unnoticed up to the present. We mean the Empirical Mode Decomposition Method (EMD) suggested by N. Huang (N.Huang 1998) [9]. Applications of EMD method to denoise GAE are the subject of consideration of the investigation results.
Development of the method proposed by N. Huang for the processing of noised data was named and recognized as «Huang-Hilbert transform (HHT). This empirical method is based on numerical methods of calculation of time series (TS) envelope followed by iterative calculation of a new time series envelope in the function of which the TS envelope, obtained during the previous calculation stage, is used. A set of the obtained TS envelopes is taken as eigen non-orthogonal basis in which the statement on the initial TS stationarity is not used. Such an approach to basic function formation provides the possibility not to involve a priori information on the TS under analysis and removes the necessity to select inner parameters of an analysis method. In this investigation we apply only one algorithm from a pair of HHT complete algorithm elements, in particular, the Empirical Mode Decomposition (AMD) Algorithm. Time series model is represented in the form of adaptive composition of characteristic components ci(t): (1) (or modes as they are called in foreign literature) and a reminder series rn(t). Each of these modes is some basic function and it carries information corresponding to one or another real physical process involved in the formation of a TS under analysis. The set of basic components, which, generally speaking, may be non-orthogonal to each other, is not fixed but is adaptive and depends only on the initial data kind.
The unquestionable advantage of the chosen approach is the fact that the basis of characteristic components, constituting a TS, is determined by the TS itself but an algorithm type and the choice of its parameters. The obtained components are not orthogonal functions and are redundant, in a certain way. In this case, as it was shown in [10], the set of obtained components characterizes uniquely the analyzed signal fragment and can be considered as an optimal balance in this sense.
We denote briefly the main stages of EMD algorithm according to the author [9]. EMD divides uniquely any time series u(t) of length N into the components which are called Intrinsic Mode Functions (IMF). This procedure is achieved by the following sequence of actions: 1. All local points of an extremum for a given u(t) are preliminary found.
4. Obtain an excess by subtraction (3) 5. Check the necessity of completing the rejection cycle according to the criterion (the essence of the criterion is discussed below) and when it is not fulfilled, application of the procedures 1-4 for the excess of TS u(t) = r . (4) 6. The excess r obtained after the rejection we take as the obtained component ci(t) = r, where i is the component number of the number of EMD cycles.
7. Extract the obtained components ci(t) from the initial TS. The obtained excess is a new TS for the following decomposition.
8. Check if the new TS is a monotone function or satisfies certain desired conditions. If the function is monotonous, EMD decomposition stops. Otherwise, the obtained TS is used as the initial one and the whole decomposition algorithm is repeated for it from the very beginning. The TS obtained at the algorithm last stage is the residual component rn(t) from expression (1). Owing to the fact that excess rejection procedure should be finite, Huang proposed to apply a stoppage criterion for that. Different scientists suggested their variants of stoppage criteria, however, in practice, in many practical cases, less than 10 iterations turn to be enough to obtain the components with the properties close to characteristic mode IMF functions [10].
The described EMD algorithm was applied to check its applicability as GAE denoising procedure. A computer program was written and tested for that purpose. The peculiarity of application was the processing of large volume TS (from 500000 samples and more) to create a detector for GAE pulses of different amplitude including the pulses with amplitude values of background noise order. In this case, the noise and the signal had nonstationary character that limited application of traditional methods for data pre-processing.
In order to investigate the EMD applicability of denoising GAE signals, a computer program was developed which realized the algorithm. Fig. 2 shows the results of application of this algorithm to process an artificial signal containing four sinusoids of 125, 73, 37 and 14 Hz aliquant with respect to frequency for the sampling frequency of 4096 Hz. Clear division of the mixture into the components and an excess close to the isoline and to zero oscillations are observed. A boundary effect of spline interpolation is observed at the beginning and at the end of the processing. The decomposition may be interpreted as an auto-matched filtering which is carried out without any imposed conditions.
In order to check the expediency of EMD application for GAE processing, we performed a number of numerical experiments in which an additive signal model was used in the form of a mixture of artificial pulses similar to GAE pulses and a Gaussian noise generator. GAE pulses were simulated by Berlage function generation on a short time interval. Pulse parameters were selected based on the previous research works at the Laboratory of Acoustic. Research of IKIR FEB RAS [6] and identified as an optimal Berlage-Gabor basis for GAE pulse description [12]. The Berlage function (for brevity, Berlage pulse) is more preferable in the experiments since it has not only a form of a curve in time (pattern) close to typical GAE pulse patterns but it is also asymmetric, has a short leading edge and a stretched ending that is convenient for identification in the course of visual analysis. The analytical Berlage function is defined by formula (5).
where A is the amplitude chosen so that the pulse is normalized ||g(t)||=1, A>0; T is the pulse duration; tmax is the pulse envelope maximum location; n = ne + ∆ is the parameter determining the envelope attenuation rate, n>0; ne is the critical value of parameter n which is calculated relatively T and tmax so that the pulse amplitude is not more than 5% from the maximum value at the definition domain boundary; f is the basic frequency in Hz; ∆ is the increment of the parameter n relatively the critical value; 0≤t≤T.  At the first stage of the investigations, we checked the EMD properties in the conditions of artificial distortions imitating overlapping of extended trends on the initial signal. It was discovered that the frequency selectivity allows us to remove the trend. Manifestation of this property is illustrated in Fig. 4. In the left part of it is a numerical experiment with linear trend, overlapped on a noise-free pulse Berlage sequence. The right part is the case with overlapped nonlinear trend. It is clear that in both cases the trend is rejected at the first EMD component. The only thing that should be taken into consideration when applying EMD to suppress trend in a signal, is the so called «boundary effect» occurring due to the artifacts of the interpolation included into the processing algorithm. There are several ways to eliminate this negative effect [9,11] but discussion of this topic is out of the scope of this paper. Another effect, discovered in the course of a numerical experiment, is the capability of solving another frequent problem of induction on observation means from a power line (50 Hz and its harmonics). The result of pulse signal processing under the effect and multiplicative noise is shown in Fig. 5.

Solar-Terrestrial Relations and Physics of Earthquake Precursors
In the first case, the initial sequence of Berlage radio pulses with the relative basic frequency f = 0.1 was added to sinusoidal signal simulating harmonic signal induction with the relative amplitude A = 16 and relative frequency F 11 times as less as the pulse basic frequency. We can see that EMD first component eliminates the induction.
In the second case, multiplicative induction was simulated with the same frequency F. Compared to the previous case, a signal with fewer distortions was obtained only in the fourth EMD component. However, this fact can be considered in favor of the method as long as detection of the expected form of a pulse signal pattern in a certain component indicates automatically the modulation type.
Further numerical experiments with variations of frequency F and phase P of an overlapped harmonic signal showed that in the case when the basic frequency of Berlage pulse f ≪ F, noise additive overlapping is analogous to the case of nonlinear trend considered above, and the multiplicative noise overlapping is interpreted as «periodic fading» with the frequency F/2. In an opposite case, when f ≫ F, the high-frequency harmonic noise is easily rejected by EMD first component like random noise which will be discussed below. In case when f = F, the additive mixture of noise with a harmonic signal / , 03008(2018) e3sconf/201 E3S Web of Conferences 62 8620 https://doi.org/10.1051

Solar-Terrestrial Relations and Physics of Earthquake Precursors
3008 causes pulse stuffing phase change by the value of a scalar sum of interacting signal phases and almost does not reflect on the form of EMD detected signal of the first component. During in-phase multiplicative noise, only the Berlage radio pulse modulating component amplifies that does not prevent a pulse from appearing in the first component. As for the multiplicative in-phase noise effect, this case is interesting in the recovery of a radio pulse symmetry built-in component calculation mechanism. The results of numerical estimates for the latest case are illustrated in Fig. 6.   Fig. 6. Results of EMD application to process the Berlage pulse sequence with multiplicative harmonic noise coinciding in frequency with pulse frequency stuffing but relative variable phase shift of noise by the values of 0.1, 0,3, 0,7 and 0.9 π, respectively. The next step of investigation was aimed at finding the EMD capabilities during noise pulse signal processing. Artificial noise was generated by a standard function of pseudorandom sequence of Java language standard function library. To estimate the efficiency of pulse signal denoising, we applied an index equal to the relation logarithm of the sum square of pulse and noise dispersions to noise dispersion square. For stationary noise, this index is proportional to the classical criterion, the relation of a sum of signal and noise energies to noise energy which is applied in radio measurement practice.

Solar-Terrestrial Relations and Physics of Earthquake Precursors
At this stage of the investigation, the experiments were carried out with pulse signal model with overlapped stationary noise. Berlage pulse generation parameters are similar to the parameters of the signal illustrated in Fig. 3. The noise relative dispersion was 20. Fig. 7 presents the results of a numerical experiment where EMD is applied directly to suppress the additive noise in a group of four Berlage pulses. It turned to be interesting that very often in the experiments the pulse signals with small index (6) were not detected in the first component and manifested the most clearly in the EMD second and even in the third component. Table 1 shows the results of index (6) measurements, obtained for the signal represented by zero component in Fig. 7. The pulse-to-pulse amplitude decreases twice as much in the initial signal. The particular interest in the context of the investigation is the case of EMD application during the impact of a powerful additive nonstationary noise, completely masking the useful pulse signal (Fig. 8 (a)). Four Berlage pulses of the same amplitude are almost hidden by additive nonstationary noise. We can note that even in this case, the possibility of pulse detection is real as well as the recovery of their form that is obvious and / , 03008(2018) e3sconf/201 E3S Web of Conferences 62 8620 https://doi.org/10.1051

Solar-Terrestrial Relations and Physics of Earthquake Precursors
3008 offers the challenge to obtain good results in complicated field conditions. In Fig. 8 (b) EMD was applied after the low-frequency filtration allowing us to decrease the nonstationary random noise effect.
The results show quite high efficiency of EMD application. It manifests the most vividly in noise areas with relatively small index (6) lying within the limits from 0 to 3 dB.
(a) (b) Fig. 8. Demonstration of EMD application to a pulse signal consisting of four amplitude-equal Berlage pulses masked by nonstationary noise without filtration (a) and with filtration (b).
If we compare the figures, we can note that there is one more EMD property which differs the method principally from the classical frequency filtration. When applying the EMD where the noise level becomes significantly less than the useful signal level, it is more likely to detect it in the component with a less number and vice versa. This property is not associated with the method frequency selectivity (Fig. 8 (а)). In this case, the more the «energy capture» of a certain component pulse is, the less we have the chances to detect manifestation of this pulse in other components that is quite explainable by the energy conservation law. The distribution does not depend on pulse energy but depends only on noise spectral properties and its energy distribution in time. During a band-pass filtration, in case of stationary noise, signal energy distribution over each of frequency bands would be unambiguously determined for any of the pulses similar in form and would be proportional to the energy of each. In case of non-stationary noise ( Fig. 8(b)), as the practice shows, during the filtration, we observe significant distortions of pulse pattern form and, as a consequence, of useful signal spectral content at the filter output. Qualitative and quantitative assessments of such distortions are the subject of further investigations.
One more conclusion from the experiment is that filtration significantly promotes more predictable result of a useful signal occurrence in EMD certain component. nthesize even a quasi-optimal multi-channel filter. In conclusion of the researches of EMD applicability to denoise GAE pulses, we present several patent examples of practical application of the method. However, we should make some notes to understand the sense of the following experiments with real GAE. As it was already mentioned, the property of frequency selectivity is known for EMD method. In this case, at each EMD algorithm iteration, we detect the lower-frequency component with the fewer number of signal function local extrema distinguished for the interpolation. Thus, according to the conclusions, which were obtained by other authors in the course of investigation on mathematical models [11], we should initially provide at least a tenfold margin in data sampling frequency (twenty samples on the period of the highest-frequency component of a

Solar-Terrestrial Relations and Physics of Earthquake Precursors
3008 signal informative part) at each iteration of the algorithm for the analysis applying a typical EMD. Hence, a computer program, realizing EMD method, was focused on the processing of long-duration signals. Signal sampling frequency in all the experiments with real data was determined by an audio card digitizing at 48kHz. It also determined the frequency range which was 20 kHz. To meet the requirement of tenfold margin of the sampling frequency, according to A.V. Kotel'nikov theorem (Nyquist theorem), the signals were filtered by low frequency with the bandwidth of 2000 Hz. Episodes from randomly selected files of relatively long duration were taken from the archives of long-term GAE data base of IKIR FEB RAS and tested.
In order to demonstrate the qualitative differences of the band-pass filtering method and EMD method with real data, we carried out a numerical experiment with a series of detected GAE pulses. Fig. 9 illustrates the results of application to a typical GAE pulse ( Fig. 9 (a)), of low-frequency filtration ( Fig. 9 (b)) and EMD (Fig. 9 (c)).
It is clear that filtering gives a comparatively good effect of signal denoising in case of stationary high-frequency noise and long enough GAE pulses for high signal to noise ratio. However, during short pulses and large amplitudes, the pulse spectrum width becomes wider and the effects of filter frequency response variations and phantom oscillations clearly appear at low-frequency filter output. It does not occur when EMD is applied. On the contrary, when the peculiarities of sampling frequency selection are observed, stable detection of a useful signal is observed at higher-order components. In this case, as it was already mentioned, when EMD is applied, the structural envelope response of the signal under analysis is self-regulated.   Fig. 10 shows an example of EMD application to the problem of GAE pulse detection for low index (6). Fig. 10. Detection of a useful pulse signal for low signal to noise ratio.
All the results discussed above of effective GAE denoising are, firstly, associated with their important structural property, in particular, with the oscillating process symmetry relatively its average value. In case of asymmetric oscillations, phantom symmetric pulses appear. They are generated by the algorithm of iterative applications of interpolation to each of the subsequent difference of upper and lower envelopes of a sequence under analysis. This negative effect is clearly seen in Fig. 11 where we represent the first two EMD components applied to the induction from a car engineignition system on instrumentation cable runways of a hydrophone for GAE signal reception.
Finally, we should note again that the described EMD method is free from binding to the physical essence of processed data, noise signal distribution statistics, destabilizing fluctuations as well as to global and local trends and can be applied efficiently to denoise geophysical data where oscillation activity bursts are to be detected at background noise. The method does not impose any limits on useful signal spectral content, its duration and amplitude-phase features of its structure. The presented data of numerical experiments on artificial and natural data showed that EMD method / , 03008(2018) e3sconf/201 E3S Web of Conferences 62 8620 https://doi.org/10.1051

3008
-effective in GAE denoising that manifest in the decrease of signal to noise ratio at the processing algorithm output from 3 to 6 dB depending the signal to noise ratio at the processing algorithm input; -allows us to eliminate any kinds of trends; -is capable of detecting very noisy GAE signals; -allows us to recover significantly the form of GAE pulse amplitude-phase relations and performs it more qualitatively than signal frequency filtering that promotes the possibility of more effective application of cross-correlation method for GAE pulse detection and identification.
However, when oscillations asymmetric relatively its average value occur, pulse signal phantom patterns appear at EMD output that should be taken into consideration when detecting GAE.