Nonnegative Matrix Factorization of time frequency representation of vibration signal for local damage detection – comparison of algorithms

. Local damage detection in rotating machine elements is very important problem widely researched in the literature. One of the most common approaches is the vibration signal analysis. Since time domain processing is often insufficient, other representations are frequently favored. One of the most common one is time-frequency representation hence authors propose to separate internal processes occurring in the vibration signal by spectrogram matrix factorization. In order to achieve this, it is proposed to use the approach of Nonnegative Matrix Factorization (NMF). In this paper three NMF algorithms are tested using real and simulated data describing single-channel vibration signal acquired on damaged rolling bearing operating in drive pulley in belt conveyor driving station. Results are compared with filtration using Spectral Kurtosis, which is currently recognized as classical method for impulsive information extraction, to verify the validity of presented methodology.


Introduction
In mining industry all over the world, belt conveyors are one of the most essential assets of horizontal transportation [1,2]. In underground mining they span over the entire mine forming complex network transporting excavated material towards the shaft, while in surface mining they can also transport the material over very long distances, up to many kilometers. Since conveyors are typically set up in series, failure of one of them can effectively stop large portion, or even entire transportation network [3]. Most of the failure events are connected to belt itself, which is the most expensive part of the conveyor, or to elements of driving station, which includes engines, clutches, gearboxes and numerous pulleys that interact with the belt. Most of machines included in driving station are made of rotating elements installed on shafts, and such architecture requires using bearings.
Rolling element bearings are prone to three main types of failure: damage of rolling element, inner race or outer race. In most cases local damage of such types results in the presence of cyclic impulses in vibration signal, very often invisible in time domain. Literature worldwide presents many different techniques for analysis of such signals, e.g. classification with artificial neural networks [4], supervised and unsupervised learning techniques [5], filtration with informative frequency band selectors [6,7], or other data exploration techniques [8].
Previous research showed that such impulsive component can be successfully extracted using Nonnegative Matrix Factorization (NMF) approach [9][10][11]. There are however several types of NMF algorithm that are characterized with different assumptions and constraints, therefore they can produce different results. In practical case it is very useful to be aware of such differences and be able to choose the most appropriate method for the analysis of such type of data.
In this paper authors analyze real and simulated vibration data measured on ball bearing installed in drive pulley. Data is processed with proposed algorithm using three different NMF methods. Results are compared based on visual inspection and quantitative statistics proposed for the comparison as quality measures. Additionally, performance of algorithms is compared with the classical method for vibration signal processing, which is the filtration with spectral kurtosis [6]. Fig. 1 presents the flowchart of the proposed methodology for all investigated NMF algorithms. After selecting real or simulated signal as input, method begins with transformation of the provided signal into a time-frequency representation. Authors use the spectrogram, which is a square of absolute value of the short-time Fourier transform (STFT). The STFT for discrete signal x 0 ,…,x N-1 is defined as follows [12]:

Methodology
where w(t-is a shifted window. Next, we apply the selected NMF algorithm to classify vectors of spectra for all timestamps from the spectrogram, which is performed based on so called encoding matrix generated by NMF. Such clustering provides K spectrograms where K is number of clusters. Finally, method applies the inverse short-time Fourier transform (ISTFT) [13] to all selected time-frequency map to obtain the informative signal. To prove that the proposed methodology is superior to the classical approach, authors propose to compare obtained signals and their envelope spectra with signals and envelope spectra after Spectral Kurtosis filtration.

General model of NMF
Nonnegative Matrix Factorization (NMF) is a family of algorithms in linear algebra and multivariate analysis where a nonnegative matrix V is factorized into two lower-rank nonnegative matrices W and H (see Fig. 2), hence general model is defined as = .
Since this problem is not exactly solvable, it is approximated numerically. Such matrix multiplication can be understood as computing the column vectors of estimated matrix as linear combinations of the column vectors in matrix W using coefficients held by column vectors of matrix H. If orthogonality constraint is imposed on matrix H, e.g.
= , its columns behave as vectors of posterior probability in the sense of clustering problem. Even if the orthogonality constraint is not explicitly imposed, orthogonality holds to a large extent, as well as clustering effect. Hence, it is recognized that NMF algorithms reveal internal clustering properties, and output matrices are called base matrix (W) that contains clusters' centroids, and encoding matrix (H) containing posterior probabilities of data points (columns of V or ) belonging to appropriate clusters. Additionally, under certain assumptions some NMF algorithms become mathematically or algorithmically equivalent to known clustering methods as k-means or probabilistic latent semantic analysis (PLS) [17]. Several types of NMF algorithms have been developed over the years, that differ in constrains and implementation methods. Multiplicative update rule by Lee and Seung [16] gained popularity due to simplicity of implementation. Many successful implementations are based on alternating nonnegative least squares method, but specific algorithmic approaches use projected gradient descent methods [18], active set method [19], optimal gradient [20] or block principal pivoting [21].
First of used algorithms is called Convex NMF. It imposes the constraint that centroids contained by the column vectors of base matrix base matrix: are limited to conic combinations of column vectors of the input matrix: Geometrically it can be interpreted as centroid of a cluster being placed within the convex hull of a cluster in p-dimensional Euclidean space.
Second algorithm is called Semi-Binary NMF and is very useful for hard clustering of nonnegative data assuming non-overlapping clusters. It assumes that × is characterized with the property that: , : Hence, assuming that J<T, H is a row-orthogonal matrix. In the notion of clustering it can be interpreted as assigning data points to appropriate clusters with 100% certainty.
Third and final approach is a classic, unconstrained implementation of NMF with Euclidean distance as the objective function, using multiplicative update rules.

Simulated data experiment
Data simulation was aiming to capture general structure of the signal, while being able to control the parameters of impulsive component (see Fig. 3). Firstly, Autoregressive Moving Average model (ARMA) was used to model the behaviour of the healthy machine. As a next step, convolution was used to create damage signal combining healthy component with impulsive excitation, which is a 10 Hz comb of impulses contaminated with small amount of Gaussian noise to prevent the simulated signal from taking zero values between impulses. Amplitudes of impulses were also randomized to introduce more natural diversity. Signal was prepared in a way that it wouldn't be easier to process than the real one. Average energy of the impulses was adjusted so that they are visible but not standing out very much in the time series (see Fig. 4). We were also evaluating the spectrogram of the simulated signal, so that it wouldn't end up being easier to work with than the one of the real signal (see Fig. 5).  Signal prepared that way was fed into the procedure. Selected partial spectrograms as well as recovered time series of informative signal for all algorithms are presented in Fig. 6. Filtered signals are presented in Fig. 7. It can be seen that along with very satisfying results, algorithms perform with similar precision.  Another observation regards the amplitudes of output signals. It is clear that for classic Convex NMF impulses are roughly 50% weaker in comparison to the other two, which are quite consistent in terms of the amplitude. It comes from the fact that in all cases single impulse is not constructed from one spectra vector of the spectrogram, but always from a few of them (typically 3-8 slices per impulse). Convex NMF selected less slices per impulse than the other algorithms, hence ISTFT procedure produced signal with lower amplitude of impulses. It makes this algorithm more precise.

Comparison to Spectral Kurtosis
Presented algorithms have been compared with well established method of signal filtration with the Spectral Kurtosis (SK) that is used as the filter amplitude characteristic. The general principle of operation is to calculate kurtosis value for each frequency bin of the spectrogram. As a result, one can obtain indicator of impulsiveness across the signal spectrum, which allows to determine which frequency bands carry the most visible information about the damage-related impulsive behavior. Vector of spectral kurtosis can be then used as a filter to extract those impulsive components from the signal. Calculated SK is shown in Fig. 9, and filtered signal is presented in Fig. 10. Signal is very similar to the input in terms of shape, but its amplitude is significantly lower. It proves that SKfiltration is not an appropriate method for impulses recovery.    Frequency components between peaks are also stronger and disrupt the quality and clarity of the spectrum. It is also noticeable that after disappearing at about 60 Hz, peaks then return at about 130 Hz, just to decrease the amplitude again after a few harmonic components.

Application to real data
Presented method has been tested using vibration data from damaged rolling bearing of the drive pulley operating in belt conveyor driving station in mining industry. Fig. 12 presents time series of recorded signal, and its spectrogram is shown on Fig. 13. Sampling frequency is equal to 19.2 kHz for 2.5 seconds of the measurement. Several wide-band excitations are visible on the spectrogram at carrier frequency band 1-6 kHz that occur with the modulation frequency of 12.7 Hz.  Similarly to the simulation case, signal was analyzed using presented method with all of the NMF algorithms. Results are presented in Fig. 14 in the form of constructed selected partial spectrograms (left panels) and extracted time series (right panels).     Fig. 16 provides the functions to determine cyclicity of the output signals. Period is frequency detected based on spectra is equal to 12,7 Hz. Autocorrelation functions reveal common cyclicity as well.

Comparison to Spectral Kurtosis
As in the case of simulated signal, we compare the results to the Spectral Kurtosis filtration. Vector of SK is presented in Fig. 17 and filtered signal is shown in Fig. 18. In this case, SK filtration made a significant difference to the signal. Most of the strongest impulses became visible but energy of the noise is clearly too high, hence a lot of impulses are lost in the noise. However, cyclicity analysis does not look that promising (see Fig. 19). Behavior of autocorrelation function is not consistent with obtained results and its cycle denotes low-frequency modulation visible in the original signal, that is probably related to shaft misalignment. Envelope spectrum does not provide satisfying result either. Peak of fundamental frequency component has very low amplitude and is difficult to notice and interpret, especially for an automatic procedure. Fig. 19. Autocorrelation function and envelope spectrum for SK-filtered result.
To be able to come to objective conclusions, input signals and obtained results have been described with quantitative measures presented in Tab. 1., that is: Signal-to-Noise Ratio (SNR), Signal-to-noise-and-distortion ratio (SINAD), Total harmonic distortion (THD), Kurtosis, Factorization error defined as Frobenius norm of difference between input spectrogram matrix and full spectrogram matrix reconstructed after factorization, divided by Frobenius norm of input spectrogram matrix. Of course, for input signals and results obtained by SK filtration this error is not defined and denoted as 0 value.

CONCLUSIONS
In this paper authors present a comparison of selected NMF algorithms' performance for the application of local damage detection based on vibration data. Algorithms have been tested using simulated as well as real-life data from damaged rolling bearing operating in belt conveyor driving station in mining industry. Authors use previously developed processing method that can employed three different NMF algorithm in the same way. Results show that algorithm to be selected for such task can be neither too sensitive because of false-positive impulses occurrence, nor too resistant, since the information can be lost. In considered scenario Semi-Binary NMF algorithm turned out to be optimal for discussed analytical strategy.