Extension of training set using mean shift procedure for aerospace images classification

An effective method of training set extension for aerospace image classification is proposed. The method is based on the mean shift procedure and takes spatial information into account, which makes it possible to exploit the structure of the unlabeled data. The results of an experimental study on the Salinas hyperspectral image are presented and demonstrate the effectiveness of the proposed method.


Introduction
In image classification tasks, a training set (labeled data) is necessary for constructing a decision rule or training neural networks [1,2]. Obtaining a training set (TS) for an aerospace image is often associated with significant material and time costs. Therefore, in practice, a TS is available only for the small number of classes that are of interest to the user, and even for these classes it is not representative. Some classes may be represented by only a few labeled pixels.
It is known [3,4] that, to ensure acceptable classification quality, the minimum number of TS points should be on the order of 10k per class for parametric classifiers (where k is the dimensionality of the feature space) and on the order of 50k per class for non-parametric classifiers. The problem of obtaining a representative TS is therefore especially acute for hyperspectral images, in which the number of spectral channels (features) is measured in hundreds.
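To give a feeling for the scale of these requirements, the following minimal sketch evaluates the two rules of thumb quoted above. The function name and the example value of 200 channels are assumptions chosen purely for illustration; they do not come from the cited works.

```python
def min_training_points(k, classifier="parametric"):
    """Rule-of-thumb minimum number of labeled points per class.

    k is the dimensionality of the feature space. The factors 10 and 50
    are the ones quoted in the text for parametric and non-parametric
    classifiers, respectively.
    """
    factors = {"parametric": 10, "nonparametric": 50}
    return factors[classifier] * k


k = 200  # hypothetical number of spectral channels
print(min_training_points(k))                   # 2000
print(min_training_points(k, "nonparametric"))  # 10000
```

Even under the more lenient rule, a few labeled pixels per class fall short of these figures by several orders of magnitude.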
At the same time, a characteristic feature of aerospace image classification tasks is that a large amount of unlabeled data is always available. In such circumstances the training set can be extended with unlabeled data using methods based on clustering algorithms. Among these, parametric methods based on the EM algorithm are the most widely used [5]. However, when aerospace images are processed, a priori information about the probabilistic characteristics of the classes is usually unavailable, so these methods can lead to unsatisfactory results.
A method of training set extension based on the nonparametric soft-PARZEN algorithm is proposed in [6,7]. This method does not impose "rigid" constraints on the form of the conditional density function. However, it requires training sets for all classes present in the image, a condition that is rarely satisfied in practice. In addition, its computational cost is unacceptably high for processing aerospace images.
Another common approach to training set extension is based on semi-supervised learning. In [8] a semi-supervised support vector machine method is proposed. The support vector machine is applied to the unlabeled data in such a way as to minimize the error when classifying both labeled and unlabeled data. The main disadvantage of the method is that the objective function of this problem is hard to optimize because it is not convex [9]. To address this, a computationally efficient gradient descent optimization method is suggested in [10].
This work presents a method of training set extension based on mean shift segmentation that takes the spatial information of the image into account [11]. To demonstrate the efficiency of the proposed algorithm, the constructed training sets were used to classify the Salinas hyperspectral image [12] with the Support Vector Machine (SVM) method using radial basis functions.

The proposed method of training set extension
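The training set extension algorithm has four parameters: the spatial bandwidth h_s, the feature-space bandwidth h_r, a distance threshold in the feature space used to attach segments to a class, and the percentage of segment points added to the TS. The algorithm can be divided into three stages.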
At the first stage, the image is segmented using the mean shift procedure with spatial information taken into account. For this purpose the mean shift procedure is applied to each pixel of the image. When the coordinates of a new center are calculated, only the pixels located at a distance of no more than h_s in the image domain and no more than h_r in the feature space [11] are taken into account. The Euclidean distance between vectors is used as the distance between pixels: in the feature space it is computed from the spectral brightness vectors, and in the image domain from the pixel coordinate vectors. After that, segments with close centers (at a distance of no more than h_s in the image domain and no more than h_r/2 in the feature space) are merged, and a representative is calculated for each segment as the average of the feature vectors of all pixels belonging to the segment.
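The first stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two bandwidths are written h_s (image domain) and h_r (feature space), a flat kernel is used, the image is a nested list of feature vectors, and segments are merged greedily against the first mode found. All function names, the iteration limit, and the tolerance are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_shift_pixel(image, row, col, h_s, h_r, max_iter=50, tol=1e-3):
    """Move one pixel towards a mode; image[r][c] is a feature vector."""
    rows, cols = len(image), len(image[0])
    pos = (float(row), float(col))
    feat = tuple(float(v) for v in image[row][col])
    for _ in range(max_iter):
        r_lo = max(0, math.ceil(pos[0] - h_s))
        r_hi = min(rows - 1, math.floor(pos[0] + h_s))
        c_lo = max(0, math.ceil(pos[1] - h_s))
        c_hi = min(cols - 1, math.floor(pos[1] + h_s))
        sum_pos, sum_feat, n = [0.0, 0.0], [0.0] * len(feat), 0
        for r in range(r_lo, r_hi + 1):
            for c in range(c_lo, c_hi + 1):
                # Use only pixels that are close in BOTH domains.
                if dist(pos, (r, c)) <= h_s and dist(feat, image[r][c]) <= h_r:
                    sum_pos[0] += r
                    sum_pos[1] += c
                    for d, v in enumerate(image[r][c]):
                        sum_feat[d] += v
                    n += 1
        if n == 0:  # empty window: keep the current center
            break
        new_pos = (sum_pos[0] / n, sum_pos[1] / n)
        new_feat = tuple(s / n for s in sum_feat)
        shift = dist(pos, new_pos) + dist(feat, new_feat)
        pos, feat = new_pos, new_feat
        if shift < tol:
            break
    return pos, feat


def segment_image(image, h_s, h_r):
    """Return (members, representatives), one entry per segment."""
    dim = len(image[0][0])
    modes, members = [], []
    for r in range(len(image)):
        for c in range(len(image[0])):
            pos, feat = mean_shift_pixel(image, r, c, h_s, h_r)
            # Greedy merge: join the first segment with a close center.
            for i, (m_pos, m_feat) in enumerate(modes):
                if dist(pos, m_pos) <= h_s and dist(feat, m_feat) <= h_r / 2:
                    members[i].append((r, c))
                    break
            else:
                modes.append((pos, feat))
                members.append([(r, c)])
    # Representative: mean feature vector of the pixels in the segment.
    reps = [[sum(image[r][c][d] for r, c in pix) / len(pix)
             for d in range(dim)] for pix in members]
    return members, reps
```

The nested loops make the cost grow with the window area and the feature dimensionality, which is why a production implementation would typically be parallelized and run on a reduced feature space.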
At the second stage, a set of segments is formed for each class of the initial TS; it consists of the segments that contain points of the initial training set for this class. Segments that simultaneously contain points from several classes of the initial TS are deleted from the obtained sets. Such segments can appear in two cases: first, the parameters of the mean shift procedure may be poorly chosen; second, the initial TS may contain errors. Using points from such segments for TS extension can lead to classification errors. Then each set is extended by the segments located in the feature space at a distance of no more than the distance threshold from it. The distance to a set is calculated as the smallest of the distances to its elements, and the Euclidean distance between representatives is used as the distance between segments. After that, the segments that are simultaneously included in several sets are deleted again. These segments are located in the feature space on the boundary between classes represented in the initial TS.
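The second stage can be illustrated by the following sketch. It assumes that segments are given as lists of pixel coordinates with one representative vector each, that the initial TS is a dictionary from pixel coordinates to class labels, and that the extension is done in a single pass from the seed segments. The names `build_class_sets` and `max_dist`, and the choice to keep deleted mixed segments out of the extension step, are assumptions made for illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_class_sets(members, reps, initial_ts, max_dist):
    """Return {class label: set of segment indices} after stage two."""
    # Classes of the initial TS that occur inside each segment.
    classes_in_seg = [set() for _ in members]
    for i, pixels in enumerate(members):
        for p in pixels:
            if p in initial_ts:
                classes_in_seg[i].add(initial_ts[p])

    # Segments with labeled points of several classes are deleted.
    mixed = {i for i, cls in enumerate(classes_in_seg) if len(cls) > 1}

    # Seed sets: segments holding labeled points of exactly one class.
    seeds = {}
    for i, cls in enumerate(classes_in_seg):
        if len(cls) == 1:
            seeds.setdefault(next(iter(cls)), set()).add(i)

    # Extend each set by segments whose representative lies within
    # max_dist of the nearest seed representative.
    extended = {}
    for label, seed_ids in seeds.items():
        ext = set(seed_ids)
        for j in range(len(reps)):
            if j in mixed or j in ext:
                continue
            if min(dist(reps[j], reps[i]) for i in seed_ids) <= max_dist:
                ext.add(j)
        extended[label] = ext

    # Segments that ended up in several sets lie on class boundaries.
    count = {}
    for ext in extended.values():
        for j in ext:
            count[j] = count.get(j, 0) + 1
    return {label: {j for j in ext if count[j] == 1}
            for label, ext in extended.items()}
```

For example, with representatives at 0, 1, 5, 9 and 10 on a single feature axis, class seeds at 0 and 10, and a threshold of 5, the segment at 5 is reachable from both classes and is therefore dropped from both sets.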
At the final stage of the algorithm, the initial TS of each class is supplemented with points selected at random from the segments included in the set formed for this class; the number of points taken from a segment is a given percentage of the segment size.
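A minimal sketch of this final stage is given below. The text does not specify how a fractional number of points is rounded or whether already labeled points may be drawn again, so truncation towards zero and the exclusion of initial TS points are assumptions here, as are the function and parameter names.

```python
import random


def extend_training_set(initial_ts, class_sets, members, percent, seed=None):
    """Add randomly chosen segment pixels to the training set.

    initial_ts : {pixel: label}, the initial training set
    class_sets : {label: set of segment indices} from the second stage
    members    : list of pixel lists, one per segment
    percent    : share of each segment (in percent) to add
    """
    rng = random.Random(seed)
    extended = dict(initial_ts)
    for label, segment_ids in class_sets.items():
        for i in sorted(segment_ids):
            # Do not draw points that are already labeled.
            candidates = [p for p in members[i] if p not in initial_ts]
            n_add = min(len(candidates), int(len(members[i]) * percent / 100))
            for p in rng.sample(candidates, n_add):
                extended[p] = label
    return extended
```

Passing a fixed `seed` makes a run reproducible, which is convenient when the same experiment has to be repeated several times with different random draws.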

Experimental research
The proposed TS extension method is implemented in C++ (using the Microsoft Visual Studio 2017 IDE) with the OpenMP standard.
The experiment used the Salinas hyperspectral image (Salinas Valley, California), 512 × 217 pixels in size, obtained from the AVIRIS sensor on October 8, 1998 [12]. Features constructed by principal component analysis were used for processing. Fig. 1 shows the image in false colors (the first three principal components) and the ground truth image for 16 classes. Initial training sets containing 48 points (3 points for each class present in the image), 80 points (5 points per class), and 160 points (10 points per class) were formed by random selection from the ground truth image. Each of these sets was extended by 5, 10, 15, and 20%. The feature-space distance threshold used to attach segments was fixed at 10. The sets were extended using the first 4 principal components. Principal components were used because of the high computational complexity of the mean shift procedure, which requires repeated calculation of the Euclidean distance in a multidimensional feature space. The resulting sets were used to classify the image with the SVM algorithm based on radial basis functions. The classification was carried out in the feature space obtained by principal component analysis. The SVM implementation included in the Exelis ENVI software package was used, with default values for all configurable parameters. The classification accuracy was determined by comparing the obtained classification images with the ground truth image. Black pixels in the ground truth image were ignored.
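The accuracy measure described above can be sketched as follows. The sketch assumes that both images are stored as nested lists of integer class labels and that the black ground truth pixels are encoded as label 0; the encoding and the function name are assumptions, not details taken from the experiment.

```python
def overall_accuracy(predicted, ground_truth, unlabeled=0):
    """Percentage of correctly classified pixels.

    Pixels whose ground truth label equals `unlabeled` are skipped.
    """
    correct = total = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            if g == unlabeled:
                continue
            total += 1
            if p == g:
                correct += 1
    return 100.0 * correct / total if total else float("nan")


ground_truth = [[0, 1, 1],
                [2, 2, 0]]
predicted = [[1, 1, 2],
             [2, 2, 2]]
print(overall_accuracy(predicted, ground_truth))  # 75.0
```

In this toy example two of the six pixels are unlabeled, so the accuracy is computed over the remaining four, of which three are classified correctly.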
To average the results, the initial training sets were formed three times, and for each of these sets the experiment was repeated three times with the same parameters. The obtained classification accuracy values (in percent) are given in the table. Fig. 2 shows the dependence of the average and maximum classification accuracy on the extension percentage. A value of 0 corresponds to the initial TS.

Conclusion
This work proposes an effective method of training set extension for aerospace image classification that takes the structure of the unlabeled data into account. An experimental study on the Salinas hyperspectral image showed that the proposed method is effective even when the initial set is small (3 points per class).