Mathematical modelling of an airborne technical vision system development



Introduction
Depending on the aircraft class and the list of tasks to be solved, the aircraft technical vision system is represented by a different set of multispectral sensors. It may consist of colour or monochrome television cameras, thermal imaging cameras, radar, lidar and ultrasonic sensors [1]. The images formed by the sensors have a different physical nature and a different mathematical relationship with the environment. The image from television cameras is obtained in a form familiar to the human eye; however, images from other sensors require special preprocessing before they can be displayed to the pilot. This preprocessing may include image enhancement and multispectral image fusion. At a higher level, the tasks of fusion of a real television or thermal image (RI) with a virtual image (VI) synthesized from a digital terrain map are solved, as well as the tasks of moving object detection and tracking, etc. Another of the most important tasks solved as part of the onboard technical vision system is the reconstruction of a 3D model of the underlying surface from a sequence of stereo image pairs. The whole complex of these tasks and special-purpose tasks should be solved in real time, i.e. within 0.04 seconds. This is a strict requirement, and it inevitably affects the choice of a method for solving each subproblem of the complex.
The mathematical concept of one of the subsystems of the aircraft technical vision system, designed to solve one of its most important problems, the problem of combining the radar image with the visual image and, on this basis, solving the inverse navigation problem, was first published in [2,3]. The mathematical software for this subsystem includes two blocks of tasks: a block of primary image processing tasks and a block of high-level tasks.
The primary processing unit solves the problems of image noise suppression, image enhancement, image fusion and edge detection. Algorithms for this group of problems should have the lowest possible computational complexity and, at the same time, provide a high-quality solution. To suppress discrete Gaussian noise in images, a modified version of the sigma filter is proposed [4]. The sigma filter is chosen for noise suppression due to its simplicity, its good noise-reduction behaviour among other nonlinear filters, and its ability to preserve the boundaries of brightness differences given the correct choice of the cut-off threshold [5-7]. But there is the problem of correctly determining the cut-off threshold when applying the sigma filter. The threshold is computed as $\varepsilon = m\sigma$, where $\sigma$ is the root-mean-square (RMS) estimate of the noise level and $m$ is a specified parameter. The correct threshold value makes it possible to effectively suppress noise while preserving the boundaries of brightness differences. Below we consider a fast algorithm of image noise estimation which gives an RMS estimate close to the true values.
The Canny detector [8] is rightly considered one of the best edge detectors. However, it has a rather large computational complexity and creates a large number of short lines in the resulting contour image. In [9], an original algorithm is proposed which has a computational complexity 2-3 times lower than Canny's algorithm and, at the same time, creates a contour image with a minimum number of short lines.
In the high-level tasks block, one of the most important and most difficult is the task of image fusion: of a real television or thermal image, on the one hand, and a virtual one, on the other [1]. The correct fusion of real and virtual images makes it possible to create an image which significantly increases the crew's situational awareness in poor visibility conditions. Combining the real and virtual images also allows the inverse navigation problem to be solved simultaneously, i.e. to correct erroneous values of the navigation parameters. The aircraft position in space is determined by a six-dimensional vector of navigation parameters. The first three components of the vector are the coordinates of the aircraft, and the last three are the orientation of the aircraft in space (Euler angles). From these data and a digital map of the area stored in the onboard computer memory, a virtual image is formed. In general, the aircraft sensors give the values of the vector parameters with some unknown errors. An estimate can be obtained by fusing the virtual image with the real one: it is necessary to determine such a correction to the navigation parameter vector that the virtual image generated from the corrected vector matches the real image as closely as possible.
The virtual image contains only the contours of objects constantly present on the Earth's surface. Therefore, before solving the actual problem of real and virtual image fusion, it is necessary to convert the real image to the form of a contour image. The fusion itself consists in determining a correspondence between the points of the real and virtual contour images. There are many ways to solve this problem. Correlation algorithms are among the most efficient in terms of the geometric alignment accuracy of the real and virtual images and the accuracy of solving the inverse problem [10,11]. However, these methods require the construction of digital map views and, for each view, the computation of the criterion function. As a result, the computational complexity of the method turns out to be unacceptably large, excluding its application in a real onboard technical vision system. An alternative way is to use affine or projective transformations for geometric alignment [12,13]. The computational complexity of this group of methods is rather low, allowing real-time operation; the only drawback is the complexity of solving the inverse navigation problem within this approach. In [14], a combined method for image fusion and solving the inverse navigation problem was proposed. At the algorithm launch stage, and when the alignment fails, a modified version of the correlation algorithm with low computational complexity is executed. After that, the tracking mode is turned on, in which the navigation parameter values are estimated in a sliding mode as predicted values from the previous real values.

Noise reduction
We will consider the additive image model $y(i,j) = s(i,j) + \nu(i,j)$, where $s(i,j)$ is the smooth image component and $\nu(i,j)$ is the random component: noise with zero mathematical expectation and unknown variance $\sigma^2$. The sigma filter is applied to suppress discrete Gaussian noise. The filter must preserve the boundaries of brightness differences; this is important because the next system task is edge detection. In (3), $\varepsilon = m\sigma$ is the cut-off threshold, $\sigma$ is the RMS noise estimate, and $m$ is the specified parameter. Correct sigma filter operation depends on the accuracy of the estimate $\sigma$ used to calculate the threshold value.
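As an illustration, the sigma filter described here can be sketched as follows. This is a minimal sketch, not the authors' modified version: the window radius, the default parameter values and the function name are illustrative assumptions.

```python
import numpy as np

def sigma_filter(img, m=2.0, sigma=5.0, radius=1):
    """Sigma filter sketch: each pixel is replaced by the mean of those
    window neighbours whose brightness differs from it by no more than
    the cut-off threshold eps = m * sigma (illustrative defaults)."""
    eps = m * sigma
    img = img.astype(np.float64)
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            win = img[i0:i1, j0:j1]
            mask = np.abs(win - img[i, j]) <= eps
            out[i, j] = win[mask].mean()  # the centre pixel is always included
    return out
```

With a correctly chosen threshold, a brightness step larger than $\varepsilon$ is left untouched, since pixels on the other side of the step are excluded from the average.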
There are many methods of image noise level estimation, in particular, median methods [14], block methods [15-18] and methods based on the wavelet and Fourier transforms [19,20]. In computational complexity and accuracy of estimation, the method briefly described in [18] is the most suitable for use in an onboard computer. The method is quite simple. The image is split into blocks of the same size, and the block with the minimum variance is selected. The image in the block is smoothed by the simplest linear filters with kernels (masks) of certain sizes; the cited work recommends masks of sizes 9x9 and 11x11. Then one smoothed image is subtracted from the other and the sample variance of the remainder is calculated. Based on the sample variance, the noise variance estimate of the original image is calculated.
In [18], the translation rule from the sample variance to the noise variance estimate of the original image is not given, so it had to be restored and tested in a series of model experiments. The algorithm is based on the following reasoning. If $L$ is a linear smoothing filter acting according to the rule $\tilde{y}(i,j) = \sum_{k,l} a_{kl}\, y(i+k, j+l)$ and $\nu$ is uncorrelated Gaussian noise with zero mathematical expectation and variance $\sigma^2$, then the variance of the smoothed random component is found by the formula

$$D[\tilde{\nu}] = \sigma^2 \sum_{k,l} a_{kl}^2. \qquad (5)$$

From (5) it follows that the noise standard deviation estimate can be found by the formula

$$\hat{\sigma} = \sqrt{\tilde{D} \Big/ \sum_{k,l} a_{kl}^2}, \qquad (6)$$

where $\tilde{D}$ is the sample variance of the filtered random component. Subtracting the result of one smoothing from the other is equivalent to the operator $L_1 - L_2$. This operator suppresses the smooth low-frequency component of the image. For uniform 9x9 and 11x11 masks, the mask of this operator has size 11x11. The mask weights in the first two and last two rows and columns are equal to $-1/121$; there are $121 - 81 = 40$ such coefficients. The remaining coefficients of the inner 9x9 submatrix are equal to $1/81 - 1/121$. As a result, in accordance with formula (5), the variance of the difference is $\sigma^2 \sum_{k,l} a_{kl}^2$, and in accordance with formula (6) the noise RMS estimate of the original image is found from the sample variance of the difference. For example, for 5x5 and 7x7 masks, $\sum_{k,l} a_{kl}^2 = 24/49^2 + 25\,(1/25 - 1/49)^2 \approx 0.0196$, so the formula takes the form $\hat{\sigma} \approx \sqrt{\tilde{D}}/0.14$.
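The reasoning above can be sketched in code. The sketch below builds the mask of the difference operator $L_1 - L_2$ for two uniform box filters, convolves a block with it, and rescales the residual standard deviation by $\sqrt{\sum a_{kl}^2}$ per formulas (5)-(6); the function name and default mask sizes are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def estimate_noise_rms(block, k1=9, k2=11):
    """Estimate the noise RMS in an image block: the difference of two
    uniform smoothings L1 - L2 suppresses the smooth low-frequency
    component, and the residual sample std is rescaled by the root of
    the sum of squared weights of the difference operator."""
    # Build the k2 x k2 mask of the operator L1 - L2
    a = -np.full((k2, k2), 1.0 / k2**2)       # outer ring: -1/k2^2
    p = (k2 - k1) // 2
    a[p:p + k1, p:p + k1] += 1.0 / k1**2      # inner k1 x k1: 1/k1^2 - 1/k2^2
    # Convolve over the valid region and take the residual sample std
    win = sliding_window_view(block.astype(np.float64), (k2, k2))
    resid = (win * a).sum(axis=(-2, -1))
    return resid.std() / np.sqrt((a ** 2).sum())
```

Because both box filters preserve affine brightness surfaces, a smooth (locally linear) image component contributes nothing to the residual, so the estimate reacts to the noise alone.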
Table 1 shows the results of estimating the imposed noise using the described algorithm. In this experiment, a typical image of the underlying surface (Figure 1) was taken as the initial image, and noise of a given intensity was imposed on it. Smoothing was done by 9x9 and 11x11 masks. The sizes of the blocks into which the original image was divided were varied with a step of 25. We can note good results of the noise RMS estimation when dividing the original image into 25x25 and 50x50 blocks for all noise intensity values except one. The computational complexity of the algorithm can be reduced by applying vector (one-dimensional) masks for smoothing, with the weight coefficients calculated by formula (9). Table 2 shows the results of the imposed noise estimation for the same image and for the same set of block sizes and imposed noise intensities; for image smoothing, a pair of filters with vector masks of lengths 5 and 7 was used. Comparison of Table 1 and Table 2 allows two main conclusions. First, the proposed method, in contrast to the method of [18], gives a stable RMS estimate regardless of the image block size and for all noise intensities. Second, convolution of an image fragment with a vector mask is performed faster than with a matrix mask. Also, the estimate does not undergo significant changes when only a part of the block rows is used: in the Table 2 case, smoothing was implemented only in every 4th row of the block.

Edge detection
Although brightness difference boundary detection is an auxiliary task, the accuracy of the real and virtual image alignment depends on the quality of the contour image obtained as a result of edge detection. As was mentioned in the introduction, Canny's method is one of the best edge detectors. However, in view of the requirements for onboard systems, it has a rather high computational complexity and, at the same time, creates a contour image with a large number of short lines. This feature of Canny's method was noted in the book [22].
The gradient-based edge detector proposed in [9] differs from the Canny method in the following ways:
- image pre-smoothing is absent;
- a vector mask is applied, instead of a Sobel or Prewitt matrix mask, to estimate the partial derivatives;
- a new strategy of contour image construction is applied.
The vector mask is applied to an image row or column fragment of length $n = 2p + 1$. For example, the estimate of the partial derivative in a row along the $x$ coordinate is found as the convolution $\hat{y}'_x(i,j) = \sum_{k=-p}^{p} w_k\, y(i, j+k)$, where $w_k = k \big/ \sum_{k=-p}^{p} k^2$.
Convolution of image fragments in a row or in a column with this mask provides the coefficient estimate of a linear model that is optimal in the least-squares sense, i.e. an optimal estimate of the partial derivative with respect to the corresponding coordinate. And it is a smoothed estimate; therefore, there is no need for additional image smoothing under low-intensity Gaussian noise conditions. The mask length affects the accuracy of the partial derivative estimate under noise: the higher the intensity of the random image component, the larger the mask length should be. However, a large mask length can lead to duplication of the selected line. Therefore, it is advisable to choose a sliding window length equal to 5 for images with a low noise level and equal to 7...9 for noisy images.
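The least-squares derivative mask can be sketched as follows. The sketch fits the description above (the mask weights are the LS slope coefficients $w_k = k / \sum k^2$); the function name and padding behaviour are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ls_derivative(row, n=5):
    """Least-squares estimate of the first derivative along a 1-D signal:
    convolution with the vector mask w_k = k / sum(k^2), k = -p..p, which
    equals the LS slope of a straight line fitted within the window."""
    p = n // 2
    k = np.arange(-p, p + 1, dtype=np.float64)
    w = k / (k ** 2).sum()                     # e.g. n=5: [-2,-1,0,1,2]/10
    win = sliding_window_view(row.astype(np.float64), n)
    d = (win * w).sum(axis=-1)
    return np.pad(d, (p, p))                   # zero-pad to the original length
```

On a noise-free linear ramp the estimate is exact, which reflects why no extra pre-smoothing is needed: the fit itself averages over the window.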
The MATLAB edge() function has a thresh parameter for the edge detector, with the default value 0.7. It provides guaranteed detection of "strong" lines. An increased value of the thresh parameter leads to excessive joining of "weak" lines to the "strong" ones. As a result, a large number of short lines are formed by the Canny detector, which complicates further use of the contour image. In the proposed method, automatic threshold selection provides the minimum number of short uninformative contour lines. This is achieved by choosing the lower threshold from the histogram, after which the value of the upper threshold is calculated according to certain rules [9]. Figure 2 shows the contour images obtained by the proposed method in comparison with the Canny method; the contour images are derived from the image shown in Figure 1. The advantages of the contour image obtained by the proposed method are obvious: the contours of water bodies, roads and forest clearings are much easier to detect on it than on the contour image obtained by the Canny method. This is important for automatic processing of the contour image.

2D image fusion
The search for the most suitable mathematical methods for solving the problem of real and virtual image fusion was carried out in two main directions: first, methods based on affine and projective transformations of one image to the plane of another; second, correlation-based methods of image alignment. To align contour images by geometric methods, it is necessary to select the requisite number of corresponding pairs of points (key points) on them. This subtask was solved as follows. Closed and open contours of sufficiently large length, which can be the boundaries of objects of constant presence on the underlying surface of the Earth (roads, water bodies, runways, large buildings, etc.), were distinguished on the contour image. Using the simplest but effective algorithm [2], the contours were approximated by polygons with a minimum number of vertices and edges. The vertices of the polygons on the real image and the virtual image were assigned as key points.
Here is a brief description of the algorithm for approximating the contours by polygons. A movable coordinate system is placed sequentially at each contour pixel. For each pixel, the coefficients of straight lines fitted by least squares to the $m$ contour pixels on one side and on the other side of the origin are calculated. From these lines the direction vectors $v_1$ and $v_2$ are formed; the "+" and "-" signs of the vectors are selected depending on the quarter in which the end of the vector is located. Then the cosines of the angles between the vectors $v_1$ and $v_2$ are calculated by the well-known formula. For each contour, local maxima are determined on the set of cosines of the angles; the local maximum points are accepted as the vertices of the approximating polygon.
The parameter $m$ is an input of the task; it can be used to control the order of the approximation. A contour can contain several thousand pixels. For a small value of $m$, the number of vertices of the approximating polygon will be extremely large, whereas for the correct alignment of two corresponding contours of the real and virtual images 20...30 such points are enough; this can be achieved by increasing the parameter. Figure 3a shows a runway contour, and Figure 3b shows its approximation by a polygon. The resulting polygon has 4 vertices, which is enough for the contour description. Several mathematical models were processed to find the best method of geometric transformation for image fusion, and actual image alignment based on the considered models was tested. The possibility of complex contour analysis for the alignment was also tested [3,23]. It was shown that the best alignment results are achieved by projective transformation of the virtual image to the plane of the real image based on the homography matrix.
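The vertex-detection step of the polygon approximation can be sketched as follows. The sketch simplifies the algorithm: it uses the chord directions to the $m$-th neighbouring contour pixel on each side instead of least-squares line coefficients, and the function name and parameter default are illustrative.

```python
import numpy as np

def polygon_vertices(contour, m=10):
    """Corner detection on a closed contour (list of (x, y) pixels):
    at each point, direction vectors towards the arcs m pixels back and
    m pixels forward are formed (a simplification of the fitted-line
    coefficients); vertices are the local maxima of the cosine of the
    angle between the two vectors."""
    pts = np.asarray(contour, dtype=np.float64)
    n = len(pts)
    cos = np.empty(n)
    for i in range(n):
        v1 = pts[(i - m) % n] - pts[i]   # direction towards the preceding arc
        v2 = pts[(i + m) % n] - pts[i]   # direction towards the following arc
        cos[i] = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # local maxima of the cosine are accepted as polygon vertices
    return [i for i in range(n)
            if cos[i] > cos[(i - 1) % n] and cos[i] >= cos[(i + 1) % n]], cos
```

On a straight segment the two vectors are nearly opposite (cosine near -1), while at a corner the angle between them shrinks and the cosine rises, which is why its local maxima mark the polygon vertices.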
For the homogeneous coordinates of virtual image points, the homography transformation of coordinates $(x, y)$ to the plane of the real image has the following form:

$$x' = \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + 1}, \qquad y' = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + 1}, \qquad (10)$$

or, in matrix form, $\lambda (x', y', 1)^T = H (x, y, 1)^T$. Usually, the homography matrix is unknown. But in the presence of a sufficient number of key point pairs $(x_i, y_i) \leftrightarrow (x'_i, y'_i)$, relation (10) can be considered a system of linear algebraic equations. Each key point pair generates two linear equations:

$$h_{11}x_i + h_{12}y_i + h_{13} - x'_i(h_{31}x_i + h_{32}y_i) = x'_i, \qquad h_{21}x_i + h_{22}y_i + h_{23} - y'_i(h_{31}x_i + h_{32}y_i) = y'_i. \qquad (11)$$

There are 8 unknown elements of the arithmetic vector $h = (h_{11}, \ldots, h_{32})^T$ in the system of equations (11), which means that at least 4 pairs of key points are needed to find them.
In the general case, as a result of solving the key point matching problem, their number is significantly greater, and there may be pairs with incorrect matches among them. The problem of selecting the 4-6 best pairs is traditionally solved by the RANSAC algorithm [24,25]. It was shown in [13] that the RANSAC algorithm is very sensitive to mismatched correspondences between key points: wrong correspondences can lead to a wrong estimate of the homography transform. At the same time, that work proposes another approach based on the direct involvement of the entire set of key points. The system of linear algebraic equations (11), converted to the standard notation, has the form $A h = b$. With $n$ pairs of key points, the system (11) consists of $2n$ equations. This means that the system is overdetermined and, therefore, may turn out to be inconsistent in the classical sense. A pseudo-solution to such a system can be found from the condition $\|A h - b\|^2 \to \min$. As a result, we obtain the normal equation $A^T A h = A^T b$. A detailed description of the virtual and real image fusion by projective transformation can be found in [3,13].
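The least-squares homography estimation described above can be sketched as follows. The sketch stacks the two equations (11) per key point pair and solves the overdetermined system via the normal equations; the function names are illustrative.

```python
import numpy as np

def fit_homography(src, dst):
    """Estimate the 3x3 homography H (h33 = 1) mapping src -> dst by
    stacking two linear equations per point pair and solving the
    overdetermined system A h = b via the normal equation
    A^T A h = A^T b (the pseudo-solution)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y]); b.append(yp)
    A, b = np.asarray(A, float), np.asarray(b, float)
    h = np.linalg.solve(A.T @ A, A.T @ b)
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pt):
    """Map a point through H in homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w
```

With exact correspondences and at least 4 points in general position, the normal equation recovers the homography exactly; with many noisy correspondences it returns the least-squares compromise over the whole set, which is the point of the approach of [13].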
The RI and VI fusion based on the projective transformation of the VI to the plane of the RI has good quality and is realizable in real time, but the inverse navigation problem remains unsolved. Its solution was found within a joint approach, represented by two blocks providing operation in the search and tracking modes, respectively. In the search mode, the alignment is realized by a modified version of the correlation algorithm with low computational complexity. The search mode is activated at the starting stage of the algorithm and in the case of a wrong alignment, when at least one of the six navigation parameter values leaves its predicted value. In the tracking mode, the alignment is implemented according to the predicted values of the navigation vector parameters [14].
In the search mode, the optimal estimate of the navigation parameter vector is found using a modified version of the correlation algorithm. To reduce the computational costs of the implemented algorithm, the virtual image field of view is extended (Figure 4): it expands vertically and horizontally by increasing the pitch and yaw angle values obtained from the navigation system sensors by given increments. Such an approach makes it possible to drop the brute-force search over the pitch and yaw angles while finding the optimal alignment between the RI and VI, which reduces the algorithm execution time. The next modification of the correlation algorithm is the introduction of a two-stage scan of the image coordinates. At the first stage, the coordinates are scanned in latitude, longitude and altitude. At the second stage, the coordinates are scanned with a halved step around the point found at the first stage. Such an approach made it possible to reduce the algorithm execution time by more than 3 times in comparison with scanning all fine-grid nodes.
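The two-stage scan can be sketched generically as follows. This is a minimal sketch of the coarse-then-refined grid search only: the criterion here is any callable to be minimised (a stand-in for the modified correlation criterion), and the function signature is an assumption.

```python
import numpy as np
from itertools import product

def two_stage_search(criterion, center, span, step):
    """Two-stage scan of the positional coordinates: a coarse grid pass
    over [center - span, center + span] with the given step, then a
    refinement pass with a halved step around the best coarse node."""
    def scan(c, sp, st):
        axes = [np.arange(ci - spi, ci + spi + 1e-9, sti)
                for ci, spi, sti in zip(c, sp, st)]
        best, best_val = None, np.inf
        for node in product(*axes):            # exhaustive scan of grid nodes
            val = criterion(np.array(node))
            if val < best_val:
                best, best_val = np.array(node), val
        return best, best_val
    step = np.asarray(step, float)
    p1, _ = scan(np.asarray(center, float), np.asarray(span, float), step)
    # stage 2: +/- one coarse step around the stage-1 optimum, halved step
    return scan(p1, step, step / 2)
```

The coarse pass visits far fewer nodes than a fine grid over the whole span, and the refinement pass only scans a small neighbourhood, which is the source of the reported speed-up.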
In the tracking mode, which is the main mode, the synthesis of a virtual image from the digital terrain map is carried out according to the predicted values of each navigation parameter. The prediction is carried out according to linear models, which corresponds to flight dynamics without abrupt changes in any of the 6 navigation parameters.
When the deviation of at least one navigation parameter exceeds its limits, the algorithm switches to the search mode [14].
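The linear prediction and the mode-switching rule can be sketched as follows. The function names, the two-point extrapolation and the tolerance vector are illustrative assumptions; the source does not specify the exact form of the linear model or the limits.

```python
import numpy as np

def predict_linear(history):
    """Linear extrapolation of each navigation parameter from its two
    most recent values (a simple linear motion model)."""
    prev, last = np.asarray(history[-2], float), np.asarray(history[-1], float)
    return last + (last - prev)

def select_mode(measured, predicted, limits):
    """Stay in tracking mode while every navigation parameter is within
    its allowed deviation from the predicted value; otherwise switch
    back to the search mode. `limits` are illustrative tolerances."""
    dev = np.abs(np.asarray(measured, float) - np.asarray(predicted, float))
    return "tracking" if np.all(dev <= np.asarray(limits, float)) else "search"
```

A jump in even one component (e.g. an abrupt yaw change) pushes its deviation past the tolerance and triggers the search mode, matching the switching rule described above.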

Construction of 3D models of the underlying surface
The proposed method is designed to generate 3D images of the underlying surface from a sequence of 2D image pairs of this surface obtained from a stereo pair located on board the aircraft. The algorithm consists of several blocks, each of which solves its own independent task:
- a depth map construction block based on the sequence of 2D image pairs of the underlying surface from the stereo pair;
- a point cloud construction block based on the depth maps and the camera parameters matrix;
- a key point detection block for consecutive frames based on the FAST algorithm [26];
- a block for matching key points in a pair of point clouds based on the sparse optical flow Lucas-Kanade algorithm [27];
- a block for aligning consecutive point clouds and estimating the rotation matrix.
At the first step of the algorithm, depth maps for consecutive frames are computed using the stereo pair. The depth map is built on the basis of the displacement between the corresponding pixels of the left and right images.
At the second step, point clouds are formed in three-dimensional space for consecutive frames based on the corresponding depth maps. At the third step, multiple pairs of key points are detected: key points are found on one image by the FAST detector, and the corresponding points on the next image are matched using the optical flow. The subsequent steps of the algorithm follow the logic of the ICP algorithm [28]: the neighbouring point clouds are centred, the covariance matrix is calculated, its singular value decomposition is computed, and the rotation matrix and the translation vector are calculated, so the clouds are aligned. A detailed description of the algorithm is given in [29].
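The centring/covariance/SVD alignment step can be sketched as follows. The sketch shows one ICP-style step for already-matched point clouds (the matching itself comes from the FAST + optical flow blocks); the function name is illustrative.

```python
import numpy as np

def align_clouds(P, Q):
    """One ICP-style alignment step for matched point clouds P, Q
    (N x 3 arrays, row i of P corresponds to row i of Q): centre both
    clouds, build the covariance matrix, take its SVD, and recover the
    rotation R and translation t such that Q ~= P @ R.T + t."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

Applying this step to each consecutive pair of clouds yields the chain of rotations and translations used to stitch the clouds into a single 3D model.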
Figure 5 shows the result of stitching the point clouds obtained from processing 150 stereo image pairs according to the described algorithm.

Conclusion
A mathematical approach to image alignment for an aircraft onboard vision system is proposed. The results of mathematical modelling of image noise estimation are presented. The mathematical models of the other problems are described in general terms, with references to the literature where their detailed implementations are given.

Table 2. The noise RMS estimation based on the proposed method.