Technical vision for monitoring and diagnostics of the road surface quality in the smart city program

This article is devoted to the research and development of methods for the automated detection of road-surface defects in offline mode. It discusses the problems encountered in operating an automated road scanner (ARS), as well as the modernization of the system to solve these problems using computer (machine) vision and a Field-Programmable Gate Array (FPGA). The work uses deep learning methods and analyzes various neural network architectures. About 100 terabytes of data were collected and labeled to train the neural network that recognizes road defects. It is worth noting that recognizing roadway defects is one of the most difficult tasks even for the human eye, since the contours of a defect merge with the surrounding surface. During the study, a board was developed to collect telemetric data from the road scanner's devices. To store the collected telemetry, a large data storage system was developed with replication and synchronization functions.


Introduction
In the modern world, highways are one of the main components of the transport infrastructure, allowing vehicles to travel freely around the clock, in any season, and under different weather conditions [1].
The quality of a road is determined not only by how well it was engineered and by the quality of the materials used during paving [2]; the search for, diagnosis of, and prompt repair of defects arising during operation also play a major role [3].
It is for diagnosing the roadway that an automated road scanner (ARS) was developed, which includes:
- a system for measuring longitudinal evenness;
- a system for recording defects, elements of horizontal carriageway marking, etc.;
- a system for recording roadside facilities;
- a system for measuring cross-flatness and rutting, including the elevation of defects (for example, pothole depth [4]);
- a dual-band radar sounding system;
- a position control system, including global positioning.
This article discusses the methods of implementation and the functions of automatic recognition of damaged areas of the pavement [5], [6].
The following methods can be used to accomplish the task:
- scanning with a high-precision laser scanner (3D scanner);
- recognition using line-scan cameras and a machine vision algorithm;
- building a point cloud using LiDAR.
Each of these methods has its advantages, disadvantages, and nuances. To choose one of them, it is necessary to consider them in more detail.
Scanning method using a high-precision laser scanner (3D scanner). This method uses a high-precision triangulation laser scanner. Thanks to the constant progress achieved in electronics and sensors in recent years, devices with triangulation lasers have proven to be a reliable technology for contactless surface measurement [6, p. 16].
Laser triangulation is an active stereoscopic technique that, based on the principle of topographic direct intersection, determines the position of a point in space within the instrument's reference system. According to the scheme in Figure 1, a laser emitter generates a beam at an angle (α), which is known from the pre-calibration of the mirror. The beam falls on the measured surface of the object at point (A), where it is reflected; the amount of reflection depends on the type of surface. Part of the reflected signal falls on a receiving sensor located at a known distance from the emitter, called the baseline (b). The angle (β) of the reflected beam is unknown, but it can be calculated using trigonometric formulas. Knowing the focal distance (c), the position of the laser spot (Px, Py), and the angle of the reflected beam, the coordinates of the scanned point can be found. Repeating this operation for all points into which the object's surface is discretized, one can determine their coordinates and then digitize the surface as a three-dimensional point cloud [7].
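The triangulation geometry described above can be sketched in a few lines of Python. The sketch below is illustrative, not part of the ARS software: it assumes the emitter at the origin, the sensor at distance b along the baseline, and both angles measured from the baseline.

```python
import math

def triangulate_point(alpha_deg, beta_deg, baseline):
    """Recover the (x, z) coordinates of the laser spot from the
    emission angle alpha, the reflection angle beta (both measured
    from the baseline), and the emitter-sensor distance b.

    From tan(alpha) = z / x and tan(beta) = z / (b - x):
        z = b * tan(alpha) * tan(beta) / (tan(alpha) + tan(beta))
    """
    ta = math.tan(math.radians(alpha_deg))
    tb = math.tan(math.radians(beta_deg))
    z = baseline * ta * tb / (ta + tb)  # height above the baseline
    x = z / ta                          # offset along the baseline
    return x, z

# Symmetric case: both angles 45 degrees, baseline 1 m -> the point
# lies midway along the baseline, half a metre above it.
x, z = triangulate_point(45.0, 45.0, 1.0)  # -> (0.5, 0.5)
```

Repeating this computation for every scanned point yields the point cloud from which the surface is reconstructed.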
After the triangulation process, the computed point coordinates can be transformed into a three-dimensional surface of the object. Depending on the type of trace projected onto the object's surface, laser systems can be classified as "single-spot", "line", or "multi-line" [8].

Recognition method using line-scan cameras and a machine vision algorithm.
This method uses line-scan cameras, specialized light sources, and object-recognition software. A line-scan camera makes it possible to capture highly detailed images of the roadway. Cameras of this type are used primarily in machine vision, since the image is formed by scanning the object: the sensor contains only one or a few rows of pixels, unlike cameras with a matrix sensor. Two problems complicate the operation of the ARS:
- small but critical desynchronization of the data received from the scanner systems;
- the long (manual) processing time of the collected data.
Because scanning occurs at a relatively high speed, on average 70 km/h, even the slightest delays in the operation of the ARS systems must be taken into account. This problem is solved by a synchronizer built on an FPGA (Field-Programmable Gate Array), which synchronizes the control signals of the scanner systems. An FPGA is a semiconductor device that can be configured by the developer after manufacture and is programmed in the VHDL hardware description language [9]. A typical FPGA consists of three kinds of blocks: logic blocks, input/output (I/O) blocks, and programmable switches. Logic blocks implement the basic binary operations OR, NOR, AND, NAND, and XOR; I/O blocks exchange signals through the external pins of the chip; and programmable switches create connections between the internal blocks [3].
The FPGA-based synchronizer makes it possible to change the configuration of the ARS without changing the hardware of the device. It provides the following functions: "Multiplier / Divider", "Front Delay", and "Quadrature Encoder" [5].
Figure 2 shows a 3D model of the FPGA-based synchronizer. The effectiveness of this device can be seen in the example of the defect-recording system, which consists of three line-scan cameras: before the introduction of the synchronizer, the frame-count error was 10-20 frames per 100 meters; afterwards, the desynchronization was reduced to 1-2 frames per 100 meters.
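As a rough illustration of the "Multiplier / Divider" function, the sketch below models in Python how a hardware divider turns a dense wheel-encoder pulse train into less frequent camera triggers. The real function runs in FPGA logic; this software model only mirrors the counting behaviour and is not part of the ARS firmware.

```python
def divide_pulses(encoder_ticks, divisor):
    """Emit one trigger for every `divisor` encoder ticks,
    mirroring the counter a hardware pulse divider would keep."""
    triggers = []
    count = 0
    for tick in encoder_ticks:
        count += 1
        if count == divisor:   # counter full: fire a trigger
            triggers.append(tick)
            count = 0          # and reset for the next cycle
    return triggers

# Every third wheel-encoder tick fires the cameras.
camera_triggers = divide_pulses(range(10), 3)  # -> [2, 5, 8]
```

Tying every camera exposure to the same divided encoder signal is what keeps the three line-scan cameras aligned regardless of vehicle speed.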
Most of the time spent processing the data received from the cameras and sensors is taken up by the manual selection of markings and pavement defects, which increases the total time needed to diagnose the road surface. To solve this problem, a computer vision system was introduced that automatically identifies areas with markings and defects and calculates their areas [4], [6], [9].
In general, machine vision is a branch of robotics that uses image analysis to solve industrial problems. Its use in this project significantly reduces the processing time.
Typical machine vision systems consist primarily of a camera, a computing device, a specialized light source, and software. Since the cameras are already installed on the ARS and their output is stored in the onboard system's external memory, recognition is performed on the finished images immediately after scanning.
The recognition program is implemented in the Python programming language using the following libraries and extensions:
- OpenCV, a library of computer vision, image processing, and general-purpose numerical algorithms;
- NumPy, a library with support for multidimensional arrays and high-level mathematical functions;
- Matplotlib, a library for data visualization with two-dimensional (2D) graphics (3D graphics are also supported);
- PyQt, a set of Python bindings for the Qt graphical framework, packaged as a Python extension.
In general, the algorithm of the program consists of the following steps:
1. Load the image into a buffer as a matrix: `imagecv = cv2.imread(full_path, 0)`, where `full_path` is a variable holding the path to the image, and `0` loads the image in grayscale, so that the value of each pixel corresponds to its intensity.
2. Blur the image to remove the noise that arises when photographing the roadway.
3. Threshold the image: if a pixel's intensity is greater than the specified threshold, it is assigned one value (for example, white); otherwise it is assigned another (for example, black). That is, the image is binarized, with each pixel taking one of only two values depending on the chosen boundary.
4. Detect the edges of objects in a given range: `edges = cv2.Canny(th, 1, 255)`. Because the image was binarized beforehand, the borders lie between white and black pixels, where the white pixels are the object and the black ones are the background.
5. Add the found contours to the original image.
Figure 3 shows the original image of the roadway and the image with the selected areas [7], [8].
The use of computer (machine) vision significantly reduces the time needed to process the data received from the ARS: the average processing time for one image is 2-5 seconds, which is far less than manual processing [10].

Conclusion
In conclusion, we would like to note that the synchronizer is not a panacea, since it was not possible to eliminate the desynchronization completely; solving this problem would require rethinking the concept of the entire project. In turn, the introduction of machine vision significantly reduces the time needed to process images from the ARS, but to further increase accuracy, machine learning algorithms could be introduced.
The reported study was funded by RFBR, project number 19-29-06036.