Traffic Sign Detection for Intelligent Transportation Systems: A Survey

Abstract: Recently, intelligent transportation systems (ITS) have attracted growing attention because of their wide range of applications. Traffic sign detection and recognition (TSDR) is an essential task of ITS. It enhances safety by informing drivers about the current state of traffic signs and offering valuable information about precautions. This paper reviews the traffic sign detection (TSD) methods prevalent in recent literature. The methods are divided into color-based, shape-based, and machine learning based ones. Color space, segmentation method, features, and shape detection method are the criteria considered in the review of the detection module. The paper presents a comparison between these methods. Furthermore, a list of publicly available data sets and a discussion of possible future work are provided.


INTRODUCTION
Advanced driver assistance systems (ADAS) are developed to enhance vehicle systems for safety and better driving. These systems can include road sensors, in-vehicle navigation services, electronic message signs, traffic management and monitoring, etc. Safety features are designed to avoid accidents by alerting the driver to potential danger, or to avoid collisions by implementing safeguards and taking control of the vehicle. Their main difficulty is perceiving the vehicle's environment in real outdoor scenes (El Jaafari et al., 2016b). Therefore, TSDR plays a critical role in ADAS. TSDR systems aim at locating and identifying traffic signs within scene images. They enable a large number of applications, such as driver support systems, reporting traffic violations to a traffic control center, highway maintenance, automation of driver license examinations, and intelligent autonomous vehicles.
Traffic signs are usually divided into categories depending on their shapes and colors, e.g., red-rimmed triangular danger signs, red-rimmed circular speed limit signs, and blue circular mandatory signs. In practice, however, the conditions under which traffic signs appear are complex, which makes the detection and recognition tasks difficult for these systems.
TSDR is usually tackled with a two-step approach: detection and recognition. Some works integrate a tracking process, which associates the detected sign across a sequence of frames. In the detection process, the aim is to localize the regions that contain traffic signs within scene images. The recognition process aims at labelling the detected sign according to the information included in its pictogram. In this paper, the TSDR problem is studied by presenting a survey of the works published in the last 10 years, covering publicly available traffic sign data sets and the detection stage of TSDR systems for intelligent transportation systems.
The remainder of the paper is organized as follows. Section 2 deals with the publicly available traffic sign data sets. Recent works on TSD are reviewed in Section 3. In Section 4, a discussion and an outlook on future research directions are presented. Section 5 concludes the paper.

DATA SETS
The quality of TSD results varies with the method used by each research group. However, we cannot decide which method gives better results when these methods are evaluated on different data. For instance, it is impossible to know how a system responds to illumination, occlusion, or sign disorientation problems when there is no clear specification of the image data set used. Furthermore, some studies use small sets of images or data with little variety. In general, compiling sets of traffic sign images is a difficult and time-consuming task, which explains the lack of standardised databases in the field.
Many research groups have presented publicly available traffic sign data sets. One of the most widespread is the German Traffic Sign Recognition Benchmark (GTSRB), presented in (Stallkamp et al., 2011). This data set was created for the TSR competition at the International Joint Conference on Neural Networks (IJCNN) 2011. The GTSRB data set includes 51839 German traffic signs (39209 for training and 12630 for testing) in 43 classes. These classes have been divided into six subsets: speed limit, unique, danger, mandatory, derestriction, and other prohibitory signs. The sign images vary in size between 15×15 and 222×193 pixels and contain a 10% margin. All the images are annotated, and the data set provides the original size and the locations of the regions of interest (ROIs), which means that results can be verified easily. The GTSRB is primarily oriented to the recognition process, since each image contains exactly one sign without much background. For the detection problem, the German Traffic Sign Detection Benchmark (GTSDB) data set was created for a competition held at IJCNN 2013 (Houben et al., 2013). It contains 900 images (600 for training and 300 for testing) and is divided into three categories (prohibitory, mandatory, and danger signs). This division accommodates detection approaches with different properties.
The other two large data sets are the Swedish Traffic Signs (STS) (STS, ) and the Belgium Traffic Signs (BTS) (BTS, ). The STS data set contains 20000 images, of which 20% are labelled. The signs belong to 7 different classes. The images are 1280 × 960 pixels, and the signs vary in size between 3 × 5 and 263 × 248 pixels. The BTS data set includes more than 17000 images. It is divided into a detection data set (10000 images) and a classification data set (7000 images). Moreover, it includes video tracks that serve the tracking purpose. The images and videos were recorded on Belgian roads.
Information about these and other data sets is summarized in Table 1.
The images included in these data sets were captured in different climatic conditions, at various times and positions, and under various visibility conditions. Moreover, all these images, except for those in the Spanish TSD data set, are annotated (in the STS data set, only every fifth frame is annotated). Among these data sets, the German benchmark is the biggest one. The majority of recent works use this data set for evaluation. Thus, a comparison between these works is possible, since they are evaluated on the same data.

TRAFFIC SIGN DETECTION
TSD methods can be divided into color-based, shape-based, and machine learning based ones, since traffic signs are designed in predetermined colors and shapes. Some authors, however, consider both cues (color and shape) to perform the detection. In this survey, we follow this three-way division.

Color-based methods
Traffic signs are usually colored in strongly noticeable, contrasting colors. Color-based methods rely on these colors to perform the detection. The segmentation techniques and color spaces used vary from one research group to another. The most intuitive color space is RGB. However, it is very sensitive to lighting changes, and its components are highly correlated. Therefore, a normalized RGB space is used to overcome these problems. Authors in (Ruta et al., 2010) employ a color enhancement technique to extract blue, red, and yellow regions. They emphasize the pixels where the given color component dominates the other components in the RGB space. In (Gudigar et al., 2016b), RGB space is used to segment the original image as a first step to detect traffic signs. ROIs are then segmented based on multiple thresholding techniques with a novel environmental selection strategy. The environment is computed using the global mean of the intensity values and is used to differentiate daylight scenes from night scenes. In (Liang et al., 2013), the thresholding in RGB color space is performed using an SVM. The classifier is first trained using target colors, then the decision for each pixel is made according to its RGB components.

To detect the white color, the achromatic decomposition of the image is usually used (Maldonado-Bascon et al., 2007) (Ellahyani and El Ansari, 2016). It was proposed in (Liu et al., 2002) and is given as follows:

\[ f(r, g, b) = \frac{|r - g| + |g - b| + |b - r|}{3d} \qquad (1) \]

where r, g, and b are the brightness values of the respective color components, and d represents the degree of extraction of an achromatic color.

HSV and HSI spaces are also very popular because they are based on human color perception and are largely invariant to illumination variations (Lahmyed et al., 2019). Many researchers (Pazhoumand-dar and Yaghoobi, 2013) (Souani et al., 2014) (Ellahyani et al., 2018) have used these color spaces. Some determine fixed thresholds empirically in the HSI space to perform the segmentation. These thresholds define the range of each component within which the target color lies. HSI and HSV components are not trustworthy when dealing with white. Therefore, the achromatic decomposition in (1) is used to perform the segmentation of white signs. Others (Ruta et al., 2010) apply color enhancement using look-up tables (LUTs) for the H and S channels to improve the segmentation performance. The general idea of these LUTs is that if one channel has a low value, it can be compensated by the other channel if that channel's value is high. Once the LUTs are applied, the image is normalized.
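As a rough illustration of the per-pixel tests discussed above, the following Python sketch implements normalized RGB, the achromatic measure in its commonly cited form f = (|r - g| + |g - b| + |b - r|) / (3d), and a fixed-threshold red test in HSV. The function names, the value d = 20, and all HSV thresholds are assumptions chosen for illustration; they are not taken from any of the surveyed systems.

```python
import colorsys


def normalized_rgb(r, g, b):
    """Return chromaticity coordinates: each channel divided by r + g + b."""
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)  # pure black: chromaticity is undefined
    return (r / total, g / total, b / total)


def achromatic_measure(r, g, b, d=20):
    """f(r, g, b) = (|r - g| + |g - b| + |b - r|) / (3d).

    d is the degree of extraction; d=20 is an assumed illustrative value.
    """
    return (abs(r - g) + abs(g - b) + abs(b - r)) / (3 * d)


def is_achromatic(r, g, b, d=20):
    """Treat f < 1 as achromatic (white, grey, or black)."""
    return achromatic_measure(r, g, b, d) < 1.0


def is_red_hsv(r, g, b, hue_tol=0.05, min_sat=0.5, min_val=0.2):
    """Fixed-threshold red test in HSV; all thresholds are hypothetical."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    near_red = h <= hue_tol or h >= 1.0 - hue_tol  # hue wraps around at red
    return near_red and s >= min_sat and v >= min_val


if __name__ == "__main__":
    print(normalized_rgb(200, 50, 50))
    print(is_achromatic(250, 248, 252), is_red_hsv(200, 30, 30))
```

In this sketch a near-white pixel passes the achromatic test but fails the HSV red test because its saturation is close to zero, which is the reason white signs are handled by a separate rule.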
The main weakness of color-based methods is that color is not always reliable, owing to changes in weather conditions, the orientation of signs relative to the sun, the time of day, etc. These parameters vary frequently in outdoor scenes. Moreover, other objects with the same colors as traffic signs frequently appear in the scenes. Therefore, colors are usually used to obtain the ROIs, not to perform the detection itself.

Shape-based methods
Traffic signs are always designed in specific shapes (circles, triangles, rectangles, etc.). Thus, shape is a very important cue for TSD. Most color-based methods consider geometric information together with color information. Others rely on robust features and an effective classifier. In the literature, texture, shape, and color features have been heavily investigated (Hu and Li, 2016). Although several features are available, the choice of features depends on the detection method itself. Researchers usually try to adapt these features to the TSD problem by integrating the principal cues of the signs. The first feature that comes to mind is the edge. Authors often use Sobel, Prewitt, or Canny detectors to extract the edges from grayscale images (Houben, 2011) (Ruta et al., 2011) (Deguchi et al., 2011) (Timofte et al., 2014). The Hough transform is another technique that uses edge information to detect shapes. However, it is time-consuming and thus not appropriate for real-time applications (Gudigar et al., 2016a). Authors in (Greenhalgh and Mirmehdi, 2012) detect ROIs as maximally stable extremal regions (MSERs), which offer robustness to variations in lighting conditions. RGB normalization is employed to obtain the grayscale images used by the MSER detector. The Fast Fourier Transform (FFT) is another geometry-based method used in TSD. The Histogram of Oriented Gradients (HOG) has become one of the most common choices for TSD systems. It was introduced by Dalal and Triggs (Dalal and Triggs, 2005) and first used for pedestrian detection. A set of HOG features is employed in (Overett et al., 2009) to design a classifier with a boosting approach to detect both pedestrians and traffic signs. In (Timofte et al., 2014), the authors use Haar wavelet and HOG features together with SVM and AdaBoost classifiers to detect traffic signs in images. Moreover, they combine 2D and 3D approaches to enhance their results.
Authors in (Creusen et al., 2010) extended the HOG features to the RGB space. Integrating the color cue into the feature improves the system performance. The major problem of detection schemes based on a sliding window and hand-designed features is that they are usually not efficient enough to meet the requirements of real-time applications.
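To make the HOG idea more concrete, here is a minimal Python sketch of the histogram computed for a single cell: gradients are estimated with central differences, and each pixel votes for its unsigned gradient orientation with a weight equal to the gradient magnitude. This is a deliberate simplification, assuming hard bin assignment and a single cell; the full descriptor of Dalal and Triggs also interpolates votes between bins and normalizes over overlapping blocks. The function names and the choice of 9 bins are illustrative assumptions.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grayscale values. Gradients use central
    differences on interior pixels only; border pixels are skipped.
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no orientation to vote for
            # Fold the angle into [0, 180) so opposite directions coincide.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


def l2_normalize(hist, eps=1e-6):
    """Scale the histogram to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in hist) + eps * eps)
    return [v / norm for v in hist]


if __name__ == "__main__":
    # Intensity ramp along x: every interior gradient points along 0 degrees.
    ramp_x = [[10 * x for x in range(8)] for _ in range(8)]
    print(l2_normalize(cell_orientation_histogram(ramp_x)))
```

In a sliding-window detector, such histograms would be computed for every cell of every window position and scale, which gives some intuition for why these schemes struggle to run in real time.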
Shape-based methods are a good alternative when colors are missing or hard to detect. Most existing methods use both color segmentation and geometric information to detect traffic signs. These methods should avoid the difficulties related to relying on color for sign detection and be robust to in-plane transformations such as translation, scaling, and rotation.

Machine learning based methods
Recently, many researchers have employed machine learning techniques to detect traffic signs in images and video sequences. In fact, for traffic sign detection and recognition in complicated driving scenes, methods using hand-crafted features are not robust enough to distinguish real signs from fake ones (Yuan et al., 2019). Neural networks (NNs) are a popular choice for the detection process. Kumar et al. (Kumar et al., 2019) used NNs for both the classification and detection processes. In (Ellahyani and El Ansari, 2017a), random forests were used to segment scene images. The authors used mean-shift clustering as a pre-processing step. Then, a random forest classifier detects the regions with the desired colors. Likewise, in (Ellahyani and El Ansari, 2017b), the authors used SVMs to classify the color-segmented blobs as triangles, circles, and rectangles.
Features extracted by deep learning methods can be more semantically meaningful than those used by classic machine learning algorithms (El Jaafari et al., 2020). In 2017, Shustanov and Yakimov (Shustanov and Yakimov, 2017) employed convolutional neural networks (CNNs) for traffic sign detection. The authors used cascaded detectors with HOG features and Haar features separately to detect candidate positions of traffic signs in an image. In (Yuan et al., 2019), the authors used an end-to-end deep learning method for traffic sign detection in complex environments. A multi-resolution feature fusion network architecture is employed. Furthermore, they frame traffic sign detection as a spatial sequence classification and regression task, and propose a vertical spatial sequence attention (VSSA) module to gain more context information for better detection performance. The well-known YOLO and SSD methods are real-time single-stage detectors. SSD exploits multi-layer features for detection to improve performance. Because single-shot detectors are better suited to real-time operation, many methods based on a single-shot framework (Shan and Zhu, 2019) (Gao et al., 2019) (Jin et al., 2020) have been proposed for traffic sign detection systems. Table 2 gives an overview of different detection methods.

DISCUSSIONS AND FUTURE DIRECTIONS
Various approaches for TSD are presented in the previous sections. Here, a performance comparison of these methods and future directions of research in TSD are presented.
The identification of traffic signs in scene images is carried out in two main stages: detection and recognition. Many research groups integrate a tracking stage to deal with successive frames of scene images (Moutarde et al., 2007) (González et al., 2011) (Meuter et al., 2011) (Ruta et al., 2011) (Keller et al., 2008). Each detected traffic sign is tracked over time by predicting its position in the next frame. The tracking process is usually carried out using a Kalman filter (Ruta et al., 2010) (Ruta et al., 2008) (Fang et al., 2003). It strengthens TSDR systems, since the detection and recognition can use multiple images of the same traffic sign. Furthermore, the search space in the next frame is reduced, and therefore the memory use and the execution time are reduced (Fang et al., 2003). However, we focus only on the detection and recognition modules, leaving tracking for future work.
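The prediction step mentioned above can be sketched with a constant-velocity Kalman filter applied to the center of a detected sign. The Python code below is only an illustrative sketch under several assumptions: the x and y coordinates are filtered independently, the process noise is a simple diagonal term, and the class name `AxisKalman`, the helper names, and all noise values are hypothetical rather than taken from the cited trackers.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one image coordinate.

    State is (position, velocity); only the position is measured.
    q (process noise) and r (measurement noise) are assumed values.
    """

    def __init__(self, position, q=1e-2, r=1.0):
        self.p, self.v = float(position), 0.0
        self.P = [[10.0, 0.0], [0.0, 10.0]]  # initial state covariance
        self.q, self.r = q, r

    def predict(self, dt=1.0):
        """Propagate state and covariance: P <- F P F^T + Q."""
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                 # innovation variance
        k0, k1 = a / s, c / s          # Kalman gain
        residual = z - self.p
        self.p += k0 * residual
        self.v += k1 * residual
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


def predict_next_center(centers):
    """Filter a sequence of (x, y) centers and predict the next one."""
    kx, ky = AxisKalman(centers[0][0]), AxisKalman(centers[0][1])
    for zx, zy in centers[1:]:
        kx.predict()
        ky.predict()
        kx.update(zx)
        ky.update(zy)
    return kx.predict(), ky.predict()


def search_window(center, half_size):
    """Square region around the predicted center, as (x1, y1, x2, y2)."""
    cx, cy = center
    return (cx - half_size, cy - half_size, cx + half_size, cy + half_size)


if __name__ == "__main__":
    track = [(100 + 5 * t, 50 + 2 * t) for t in range(20)]
    print(search_window(predict_next_center(track), half_size=16))
```

Restricting the detector to such a window in the next frame is one way the reduction of the search space could be realized.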
The detection step is carried out using color, shape, or both properties. Color is an important cue in TSD systems, since it can significantly reduce the number of regions produced by low-level image processing operations. However, color segmentation is not always reliable in outdoor scenes, where weather and illumination change frequently. Furthermore, there are other objects with the same colors as traffic signs in the street. The segmentation process can be improved by integrating preprocessing steps for color correction, enhancing the target colors, or selecting an optimal color space or a combination of several. On the other hand, the basic drawback of shape-based methods is the number of false positives they produce. This is due to the lack of color information (Boumediene et al., 2013). Therefore, the use of both cues (color and shape) leads to the best results. Table 2 gives a description of state-of-the-art methods used in TSD within the last 10 years.

Table 2: Reference | Year | Segmentation method | Features | Detection method
(Shan and Zhu, 2019) | 2019 | - | - | SSD
(Ellahyani and El Ansari, 2017a) | 2017 | Mean shift clustering + RF | Log-polar transform | Cross-correlation
(Gao et al., 2019) | 2019 | - | - | SSD
(Wang et al., 2014) | 2014 | Thresholding | HOG | SVM
(Madani and Yusof, 2016) | 2016 | Learning Vector Quantization (LVQ) | Binary image | Bitwise logical operator
(Ellahyani et al., 2016b) | 2016 | Enhancement + Thresholding | DtBs | RF
(Shustanov and Yakimov, 2017) | 2017 | HOG and Haar | - | CNN
(Fleyeh and Davami, 2011) | 2011 | Thresholding | Eigen vectors | Euclidean distance
(Bascón et al., 2010) | 2010 | Thresholding | |
The methods shown in the table are listed in terms of segmentation method, features used, and detection method. The quality of the results obtained by these research groups varies with the method and the data used. The use of different data sets, and the focus of some systems on only a few categories such as danger or speed limit signs, make the comparison between these methods difficult. However, since the GTSDB data set was created for the competition held at IJCNN 2013, the majority of TSD works use this data set to test the performance of their methods. The evaluation is based on precision-recall curves, where the recall is computed as follows:

\[ \text{recall} = \frac{tp_{\text{detected}}}{tp_{\text{total}}} \times 100 \qquad (2) \]

where tp denotes the true positives. The precision is computed analogously, as the percentage of all detections that are true positives. Fig. 1 depicts the precision-recall curves of the 10 highest ranked results of the competition held at IJCNN 2013 (Houben et al., 2013). The results are presented for the prohibitory, mandatory, and danger sign categories of the GTSDB data set. Tables 3, 4, and 5 list these results in terms of area under the precision-recall curve (AUC) and average overlap.
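The evaluation measures can be illustrated with a short Python sketch that matches detected boxes to ground-truth boxes by their Jaccard overlap and then computes precision and recall as percentages. The greedy one-to-one matching, the function names, and the overlap threshold of 0.6 are assumptions made for this example, not a restatement of the official benchmark protocol.

```python
def jaccard_overlap(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, min_overlap=0.6):
    """Return (precision, recall) in percent.

    A detection is a true positive if it overlaps a not-yet-matched
    ground-truth box by at least min_overlap (assumed threshold).
    """
    matched = set()
    true_positives = 0
    for det in detections:
        for i, gt in enumerate(ground_truth):
            if i not in matched and jaccard_overlap(det, gt) >= min_overlap:
                matched.add(i)
                true_positives += 1
                break
    precision = 100.0 * true_positives / len(detections) if detections else 0.0
    recall = 100.0 * true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    det = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 21, 31, 31)]
    print(precision_recall(det, gt))
```

Sweeping the detector's confidence threshold and recomputing both values at each setting would trace out a precision-recall curve, from which the area under the curve can be obtained.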
We can see that TSD is a very well studied problem, and many detection solutions have been proposed. In past years, the main problem of TSR was the lack of standardized traffic sign image data sets. However, this problem has been overcome, since many such data sets are now publicly available. Thus, methods can be evaluated by comparison with other state-of-the-art works. At the moment, TSD systems still face many other problems, such as:
• Interchanging the individual TSDR modules is possible, since these systems are long chains of different approaches.
• Many of the proposed algorithms do not meet the requirements of real-time applications.
• The focus is on some categories of traffic signs, such as danger or speed limit signs, although the detection and recognition of other signs may be more interesting.
• Traffic signs that are irrelevant to the road currently used by the driver can be detected (see Fig. 2).
• Many of the publicly available data sets do not include images captured under unfavorable conditions (at night, in cloudy weather, etc.).

CONCLUSIONS
In this paper, a description of state-of-the-art methods used in TSD within the last 10 years has been presented. The detection methods were divided into color-based, shape-based, and machine learning based ones, although many of these methods use all of these cues. Furthermore, a list of publicly available data sets as well as a comparison between the detection methods involved in the IJCNN 2013 competition have been presented. However, determining the best algorithm among the existing ones is not easy, since each of these algorithms has its advantages and drawbacks. Despite all these efforts, TSDR is still considered a big challenge by various research groups.

REFERENCES
El Jaafari, I., El Ansari, M., Koutti, L., Mazoul, A., and Ellahyani, A. (2016b). Fast spatio-temporal stereo matching for advanced driver assistance systems. Neurocomputing, 194:24-33.