Detecting anomalous road traffic conditions using VGG19 CNN Model

Anomaly detection in real-time road traffic has considerable application potential in metropolitan road safety and traffic management. Because real-time traffic scenes are affected by numerous factors, for example weather, camera viewpoints and road conditions, anomaly detection still faces many difficulties. Traffic incidents take many forms, for example crashes, vehicles on fire and vehicle breakdowns, which exhibit diverse and unknown behaviours. In this paper, we propose a model that identifies anomalies in road traffic by monitoring vehicle movement patterns in two distinct


Introduction
An ever-increasing number of families now own vehicles, and travelling by car has become a common and convenient part of day-to-day metropolitan life. Road conditions therefore receive a great deal of public attention. Bad road conditions can cause massive losses to the social economy and threaten the personal safety of drivers on the road. Because traffic cameras now record road conditions almost everywhere, it is both feasible and important to develop a method that automatically finds anomalies on roads using computer vision techniques. A traffic monitoring system equipped with such algorithms brings many benefits and conveniences. On the one hand, when anomalies occur, an automatic system can notify the traffic police immediately so that they can resolve the anomalies as early as possible. On the other hand, when planning a trip, information about road conditions is convenient for both drivers and travellers. Nevertheless, designing a computer vision algorithm to detect anomalies in road traffic is a very challenging task. One main reason is that the movement patterns of vehicles on roads are usually very complicated, and different anomalous events may show complex behaviours. At the same time, anomalous events occur rarely compared with normal events. Accordingly, developing an efficient and effective intelligent algorithm for automatic video anomaly detection is a pressing need. Many works on anomaly detection in surveillance videos can only be applied to detect specific anomalous events. For example, Mohammedi et al. develop a method to detect human violence in videos [4]. Likewise, traffic detectors only work under very restricted conditions [3].
With the development of traffic and video surveillance technologies, traffic management systems based on video surveillance have become widely used. In the intelligent processing of traffic information, the detection and recognition of traffic anomalies, for example blocking, congestion, traffic accidents, and illegal driving, have attracted the attention of many researchers because of their importance in traffic management. However, because of the complexity of traffic camera feeds, traffic surveillance video is vulnerable to external factors such as lighting, weather, and occlusions. The limitations of existing image processing and analysis technologies make the traffic parameters derived from traffic video (traffic flow, traffic density, vehicle direction, speed, and so on) highly uncertain. Considering this problem, we found manual surveillance to be difficult. We therefore propose a CNN-based surveillance system designed to learn to detect anomalies in dense traffic conditions, which handles both moving and static vehicles. Normally, vehicles keep moving on roads except under certain ordinary conditions (e.g., waiting at traffic signals). Static vehicles therefore have a higher probability of being anomalous events. In general, most anomalous events in road traffic cause vehicles to stop, for instance vehicle breakdowns, traffic accidents or traffic jams. Meanwhile, static vehicles can give us the exact location of the anomalous events. To separate static vehicles from moving traffic, we introduce a static-mode method that computes the running average of the frame sequence. In addition, inspired by the great success of deep learning in the computer vision field, we apply a deep learning-based method to static vehicle detection.
Additionally, the system is designed to further recognize images of vehicles involved in accidents.

Related Work
Vehicle monitoring, identification and tracking are essential to road traffic scene analysis and play a significant part in many related applications, for example driver assistance systems. Owing to the great success of deep learning (DL) technology, image detection has improved enormously [2] [3] [4]. In this paper, we also experiment with DL algorithms for vehicle detection and tracking. The detection of anomalies in both videos of moving vehicles and static images from traffic datasets has been studied in recent years because of the increasing interest in public security [3]. Traditional methods usually learn handcrafted features to model normal and anomalous event patterns. More recently, deep learning technology has been developed for anomaly detection following its success in the computer vision field [5].
In [6], the researchers present a generative adversarial network (GAN) based method to detect anomalies in images, using only normal data to train the models. For surveillance videos, there have been several attempts to detect human violence or anomalous events in crowd scenes. A deep anomaly ranking framework is used to predict anomalies when testing on recorded datasets. Since road conditions play a significant part in our daily life, detecting anomalies on roads has attracted the attention of many researchers. For this task, the main objective is to find where and when the anomalies happen.
M. Schubert et al. [7] propose a combination of methods to predict the occurrence of road accidents. The researchers present a visual analysis framework for exploring normal behaviour patterns and detecting anomalous events. However, recent works are designed to detect one specific anomalous event, whereas real-world anomalies on roads are complicated and diverse. Therefore, we design a novel dual-model method to detect different anomalous road traffic events in real scenes, which can have wide practical use. Traffic anomaly detection has also attracted many other researchers to conduct in-depth studies.
In [8], the authors proposed a vehicle monitoring algorithm based on computer vision, which had high counting accuracy and improved the precision of traffic surveillance. Frejlichowski et al. [9] proposed a new vehicle trajectory pattern recognition algorithm based on the CamShift algorithm, which can accurately analyse and recognize illegal parking or illegal turning of vehicles. Yang [11] effectively detected traffic incidents in highway scenes by combining fuzzy logic (FL) with improved steady analysis algorithms in the proposed system. The system analyses events by extracting traffic flow data and vehicle speed, but its detection has some limitations because of the complexity of traffic conditions. In [10], the researchers used GPS data from hired taxis, including direction and velocity, to detect congestion on urban roads. Although the accuracy of the GPS method is high, its high cost limits its application. To address these problems, an algorithm that incorporates more traffic parameters is proposed in this paper. The proposed algorithm can not only recognize traffic anomalies more accurately and rigorously but also remain effective in various situations.

VGG19 Architecture
VGGNet is a CNN (Convolutional Neural Network) architecture proposed by Karen Simonyan and Andrew Zisserman of the University of Oxford in 2014. Their paper focuses mainly on the effect of CNN depth on accuracy. The input to a VGG-based ConvNet is a 224 x 224 RGB image. A pre-processing layer takes the RGB image with pixel values in the range 0 to 255 and subtracts the mean image values, which are computed over the whole ImageNet training set. After pre-processing, the input images are passed through the weight layers, beginning with a stack of convolutional layers. The VGG16 architecture has a total of 13 convolutional layers and 3 fully connected layers. VGG uses small filters (3 x 3) with more depth instead of large filters. A stack of three 3 x 3 convolutional layers has the same effective receptive field as a single 7 x 7 convolutional layer. Another variant of VGGNet, VGG19, has 19 weight layers consisting of 16 convolutional layers and 3 fully connected layers, with the same 5 pooling layers. Both variants have two fully connected layers with 4096 channels each, followed by another fully connected layer with 1000 channels that predicts 1000 labels. The last fully connected layer is followed by a softmax layer for classification.
The walk-through of the 19 layers is as follows. The first two layers are convolutional layers with 3 x 3 filters; they use 64 filters each, which results in a 224 x 224 x 64 volume because "same" convolutions are used. The filters are always 3 x 3 with a stride of 1. After this, a max-pooling layer of size 2 x 2 and stride 2 reduces the height and width of the volume from 224 x 224 x 64 to 112 x 112 x 64. This is followed by two more convolutional layers with 128 filters, giving a new dimension of 112 x 112 x 128. After the next pooling layer, the volume is reduced to 56 x 56 x 128. Four more convolutional layers with 256 filters each follow, and a down-sampling layer then reduces the size to 28 x 28 x 256. Two more stacks, each with four convolutional layers of 512 filters, are separated by a max-pooling layer, and the second stack is followed by the final max-pooling layer.
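The receptive-field argument for small filters can be checked with simple arithmetic. The sketch below is illustrative only: the function names are ours, and it assumes plain stacked convolutions given as (kernel size, stride) pairs. It shows that three stacked 3 x 3, stride-1 layers cover the same 7 x 7 input region as one 7 x 7 layer, while using fewer weights per pair of input/output channels.

```python
def receptive_field(layers):
    """Effective receptive field of stacked conv layers.

    layers: sequence of (kernel_size, stride) pairs, input to output.
    """
    field, jump = 1, 1
    for kernel, stride in layers:
        # each layer widens the field by (kernel - 1) input-space steps
        field += (kernel - 1) * jump
        jump *= stride
    return field


def weights_per_channel_pair(layers):
    """Number of kernel weights per (input channel, output channel) pair."""
    return sum(kernel * kernel for kernel, _ in layers)


three_small = [(3, 1)] * 3   # three stacked 3 x 3 convolutions
one_large = [(7, 1)]         # a single 7 x 7 convolution

print(receptive_field(three_small), receptive_field(one_large))
print(weights_per_channel_pair(three_small), weights_per_channel_pair(one_large))
```

The stacked version also inserts a non-linearity after each small convolution, which a single large filter does not.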

Fig. 1. VGG19 Architecture
After the last pooling layer, the 7 x 7 x 512 volume is flattened and passed to the fully connected (FC) layers with 4096 channels, ending in a softmax output over 1000 classes.
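The volume sizes in the walk-through can be reproduced with a short shape trace. This is a minimal sketch, not the authors' code: it assumes the standard VGG19 block layout (2, 2, 4, 4, 4 convolutional layers with 64, 128, 256, 512, 512 filters), "same" 3 x 3 convolutions, and a 2 x 2 max-pool of stride 2 after each block. The names VGG19_BLOCKS and trace_shapes are illustrative.

```python
# (number of 3x3 conv layers, number of filters) for each VGG19 block
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def trace_shapes(size=224, blocks=VGG19_BLOCKS):
    """Return (stage, height, width, channels) after every block and pool."""
    shapes = []
    for index, (n_convs, filters) in enumerate(blocks, start=1):
        # 'same' convolutions with stride 1 keep height and width unchanged
        shapes.append((f"block{index}_conv_x{n_convs}", size, size, filters))
        # 2x2 max-pooling with stride 2 halves height and width (floor)
        size //= 2
        shapes.append((f"block{index}_pool", size, size, filters))
    return shapes


for stage, height, width, channels in trace_shapes():
    print(f"{stage:18s} {height} x {width} x {channels}")

_, h, w, c = trace_shapes()[-1]
print("flattened length:", h * w * c)
```

Calling trace_shapes(300) gives the corresponding sizes for a 300 x 300 input.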

ImageNet
ImageNet is a project that aims to provide a large image database for research purposes. It contains more than 14 million images belonging to more than 20,000 classes (or synsets). It also provides bounding-box annotations for around 1 million images, which can be used in object localization tasks. Note that the project only provides the URLs of the images, so the images themselves must be downloaded separately.

Traffic Net Dataset
The Traffic-Net dataset is a collection of traffic images intended to give a machine learning system the data it needs to detect traffic conditions, in support of real-time monitoring, analytics, and alerts [12].
Enabling machine learning systems to perceive, understand, and act in any environment is one of Deep Quest AI's goals [12].

Optimizer:
Adam is an optimization algorithm that can be used in place of classical stochastic gradient descent to train deep learning models. It combines the strengths of the AdaGrad and RMSProp algorithms to produce an optimizer capable of handling sparse gradients on noisy problems. Adam is simple to configure, and its default parameters work well on most problems.
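To make the update rule concrete, the sketch below applies Adam to a single scalar parameter. It is an illustration under stated assumptions rather than the training code used in this work: the function name adam_minimize and the toy objective are ours, and the default hyperparameters (learning rate 0.001, beta1 0.9, beta2 0.999, epsilon 1e-8) are those suggested in the original Adam paper.

```python
import math


def adam_minimize(grad_fn, theta, steps, lr=0.001,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimise a scalar function with Adam, given its gradient grad_fn."""
    m = 0.0  # running average of gradients (momentum-like term)
    v = 0.0  # running average of squared gradients (RMSProp-like term)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # bias correction compensates for m and v starting at zero
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); minimum at x = 3
result = adam_minimize(lambda x: 2 * (x - 3), theta=0.0, steps=1000, lr=0.1)
print(result)
```

Because each step is scaled by the running estimate of the gradient magnitude, the first update has a size close to the learning rate regardless of how large the raw gradient is.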

Proposed Method:
In this section, we explain the methodology followed to detect and classify anomalies occurring on roads in various countries, using a varied traffic dataset. We propose an end-to-end deep learning architecture for traffic flow and anomaly detection and classification. We experiment with a VGG19 Convolutional Neural Network and ImageNet pretrained weights to find static vehicles on roads, since anomalies normally cause vehicles to stop. Typically, most anomalies on roads are unexpected vehicle breakdowns or vehicle crashes, both of which make vehicles stop in or beside the road. Based on this observation, we introduce a motion analysis method to find static vehicles that have been involved in accidents, some of which have caught fire, and then recognize the anomalous events on that basis. The pipeline of anomaly detection based on static vehicles is presented in Figures 2, 3, 4 and 5. The dataset is split into two separate folders, one for the test dataset and one for the training dataset. Within each of these, the images are categorised into four scenarios labelled "Accident", "Sparse traffic", "Fire" and "Dense traffic". Initially, we took the training dataset to train the model to classify each image by class. To train the model, we read all the images of each category in the training directory using the OpenCV library. After the whole dataset has been read, the images are resized to the same height and width in pixels; in this experiment, we use 300 x 300 (height, width). To ease processing, every three-channel RGB image in the dataset is rescaled so that its pixel values lie in the range zero to one (referred to here as a binary-scale image).
In the same way, the images in the test directory are read, resized to the same shape, and rescaled to the zero-to-one range for validation purposes. After the image data have been aligned, the pretrained VGG19 model with ImageNet weights is used to create the model shown below in Fig. 6.
Instead of building the whole convolutional network from scratch and initializing its weights, we chose the pretrained VGG19 model. The intuition behind choosing a pretrained model is that any classifier, whatever classes it is ultimately expected to distinguish, first learns to detect simple features such as slanted lines in its early layers. Relearning these features every time a neural network is created makes little sense. Only the final layers, which determine the network's ability to identify the classes of the specific project, need to be trained. A further advantage of pretrained models is that they are trained on large datasets that are not usually available to everyone. An example is ImageNet, which contains approximately 14 million images, 1.2 million of which are assigned to one of 1,000 categories. Thus, we benefit tremendously from making use of these models []. In other words, instead of repeating the training procedure of the first network and starting from scratch with random weight values, the saved weights of the earlier experiment serve as the initial weights of the new experiment. In this case, the weights are initialized from the pre-trained network.
We therefore take the pretrained model and add a flatten layer and dense layers on top of it. The flatten layer turns the output of the convolutional layers into a one-dimensional vector. Dropout is then applied to the dense layers: the first fully connected layer has 1024 neurons, of which 30% are dropped out, and the second fully connected layer has 512 neurons, of which 30% are again dropped out. Because the chosen data have 4 different classes, the model has to decide which class each image belongs to. To do so, the final layer is another dense layer with the softmax activation function, which is commonly used in the output layer of convolutional neural networks for multi-class classification problems. The model summary is shown above in Figure 7. To compile the model, we chose the Adam optimization algorithm, an extension of stochastic gradient descent that combines ideas from the AdaGrad and RMSProp algorithms; it adjusts the weights and biases of the network to help learning and reduce the loss. As the loss function we selected categorical cross-entropy, which is typically used in multi-class classification problems. This measure quantifies the difference between two probability distributions, here the true labels and the predicted class probabilities. With the loss function in place, the remaining goal is to measure the accuracy and the loss reduction of the chosen model. To do that, we used the model.fit() function to fit the model with a batch size of 64 and 10 epochs, which means that the training data are fed through the network in batches for 10 complete passes.
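The resizing and rescaling steps can be sketched as follows. This is an assumption-laden illustration, not the code used in the experiment: the paper reads and resizes images with OpenCV, whereas the sketch uses NumPy only, with a simple nearest-neighbour resize standing in for cv2.resize. The function names are hypothetical, and rescaling is assumed to mean division by 255.

```python
import numpy as np


def resize_nearest(image, height=300, width=300):
    """Nearest-neighbour resize; a NumPy stand-in for cv2.resize."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]


def preprocess(images, height=300, width=300):
    """Resize every image to the same shape and rescale pixels to [0, 1]."""
    batch = np.stack([resize_nearest(img, height, width) for img in images])
    # 8-bit pixel values 0..255 become floats in the range 0.0..1.0
    return batch.astype(np.float32) / 255.0


# Two synthetic 8-bit RGB "images" of different sizes
images = [
    np.full((120, 200, 3), 255, dtype=np.uint8),
    np.zeros((480, 640, 3), dtype=np.uint8),
]
batch = preprocess(images)
print(batch.shape, batch.dtype)
```

Applying the same function to the training and test folders keeps both sets in the same shape and value range.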
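The output layer and the loss function can be illustrated with a few lines of plain Python. This is a minimal sketch of the standard definitions of softmax and categorical cross-entropy, not the framework implementation used for training; the function names, the example scores, and the ordering of the four classes are illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(true_one_hot, predicted, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(true_one_hot, predicted))


# Hypothetical scores for four classes; the true class is the first one
scores = [2.0, 0.5, 0.1, -1.0]
label = [1, 0, 0, 0]

probabilities = softmax(scores)
print(probabilities)
print(categorical_cross_entropy(label, probabilities))
```

The loss is zero only when the predicted probability of the true class is one, and it equals log(4) when the model assigns equal probability to all four classes.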

Results
We successfully implemented the VGG19 model to detect road anomalies and achieved a model accuracy of 95.62% and a model loss of 13% in the eighth iteration and

Conclusion
In this paper, we present a VGG19 CNN model for traffic anomaly detection in urban traffic environments, which achieved a highest accuracy of 95.6%. Because this model is practical and accurate, deploying it in cloud environments for traffic surveillance offers a good chance of alerting emergency helplines in critical situations and can also help to reduce the probability of accidents.