A Survey of DDoS Attacks Using Machine Learning Techniques

DDoS (Distributed Denial of Service) attacks are among the most destructive attacks, interrupting the safe operation of essential services that organizations deliver over the internet. These attacks are becoming more complex and are expected to grow in number, which makes detecting and combating them challenging. Hence, an advanced intrusion detection system (IDS) is required to identify anomalous internet traffic behaviour. The work in this article is based on a recent dataset containing current forms of DDoS attack, including HTTP flood and SIDDoS. This study brings together well-known classification methods such as Naïve Bayes, Multilayer Perceptron (MLP), SVM, and decision trees.


Introduction
Numerous kinds of network attack have emerged with the expansion of computer networks, particularly the internet. The international ransomware WannaCry recently disrupted internet services in around 156 countries. According to Kaspersky Lab results for the fourth quarter, botnet-assisted attacks targeted resources in nearly 69 countries. That quarter also saw the longest botnet-based DDoS attack, which lasted roughly 15.5 days (371 hours). Crackers, or black-hat hackers, are constantly creating new forms of multilayered DDoS attack that occur mainly at the network and application layers of the OSI model. Such attacks use spoofed IP addresses to confound source detection and to carry out large-scale attacks. At their peak, the attack traffic consumes the entire network bandwidth, crowding out legitimate packets. The victims include government entities, finance companies, defense forces and military agencies. Well-known sites such as Facebook, Twitter and WikiLeaks have also been victims of DDoS attacks, suffering interruptions to routine operations that resulted in financial losses, degraded service and loss of access.
This article discusses different machine learning methods, such as SVM, naïve Bayes and decision trees, for detecting and analyzing different forms of these attacks, including Smurf, UDP flood and HTTP flood. Here the work has been performed on a novel dataset containing new kinds of DDoS attack, since no existing dataset covers current DDoS attacks at various layers, including SIDDoS and HTTP flood [1]. A comparative analysis of the various classification methods is carried out, and the empirical data show that MLP reached the best precision rate. A DDoS attack floods the capacity or infrastructure of a targeted network or of one or more application servers. As Figure 1 shows, an attack frequently originates from several infected systems (e.g. a botnet) that flood the targeted network with traffic.

A. UDP Flood
UDP flood is a volumetric Denial-of-Service (DoS) attack in which the attacker overwhelms random ports on the target host with IP packets carrying User Datagram Protocol (UDP) datagrams. As Figure 2 shows, during this form of attack the host checks for applications associated with these datagrams. If none is found, the host sends a "Destination Unreachable" packet back to the sender. The result of this bombardment is that the network is flooded and becomes unresponsive to legitimate traffic.

B. ICMP (Ping) Flood
Ping flood, also known as ICMP flood, is a popular Denial of Service (DoS) attack in which an attacker takes down a victim's device by flooding it with ICMP echo requests, also called pings. As Figure 3 illustrates, the attack overwhelms the victim's network with request packets, knowing that the system will respond with just as many reply packets. Other ways to take a target down with ICMP requests include custom tools or code, such as hping and scapy.

C. Smurf Attack
This is another type of DDoS attack, in which large numbers of Internet Control Message Protocol (ICMP) packets carrying the victim's spoofed source IP are broadcast to a computer network via an IP broadcast address. As Figure 4 shows, most devices on the network will by default respond by sending a reply to the source IP address. If the number of systems on the network that receive and respond to these packets is very large, the resulting traffic can overwhelm the victim's computer.

D. HTTP Flood Attack
An HTTP flood, shown in Figure 5, is a volumetric distributed denial-of-service (DDoS) attack designed to overwhelm a targeted server with HTTP requests. Once the target is saturated with requests and cannot respond to regular traffic, further requests from actual users are denied service.
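The saturation effect described for flood attacks can be made concrete with a small rate check. The following is a minimal sketch, not a method from any of the surveyed papers: it assumes request arrival times are available as seconds since the start of a capture, and the function names, the one-second window and the `capacity` threshold are all hypothetical choices for illustration.

```python
from collections import Counter

def requests_per_window(timestamps, window=1.0):
    """Count requests in consecutive fixed-size windows (window length in seconds)."""
    return Counter(int(t // window) for t in timestamps)

def overloaded_windows(timestamps, capacity, window=1.0):
    """Return the indices of windows whose request count exceeds the assumed capacity."""
    counts = requests_per_window(timestamps, window)
    return sorted(w for w, n in counts.items() if n > capacity)

# Hypothetical arrival times in seconds; window 2 contains a burst of six requests.
arrivals = [0.1, 0.5, 1.2, 2.0, 2.1, 2.2, 2.3, 2.4, 2.9, 3.5]
flagged = overloaded_windows(arrivals, capacity=4)
```

Per-window counts of this kind are one simple feature that the classifiers discussed in the following sections could take as input.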

Machine learning methods related to DDoS attack detection
Signature-based IDS is a human-driven operation involving many hours of testing, developing and deploying signatures, and creating new signatures for unknown attacks. A system that depends less on human effort is therefore essential. Anomaly-based IDS built on machine learning offers a solution to this issue, providing a framework that can learn from data and make predictions about unseen data based on what it has learned.

A. Naïve Bayes
Naïve Bayes is based on the Bayesian classification model. It is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set.
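As an illustration of assigning class labels to feature vectors, the following is a minimal Gaussian Naïve Bayes sketch in plain Python. It is not taken from any of the surveyed papers. The two features (packets per second and mean packet size), the toy flows and the function names are assumptions made for this example.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # The small constant avoids division by zero for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, features):
    """Return the class with the highest log posterior, treating features as independent."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy flows: (packets per second, mean packet size in bytes).
X = [(20, 500), (35, 620), (25, 480), (900, 64), (1200, 70), (1000, 60)]
y = ["normal", "normal", "normal", "attack", "attack", "attack"]
model = fit_gaussian_nb(X, y)
```

Calling `predict_gaussian_nb(model, (1100, 66))` on this toy model returns the label of the class whose Gaussian fits the flow best. The per-feature product inside the loop is exactly the independence assumption that gives the method its name.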
Kanagalakshmi R. et al. indicated in their paper that the Hidden Naïve Bayes (HNB) model produces more reliable results than the standard Naïve Bayes model. The HNB technique can be applied to intrusion detection problems such as DoS attacks, which involve highly correlated features and high-volume network data streams [13].
HNB is a data mining model that relaxes the conditional independence assumption of the Naïve Bayes method. Mouhammad Alkasassbeh et al. [1] collected a new dataset consisting of DDoS attacks at various layers of the network. DDoS detection is performed using three classification techniques: Multilayer Perceptron (MLP), Naïve Bayes, and Random Forest.
In [15], Jasreena Kaur Bains et al. proposed a hierarchical layered method for attack detection. The system used a Naïve Bayes classifier with the K2 learning method for each attack class on a reduced NSL-KDD dataset. In this methodology, each layer is trained to recognize a single type of attack. To raise the detection rate, the output of one layer is passed on to the next layer.

B. Support Vector Machine
Support Vector Machine (SVM) was first introduced by Vapnik [7] and has received significant attention in the machine learning research community. SVM performs classification and regression using supervised learning. Given a set of training examples, each marked as belonging to one of two classes, an SVM algorithm builds a model that predicts which of the two classes a new example falls into.
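The two-class prediction described above can be sketched with a linear SVM trained by sub-gradient descent on the regularised hinge loss. This is only an illustrative sketch: the surveyed papers do not state their training procedure, and the learning rate, regularisation weight, epoch count and the normalised toy features below are assumptions.

```python
def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Sub-gradient descent on the L2-regularised hinge loss; labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, features)) + b)
            if margin < 1:
                # Inside the margin or misclassified: step towards the sample.
                w = [wi + lr * (label * xi - lam * wi) for wi, xi in zip(w, features)]
                b += lr * label
            else:
                # Classified with enough margin: apply weight decay only.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def predict(w, b, features):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, features)) + b >= 0 else -1

# Hypothetical normalised features per flow; +1 marks attack traffic, -1 normal traffic.
X = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85), (0.9, 0.1), (0.8, 0.2), (0.95, 0.05)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
```

Features are scaled to a common range here because the hinge-loss updates are sensitive to feature magnitude. Kernel SVMs, which the surveyed work may well have used, are outside the scope of this sketch.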
In 2010, Vipin Das et al. [9] used RST (rough set theory) and SVM (support vector machines) to classify DoS attacks. First, packets were captured from the network and the data was pre-processed by RST. The feature sets selected by RST were then passed to the SVM model for training and testing. The results were then compared against PCA and show that RST combined with SVM is capable of classifying these attacks, with the improvement reflected in the false positive rate.
T. Subbulakshmi et al. [10] wrote an article aimed at monitoring the online network and immediately activating a defence mechanism in the event of any suspicious behaviour. This strategy allows identification of both spoofed and non-spoofed IPs. The authors use Enhanced Support Vector Machines (ESVM) to detect non-spoofed IPs and Hop Count Filtering to find spoofed IPs. The detected IPs are used to start the defence. The Lanchester law is used to determine the attack strength that triggers the defence mechanism.
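The idea behind Hop Count Filtering can be sketched as follows: the number of hops a packet has travelled is inferred from its TTL and compared with the hop count previously seen for the same source address. This is a hedged sketch of the general technique, not the implementation in [10]. The list of initial TTL values, the `known_hops` table, the `tolerance` parameter and the function names are assumptions made for illustration.

```python
# Assumed set of common initial TTL values used by operating systems.
INITIAL_TTLS = (32, 64, 128, 255)

def hop_count(observed_ttl):
    """Infer hops travelled: smallest assumed initial TTL >= observed TTL, minus observed TTL."""
    initial = min(t for t in INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl

def looks_spoofed(src_ip, observed_ttl, known_hops, tolerance=0):
    """Flag a packet whose inferred hop count disagrees with the stored value for its source."""
    expected = known_hops.get(src_ip)
    if expected is None:
        return False  # No history for this address, so no judgement is made.
    return abs(hop_count(observed_ttl) - expected) > tolerance

# Hypothetical table of previously learned hop counts per source address.
known_hops = {"192.0.2.10": 12}
```

A packet that claims to come from 192.0.2.10 but arrives with a TTL of 120 yields an inferred hop count of 8 rather than 12, so `looks_spoofed` would flag it.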
Rung-Ching Chen et al. [11] wrote a paper in which RST and SVM were used to identify DoS attacks, with a specific feature set obtained from RST supplied to the SVM. Another report by T. Subbulakshmi et al. [10] focused on generating a DDoS dataset and detecting attacks with Enhanced Multi-Class Support Vector Machines (EMCSVM). The EMCSVM is used to detect attacks of various classes in the generated dataset, and SVM is used to evaluate the EMCSVM.

C. Decision Trees
The decision tree is one of the basic techniques used in machine learning and data mining. It is used as a predictive model in which observations about an item are mapped to conclusions about the item's target value. In decision analysis, a decision tree can be used to represent decision making visually and explicitly. In this method, the tree is constructed by learning from the training dataset. Consequently, when a new data item is presented for classification, the tree built from the earlier data classifies it accordingly.
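To show how a tree is built from a training set and then used on a new item, the following is a minimal sketch of a binary decision tree that chooses splits by Gini impurity. It is not J48 or any algorithm from the surveyed papers. The features (packets per second and number of distinct source IPs), the toy data and the function names are assumptions for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree; leaves are class labels, inner nodes are tuples."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # Leaf: majority class.
    f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))

def classify(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy flows: (packets per second, distinct source IPs).
X = [(20, 3), (35, 5), (25, 4), (900, 250), (1200, 400), (1000, 310)]
y = ["normal", "normal", "normal", "attack", "attack", "attack"]
tree = build_tree(X, y)
```

On this toy data a single split on packets per second separates the two classes, so the learned tree has one inner node and two leaves. Thresholds are taken from observed feature values for brevity, whereas practical implementations usually split at midpoints between values.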
Decision tree algorithms are used to detect DoS attacks. Hoda Waguih [2] proposed a data mining approach for detecting DoS attacks using classification methods. In the case of DoS attacks, the approach focuses on classifying traffic as "normal" or "anomalous". The paper examines the efficacy of J48, a decision tree algorithm, for DoS attack detection and compares it with rule-based algorithms such as Decision Table and OneR.
Dewan Md. Farid et al. [3] proposed a learning algorithm for anomaly-based network intrusion detection that uses a decision tree algorithm to distinguish attacks from normal activity and to identify multiple kinds of intrusion. The dataset used is the KDD99 dataset for network intrusion detection.

D. Artificial Neural Network
Chandrika Palagiri demonstrated that modelling network traffic with a neural network can produce reasonable results, particularly for a specific attack.
Researchers also focus on neural networks that can make fast decisions and identify attacks in real time. Resilient back propagation (RBP) is chosen as the base classifier in a paper by Madhav Kale et al. [21]. That paper focused on improving the RBP classifier's performance through an ensemble of classifier outputs and the Neyman-Pearson cost minimization strategy for the final classification decision. Detection accuracy and cost per sample were the two metrics used to evaluate the performance of the RBPBoost classification algorithm.
The purpose of the paper by Md Salem et al. [22] was to determine whether a firewall can examine its traffic patterns to recognize targeted denial of service. A network baseline was established by performing statistical analysis of firewall logs for a large network. Expected traffic rates were then calculated for comparison with the baseline using linear regression and the Holt-Winters method. The results were good, with deviations from the predicted rejected-packet rates indicating a large-scale attack campaign on the network.
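The baseline-and-deviation idea can be sketched with ordinary least squares alone. The following is an illustrative simplification and not the procedure of [22]: it omits Holt-Winters smoothing entirely, and the rejected-packet counts, the three-sigma threshold `k` and the function names are assumptions.

```python
import statistics

def fit_line(ys):
    """Ordinary least squares fit of y = a + b*t for t = 0..n-1; returns (a, b)."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    b = sxy / sxx
    return y_mean - b * t_mean, b

def flag_deviations(baseline, observed, k=3.0):
    """Return indices of observed values exceeding the extrapolated trend by more than k sigma."""
    a, b = fit_line(baseline)
    residuals = [y - (a + b * t) for t, y in enumerate(baseline)]
    sigma = statistics.pstdev(residuals)
    start = len(baseline)
    return [i for i, y in enumerate(observed)
            if y - (a + b * (start + i)) > k * sigma]

# Hypothetical rejected-packet counts per interval taken from firewall logs.
baseline = [100, 104, 99, 107, 103, 110, 106, 112]
observed = [111, 115, 480, 113]
anomalies = flag_deviations(baseline, observed)
```

Only upward deviations are flagged, since a flood shows up as an excess of rejected packets. A seasonal model such as Holt-Winters would be needed where traffic has daily or weekly cycles.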
In [23], Mohammad Masoud Javidi et al. introduced an IDS that uses a supervised neural network to identify malicious DDoS traffic in the NSL-KDD dataset. The researchers also used a signature-based methodology in the proposed IDS. The IDSs are built using neural networks capable of detecting various kinds of DoS attack, with a separate IDS for each kind to identify that particular attack.

E. K-means Clustering
K-means is a clustering technique [5] widely used to automatically partition a collection of data into k groups. The algorithm works by choosing k initial cluster centres in a dataset and then refining them iteratively as follows:
1. Each instance is assigned to its nearest cluster centre.
2. Each cluster centre is updated to the mean of its member instances.
The algorithm converges when the assignment of instances to clusters no longer changes.
Mangesh D. Salunke et al. [7] introduced a design that captures packets and processes them according to specifications such as feature selection. K-means and naïve Bayes methods are then used to determine whether a packet is normal or part of a DoS attack.
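The two refinement steps of k-means and its convergence test can be sketched in a few lines of Python. This is a minimal illustration rather than the design in [7]. Using the first k points as initial centres is an assumption made to keep the example deterministic, and the two-dimensional toy points are hypothetical.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain k-means; returns (centres, assignment) once assignments stop changing."""
    centers = [points[i] for i in range(k)]  # Assumed naive initialisation.
    assignment = None
    for _ in range(max_iter):
        # Step 1: assign every point to its nearest centre.
        new_assignment = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                          for p in points]
        if new_assignment == assignment:
            break  # Converged: no point changed cluster.
        assignment = new_assignment
        # Step 2: move each centre to the mean of its member points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment

# Hypothetical two-dimensional feature vectors forming two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centers, assignment = kmeans(points, 2)
```

In practice the initial centres are usually chosen at random or with a seeding scheme, and the result can depend on that choice.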

Conclusion
A detailed analysis leads to the conclusion that web attacks are dangerous and that IDS/IPS may not be able to handle the new attacks that affect networks. Machine learning approaches play a critical role in revealing the intensity of an attack, enabling enterprises to take suitable measures to limit such attacks.

Future enhancement
In future work, deep learning techniques will be used to carry out a thorough study of datasets collected from the college network that contain the latest types of attack, such as HTTP flood, SIDDoS, Smurf and UDP flood. This will allow the extent of attacks on the network or on any organization to be assessed, so that appropriate firewall rules can be applied to the network.