Issue | E3S Web Conf., Volume 501, 2024, International Conference on Computer Science Electronics and Information (ICCSEI 2023)
---|---
Article Number | 01007
Number of page(s) | 8
Section | Applied Computer Science and Electronics for sustainability
DOI | https://doi.org/10.1051/e3sconf/202450101007
Published online | 18 March 2024
A novel bagging-XGBoost ensemble model for attaining high accuracy and computational efficiency in network intrusion detection
School of Computing and Information Technology, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
* Corresponding author: nzuvah@gmail.com
This study focuses on improving network intrusion detection to strengthen network security and prevent potential data breaches. We propose B-XGBoost, an ensemble learning model that combines bagging and boosting and uses 10-fold cross-validation and Bayesian optimization for binary network intrusion classification. The proposed model was trained and tested on the CIC-IDS2017 dataset. For comparison, Decision Trees, Random Forests, Support Vector Machines, Naive Bayes, k-Nearest Neighbors, and Neural Networks were trained and tested on the same dataset. The results show that B-XGBoost achieved the highest F1 score (0.982), precision (0.975), recall (0.990), Cohen’s kappa (0.978), and ROC AUC (0.983). The other algorithms performed at varying levels, with Decision Trees achieving the second-highest F1 score (0.950). Bayesian optimization significantly reduced the time and computational cost of hyperparameter tuning by using a probabilistic model to predict hyperparameters likely to yield high performance. The high F1, precision, and recall scores, the strong chance-corrected agreement between predicted and true labels (Cohen’s kappa), and the ability to distinguish between positive and negative instances (ROC AUC) demonstrate the effectiveness of this approach in enhancing network security. To obtain the best results from B-XGBoost, the hyperparameters of the base model need to be tuned for maximum computational efficiency given the available resources.
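The threshold-based metrics named in the abstract (precision, recall, F1 score, and Cohen's kappa) can be made concrete with a minimal sketch. This is not the authors' evaluation code: the function name `binary_metrics` and the toy labels are assumptions made for illustration only, and no values here come from the CIC-IDS2017 experiments. ROC AUC is left out because it requires predicted scores rather than hard labels.

```python
from collections import Counter


def binary_metrics(y_true, y_pred):
    """Precision, recall, F1 and Cohen's kappa for 0/1 labels (1 = intrusion).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Confusion-matrix cells, keyed by (true label, predicted label).
    counts = Counter(zip(y_true, y_pred))
    tp, fp = counts[(1, 1)], counts[(0, 1)]
    fn, tn = counts[(1, 0)], counts[(0, 0)]
    n = tp + fp + fn + tn

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement between predictions and true labels,
    # corrected for the agreement expected by chance from the marginals.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "kappa": kappa}


# Toy labels (made up for illustration): 4 attacks, 6 benign flows.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
```

In this formulation, kappa compares the model's predictions with the reference labels, so a value near 1 indicates agreement well beyond what the class balance alone would produce.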
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.