MALWARE ANALYSIS ON THE UNSW_NB15 DATASET
Keywords:
Internet of Things (IoT), anomaly detection, Machine Learning (ML), cybersecurity, random forest, Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), AdaBoost, UNSW_NB15 dataset, Intrusion Detection System (IDS)
DOI:
https://doi.org/10.17654/0975045225016
Abstract
The Internet of Things (IoT) is one of the most recent developments in Internet technology. It is a network of connected physical and digital devices and sensors that generate and exchange large amounts of data without human intervention. By removing the need for human operators, the IoT can process more data, faster and more efficiently, than was previously possible. This article addresses IoT network security by investigating how useful Machine Learning (ML) algorithms are for detecting anomalies in IoT network data. It examines ML algorithms that have been used successfully in similar settings and compares them under several configurations and methods.
This article implements the following algorithms:
- Random Forest (RF),
- Naive Bayes (NB),
- Multi-Layer Perceptron (MLP), a type of artificial neural network,
- Support Vector Machine (SVM), and
- AdaBoost (ADA)
on the UNSW_NB15 dataset.
The best results were obtained by the Random Forest algorithm, with an accuracy of 99.3%.
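The comparison described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' pipeline: the library choice, the hyperparameters, the 70/30 split, and the helper name `compare_classifiers` are all assumptions, and a synthetic dataset stands in for UNSW_NB15 so that the block runs on its own. Its scores therefore say nothing about the 99.3% figure reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X, y, seed=0):
    """Fit the five classifiers on one stratified split; return test accuracy per model.

    Hypothetical helper: hyperparameters are illustrative defaults, not the
    settings used in the article.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "NB": GaussianNB(),
        # MLP and SVM are sensitive to feature scale, so they are standardized first.
        "MLP": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed),
        ),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "ADA": AdaBoostClassifier(n_estimators=50, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic binary data (normal vs. attack) as a stand-in for UNSW_NB15.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)
scores = compare_classifiers(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

To run the same comparison on the real data, the synthetic `X, y` would be replaced by the numeric features and the binary label loaded from the UNSW_NB15 files, with categorical columns encoded beforehand. Accuracy alone can be misleading when attack classes are imbalanced, so precision and recall are worth reporting alongside it.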
License
Copyright (c) 2025 PUSHPA PUBLISHING HOUSE, PRAYAGRAJ, INDIA

This work is licensed under a Creative Commons Attribution 4.0 International License.
Attribution: Credit Pushpa Publishing House as the original publisher, including title and author(s) if applicable.
Non-Commercial Use: For non-commercial purposes only. No commercial activities without explicit permission.
No Derivatives: Modifying or creating derivative works not allowed without written permission.
Contact Pushpa Publishing House for more info or permissions.


