Exploiting the Capabilities of Classifiers to Examine a Website Defacement Data Set

Elrasheed Ismail Mohommoud Zayid; Ibrahim Isah; Nadir Abdelrahman Ahmed Farah; Yagoub Abbker Adam; Omar Abdullah Omar Alshehri

doi:10.59992/IJCI.2024.v3n3p1

Authors

Elrasheed Ismail Mohommoud Zayid

Dept. of Information Systems, College of Science & Arts-Alnamas, University of Bisha, Saudi Arabia

https://orcid.org/0000-0003-2375-6911

Ibrahim Isah

Dept. of Science and Lab. Technology, College of Science and Technology, Jigawa State Polytechnic Dutse, Nigeria

Nadir Abdelrahman Ahmed Farah

Dept. of Information Systems, College of Science & Arts-Alnamas, University of Bisha, Saudi Arabia

Yagoub Abbker Adam

Dept. of Computer Science, College of Computer Science and Information Tech., Jazan University, Saudi Arabia

Omar Abdullah Omar Alshehri

Educational Technology Dept., College of Education, University of Bisha, Saudi Arabia

Paper DOI

https://doi.org/10.59992/IJCI.2024.v3n3p1

Abstract

Website defacement is the illegal electronic act of altering a website. In this paper, the capabilities of robust machine learning classifiers are exploited to select the best input feature set for evaluating a website’s defacement risk. A defacement mining data set was obtained from Zone-H, a private organization, and a sample of 93,644 data points was pre-processed and used for modelling. Using multi-dimensional features as input, extensive modelling computations were carried out to determine which outputs yielded the best performance. Reason and hackmode made the highest contributions to the evaluation of website defacement and were therefore chosen as the output (target) variables. Various machine learning models were examined, and decision tree (DT), k-nearest neighbours (k-NN), and random forest (RF) were found to be the most powerful algorithms for predicting the targets. The input variables ‘domain’, ‘system’, ‘web_server’, ‘redefacement’, ‘type’, ‘def_grade’, and ‘reason/hackmode’ were tested and used to shape the final model. The key performance measures of the models were calculated using cross-validation (CV) and reported. After averaging the scores across the tuned hyperparameters (i.e., max-depth, min-sample-leaf, weight, max-features, and CV), both targets were evaluated, and the learning algorithms were ranked as RF > DT > k-NN. The reason and hackmode variables were thoroughly analysed, and the average accuracy scores for the reason and hackmode targets were 0.85 and 0.585, respectively. The results represent a significant development in modelling and optimizing website defacement risk. This study addresses key cybersecurity concerns, particularly website defacement.
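As a rough illustration of the cross-validated scoring described in the abstract, the following minimal sketch runs k-fold CV over a small k-NN classifier for categorical features. It is not the authors' pipeline: it uses only the Python standard library, a Hamming distance chosen here for simplicity, and synthetic toy rows rather than Zone-H data. The feature values, the labels `mode_a` and `mode_b`, and all function names are hypothetical.

```python
import random
from collections import Counter


def make_toy_rows(n=200, seed=0):
    """Build synthetic (features, label) rows. This is NOT Zone-H data."""
    rng = random.Random(seed)
    systems = ["Linux", "Win 2003", "FreeBSD"]   # hypothetical 'system' values
    servers = ["Apache", "IIS", "nginx"]         # hypothetical 'web_server' values
    rows = []
    for _ in range(n):
        system = rng.choice(systems)
        server = rng.choice(servers)
        redefacement = rng.choice([0, 1])
        # Assumed toy rule: the label follows the web server, with some noise.
        label = "mode_a" if server == "Apache" else "mode_b"
        if rng.random() < 0.1:
            label = rng.choice(["mode_a", "mode_b"])
        rows.append(((system, server, redefacement), label))
    return rows


def hamming(a, b):
    """Count the positions at which two categorical feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=5):
    """Predict the majority label among the k training rows closest to query."""
    nearest = sorted(train, key=lambda row: hamming(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(rows, n_folds=5, k=5, seed=0):
    """Return one accuracy score per fold of a shuffled k-fold split."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


scores = cross_val_accuracy(make_toy_rows())
mean_accuracy = sum(scores) / len(scores)
print("fold accuracies:", [round(s, 3) for s in scores])
print("mean accuracy:", round(mean_accuracy, 3))
```

Averaging the per-fold scores gives a single figure per model and target, which is the kind of quantity a ranking such as RF > DT > k-NN would compare. Scores on this toy data say nothing about the accuracies reported in the abstract.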

Keywords

Website Defacement, Website Defacement Assessment, Classification Metrics, Website Hacktivism, Cyber Risks, Predict Cyber Threats