Authors
Dept. of Information Systems, College of Science & Arts-Alnamas, University of Bisha, Saudi Arabia
https://orcid.org/0000-0003-2375-6911
Dept. of Science and Lab. Technology, College of Science and Technology, Jigawa State Polytechnic Dutse, Nigeria
Dept. of Information Systems, College of Science & Arts-Alnamas, University of Bisha, Saudi Arabia
Dept. of Computer Science, College of Computer Science and Information Tech., Jazan University, Saudi Arabia
Educational Technology Dept., College of Education, University of Bisha, Saudi Arabia
Abstract
Website defacement is the illegal act of electronically altering a website. In this paper, the capabilities of robust machine learning classifiers are exploited to select the best input feature set for evaluating a website's defacement risk. A defacement mining data set was obtained from Zone-H, a private organization, and a sample of 93,644 data points was pre-processed and used for modelling. Using multi-dimensional features as input, extensive modelling computations were carried out to determine which outputs gave the best performance. Reason and hackmode contributed most to the evaluation of website defacement, and were thus chosen as outputs. Various machine learning models were examined, and decision tree (DT), k-nearest neighbours (k-NN), and random forest (RF) were found to be the most powerful algorithms for predicting the targets. The input variables 'domain', 'system', 'web_server', 'redefacement', 'type', 'def_grade', and 'reason/hackmode' were tested and used to shape the final model. The key performance factors of the models were calculated using cross-validation (CV) and reported. After averaging the scores over the hyperparameter settings (max-depth, min-sample-leaf, weight, max-features, and CV), both targets were evaluated, and the learning algorithms were ranked as RF > DT > k-NN. The reason and hackmode variables were thoroughly analysed, and the average accuracy scores for the reason and hackmode targets were 0.85 and 0.585, respectively. The results represent a significant development in modelling and optimizing website defacement risk. This study addresses key cybersecurity concerns, particularly website defacement.
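As a rough illustration of the cross-validated evaluation described in the abstract, the sketch below scores a plain-Python k-NN classifier with k-fold CV on categorical records. It is only a minimal stand-in under stated assumptions: the records are synthetic (not Zone-H data), the category values and the Hamming distance are assumptions made for the example, and the helper names (`hamming`, `knn_predict`, `cross_val_accuracy`, `make_toy_records`) are hypothetical. Only the feature names come from the abstract.

```python
from collections import Counter

# Feature names taken from the abstract; everything else is illustrative.
FEATURES = ["domain", "system", "web_server", "redefacement", "type", "def_grade"]


def hamming(a, b):
    """Count the feature positions where two records disagree."""
    return sum(a[f] != b[f] for f in FEATURES)


def knn_predict(train, query, k=3, target="reason"):
    """Majority vote among the k training records closest to the query."""
    neighbours = sorted(train, key=lambda r: hamming(r, query))[:k]
    votes = Counter(r[target] for r in neighbours)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(records, n_folds=5, k=3, target="reason"):
    """Mean accuracy over n_folds interleaved (non-shuffled) folds."""
    scores = []
    for fold in range(n_folds):
        test = records[fold::n_folds]
        train = [r for i, r in enumerate(records) if i % n_folds != fold]
        hits = sum(knn_predict(train, r, k, target) == r[target] for r in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)


def make_toy_records(n=60):
    """Build synthetic records (NOT Zone-H data); the target follows web_server."""
    servers = ["apache", "nginx", "iis"]
    # Placeholder labels, not the real Zone-H reason categories.
    reasons = {"apache": "reason_a", "nginx": "reason_b", "iis": "reason_c"}
    records = []
    for i in range(n):
        server = servers[i % 3]
        records.append({
            "domain": f"site{i % 7}.example",
            "system": "linux" if i % 2 == 0 else "windows",
            "web_server": server,
            "redefacement": i % 5 == 0,
            "type": "mass" if i % 4 == 0 else "single",
            "def_grade": "home" if i % 2 == 0 else "sub",
            "reason": reasons[server],
        })
    return records


records = make_toy_records()
accuracy = cross_val_accuracy(records, n_folds=5, k=3)
print(f"mean 5-fold accuracy on toy data: {accuracy:.3f}")
```

The same loop could be repeated with 'hackmode' as the target, or with other classifiers and hyperparameter values, and the per-setting scores averaged to rank the models; the score printed here says nothing about the accuracies reported in the abstract.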