METHOD FOR BUILDING A FEATURE VECTOR IN THE WEB APPLICATION FIREWALL ANOMALY DETECTION MODEL BY UTILIZING QUERY STATISTICS AND STRUCTURAL CONVERSION
Abstract
The widespread use of the Internet today, along with the rapid growth of cloud computing, the Internet of Things, and smartphones, has fueled the need for web-based applications. A web application firewall (WAF) is a type of intrusion detection and prevention system designed to protect web applications from unauthorized intrusion. In a WAF, attack detection is commonly divided into two categories: anomaly-based and rule-based. By observing query data, anomaly-based models can detect previously unseen malicious queries. In this paper, we propose a method for constructing feature vectors from the structural conversion and statistics of the component parts of the query string. An unsupervised classification algorithm then takes the feature vector as input to determine which requests are anomalous. Tests of the DBSCAN, K-means, and Isolation Forest algorithms show that DBSCAN performs best (F1-score > 97%, accuracy > 96%), especially for web application functions such as registration and authentication, which are prone to misidentification. The effectiveness of this method stems from its ability to use data without pre-labeling, which facilitates deployment on the WAF.
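To make the idea of combining structural conversion with query statistics concrete, the following minimal Python sketch shows one way such a feature vector might be built. It is an illustration under stated assumptions, not the feature set used in this paper: the character classes (A, N, S), the function names `structure` and `feature_vector`, and the six chosen features are all hypothetical.

```python
from urllib.parse import parse_qsl


def char_class(ch):
    """Map a character to a coarse class: A (letter), N (digit), S (other)."""
    if ch.isalpha():
        return "A"
    if ch.isdigit():
        return "N"
    return "S"


def structure(value):
    """Structural conversion (assumed form): collapse runs of one class.

    For example, "abc123" becomes "AN".
    """
    out = []
    for ch in value:
        c = char_class(ch)
        if not out or out[-1] != c:
            out.append(c)
    return "".join(out)


def feature_vector(query):
    """Build a hypothetical feature vector from a raw query string.

    Features: number of parameters, total value length, letter count,
    digit count, special-character count, longest structural pattern.
    """
    pairs = parse_qsl(query, keep_blank_values=True)
    values = "".join(v for _, v in pairs)
    n_alpha = sum(ch.isalpha() for ch in values)
    n_digit = sum(ch.isdigit() for ch in values)
    n_special = len(values) - n_alpha - n_digit
    patterns = [structure(v) for _, v in pairs]
    max_pattern_len = max((len(p) for p in patterns), default=0)
    return [len(pairs), len(values), n_alpha, n_digit, n_special, max_pattern_len]


benign = feature_vector("user=alice&id=42")
suspicious = feature_vector("user=admin'--&id=1 OR 1=1")
```

In this sketch, the suspicious query yields more special characters and a longer structural pattern than the benign one. Vectors of this kind could then be passed, without labels, to an unsupervised algorithm such as DBSCAN.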