Anomaly Detection System for Internet Traffic based on TF-IDF and BFR Clustering Algorithms

    • Suad A. Alasadi
    • Wesam S. Bhaya
  • Anomaly Detection, IDS, Network Attacks, Clustering Data Mining, TF_IDF, BFR.
    An anomaly can be defined as any deviation from normal behavior that falls outside the usual range of variation. Anomalous traffic consumes network resources and leads to security issues affecting Confidentiality, Integrity, and Availability (CIA). Many researchers have designed and implemented Intrusion Detection Systems (IDS) to analyze, detect, and prevent anomalous traffic. Among the various techniques an IDS can use to detect anomalies, such as statistical and machine learning techniques, data mining can be employed efficiently. Because it extracts features from network traffic, it can be used to distinguish legitimate traffic from attack traffic. Data mining can also efficiently identify the data that matters to the user and predict results that can be used to detect various types of attacks.

    In this paper, an anomaly detection approach using Term Frequency Inverse Document Frequency (TF_IDF) and the Bradley, Fayyad, and Reina (BFR) clustering algorithm is presented to detect and prevent malicious traffic efficiently and with low time complexity. The proposed solution effectively detects multiple types of attacks (Flooding, Denial of Service (DoS), Backdoors, and Worms) using two modern datasets, “NUST2009” and “UNSW-NB2015”.
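
To illustrate the TF_IDF weighting named above, the following is a minimal sketch and not the authors' implementation. It assumes each network flow has already been turned into a list of categorical tokens; the token format (for example "dport=80") and the function name tf_idf are hypothetical choices made only for this example. It uses the plain weighting tf × log(N / df), where tf is a token's relative frequency within a flow, N is the number of flows, and df is the number of flows containing the token.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document (here, per flow)."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical tokenized flows, invented for illustration only.
flows = [
    ["proto=tcp", "dport=80", "flag=SYN"],
    ["proto=tcp", "dport=80", "flag=ACK"],
    ["proto=tcp", "dport=6667", "flag=SYN"],
]
weights = tf_idf(flows)
```

In this toy input, a token that appears in every flow receives weight zero, while a token seen in only one flow receives the largest weight, which is the property that makes rare traffic features stand out before clustering.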

    The experimental results show that the BFR clustering algorithm performs better than the K-means algorithm in terms of accuracy and detection rate. For the NUST2009 dataset, the overall accuracy is 99.2%, the detection rate is 100%, and the false alarm rate is 0%. For the UNSW-NB2015 dataset, the overall accuracy is 98.76%, the detection rate is 79.28%, and the false alarm rate is 0%.
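
The three measures reported above are commonly computed from confusion-matrix counts. The sketch below assumes the standard definitions (accuracy = (TP + TN) / total, detection rate = TP / (TP + FN), false alarm rate = FP / (FP + TN)); the abstract does not state its formulas, so this is an assumption, and the function name detection_metrics and the counts are invented for illustration and are not taken from either dataset.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, detection rate, and false alarm rate from counts.

    tp: attacks flagged as attacks     fn: attacks missed
    tn: normal traffic passed          fp: normal traffic flagged as attack
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),     # also called recall
        "false_alarm_rate": fp / (fp + tn),   # also called false positive rate
    }

# Hypothetical counts: 100 attack flows and 900 normal flows.
metrics = detection_metrics(tp=80, tn=900, fp=0, fn=20)
```

With these made-up counts the accuracy is 0.98 while the detection rate is only 0.80 and the false alarm rate is 0, which shows how a high overall accuracy can coexist with a lower detection rate when normal traffic dominates the data.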



    Received date: 2019-02-26

    Accepted date: 2019-02-26

    Published date: 2018-11-27