Scalable density based spatial clustering with integrated one-class SVM for noise reduction

  • DBSCAN, One-Class SVM, Noise Reduction, Clustering, Spark.
    Extracting information from data is one of the key necessities of data analysis. The unsupervised nature of the data calls for computationally complex analysis methods. This paper presents NRDBSCAN, a modified variant of DBSCAN that integrates density-based spatial clustering with a one-class SVM, a machine learning technique, for noise reduction. An analysis of DBSCAN shows that it depends heavily on accurate thresholds, without which it yields suboptimal results. However, identifying accurate threshold settings is infeasible in practice. Noise is one of the major side effects of the resulting threshold gap. The proposed work reduces noise by integrating a machine learning classifier into the operational structure of DBSCAN. Further, the proposed technique is parallelized on the Spark architecture, which increases its scalability and its ability to handle large amounts of data. Experiments and comparisons with similar techniques indicate high scalability and high homogeneity in the clustering process.
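    The threshold sensitivity described in the abstract can be illustrated with a minimal sketch of plain DBSCAN, written here in pure Python. This is not the authors' NRDBSCAN: the one-class SVM step and the Spark parallelization are not shown, because the abstract does not specify how they are integrated. The parameter names `eps` and `min_pts` follow common DBSCAN usage, and the sample `points` are hypothetical values chosen only for illustration.

```python
from math import dist

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            # Not a core point; may still be claimed later as a border point.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Hypothetical data: two dense groups, one nearby straggler, one far outlier.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (2.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 20),
]

for eps in (0.5, 1.0, 1.6):
    labels = dbscan(points, eps, min_pts=3)
    print(f"eps={eps}: noise points = {labels.count(NOISE)}, labels = {labels}")
```

    On this toy data, a small change in `eps` moves the straggler at (2.5, 0.5) between "noise" and "cluster member", and a too-small `eps` marks every point as noise. This is the kind of threshold-induced noise that the paper aims to reduce with a classifier.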

    Ahmed, K. N., & Razak, T. A. (2018). Scalable density based spatial clustering with integrated one-class SVM for noise reduction. International Journal of Engineering & Technology, 7(2.9), 28-32.

    Received date: 2018-03-12

    Accepted date: 2018-03-25

    Published date: 2018-04-29