Outlier Detection using Clustering Techniques

  • Authors

    • Srividya
    • S Mohanavalli
    • N Sripriya
    • S Poornima
    2018-07-20
    https://doi.org/10.14419/ijet.v7i3.12.16508
  • Outlier Detection, Data Mining, K-Means, LOF, CLARA
  • An outlier is a pattern that differs from the other patterns in a dataset. Identifying outliers is important in many fields, such as cybersecurity, machine learning, finance, and healthcare. This work proposes a clustering-based method for detecting outliers and applies several algorithms (k-means, PAM, CLARA, DBSCAN, and LOF) to different datasets, including breast cancer, heart disease, and multi-shaped datasets. The aim is to identify the most suitable method for detecting outliers accurately.
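
    The abstract does not say how cluster output is turned into outlier flags. The following is a minimal sketch of one common clustering-based approach, assuming that points lying far from their assigned k-means centroid are treated as outliers. The function names, the first-k initialisation, and the 3 x median cutoff are illustrative assumptions, not the authors' method.

```python
import math
import statistics


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Assumption for reproducibility: the first k points seed the centroids.
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def flag_outliers(points, k, factor=3.0):
    """Return indices of points whose distance to their own centroid
    exceeds factor * median distance (a hypothetical cutoff rule)."""
    centroids, labels = kmeans(points, k)
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    cutoff = factor * statistics.median(dists)
    return [i for i, d in enumerate(dists) if d > cutoff]


# Two tight groups plus one distant point at index 8 (made-up toy data).
points = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
          (10, 11), (11, 10), (11, 11), (10, 20)]
print(flag_outliers(points, k=2))
```

    A distance-to-centroid rule like this is sensitive to initialisation: if an outlier is picked as a starting centroid it can form its own cluster and go unflagged, which is one reason to compare it against density-based methods such as DBSCAN and LOF.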

     

     

  • How to Cite

    Srividya, Mohanavalli, S., Sripriya, N., & Poornima, S. (2018). Outlier Detection using Clustering Techniques. International Journal of Engineering & Technology, 7(3.12), 813-818. https://doi.org/10.14419/ijet.v7i3.12.16508