A cluster Analysis for Binary Data Using Genetic Algorithms
Published: 2018-11-30, https://doi.org/10.14419/ijet.v7i4.30.28174
Keywords: Binary Data, Clustering, Genetic Algorithms.
Abstract
This research was initially driven by the lack of clustering algorithms that focus on binary data. Genetic Clustering for Unknown K (GCUK), a promising technique for analyzing this type of data, is the main subject of this research. GCUK was applied to cluster four binary data sets, one of which is imbalanced. The results show that GCUK is an efficient and effective clustering algorithm compared to K-means. A further contribution is the demonstrated capability of GCUK to cluster imbalanced data. Standard clustering algorithms cannot simply be applied to this type of data set, as they can produce misclassified results.
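The abstract does not describe how GCUK works internally, so the following is only a toy sketch of the general idea it names: a genetic algorithm that searches over both the number of clusters and their centers for binary data. Everything specific here is an assumption made for illustration and is not the authors' implementation: Hamming distance between binary vectors, a Davies-Bouldin-style index as the fitness (lower is better), truncation selection with elitism, and the hypothetical function names `hamming`, `assign`, `db_index`, `crossover`, `mutate` and `evolve`.

```python
import random

def hamming(a, b):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def assign(data, centers):
    """Index of the nearest center for every row (ties go to the lowest index)."""
    return [min(range(len(centers)), key=lambda k: hamming(row, centers[k]))
            for row in data]

def db_index(data, centers):
    """Davies-Bouldin-style index with Hamming distance; lower is better."""
    labels = assign(data, centers)
    used = sorted(set(labels))  # centers that attracted no rows are ignored
    if len(used) < 2:
        return float("inf")  # a single cluster cannot be scored
    scatter = {}
    for k in used:
        members = [row for row, lab in zip(data, labels) if lab == k]
        scatter[k] = sum(hamming(row, centers[k]) for row in members) / len(members)
    worst = [max((scatter[i] + scatter[j]) / hamming(centers[i], centers[j])
                 for j in used if j != i) for i in used]
    return sum(worst) / len(worst)

def crossover(a, b, k_max, rng):
    """Child takes a random subset of both parents' centers, so K may change."""
    pool = a + b
    return rng.sample(pool, rng.randint(2, min(k_max, len(pool))))

def mutate(centers, n_bits, k_max, rng):
    """Add a random center, drop a center, or flip one bit of a center."""
    centers = [list(c) for c in centers]
    move = rng.random()
    if move < 0.2 and len(centers) < k_max:
        centers.append([rng.randint(0, 1) for _ in range(n_bits)])
    elif move < 0.4 and len(centers) > 2:
        centers.pop(rng.randrange(len(centers)))
    else:
        chosen = rng.choice(centers)
        chosen[rng.randrange(n_bits)] ^= 1
    return [tuple(c) for c in centers]

def evolve(data, k_max=5, pop_size=20, generations=30, seed=0):
    """Return (best centers, row labels) found by a small elitist GA."""
    rng = random.Random(seed)
    n_bits = len(data[0])
    k_max = min(k_max, len(data))
    # Each chromosome is a variable-length list of centers drawn from the data.
    pop = [rng.sample(data, rng.randint(2, k_max)) for _ in range(pop_size)]
    score = lambda centers: db_index(data, centers)
    for _ in range(generations):
        pop.sort(key=score)
        elite = pop[: pop_size // 2]  # truncation selection, elite survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, k_max, rng), n_bits, k_max, rng))
        pop = elite + children
    best = min(pop, key=score)
    return best, assign(data, best)
```

Calling `evolve(rows, k_max=4)` on a list of equal-length 0/1 tuples returns the best set of centers found and one cluster label per row. The number of clusters is not fixed in advance, which is the property that distinguishes this family of methods from K-means.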
How to Cite
Saharan, S., Yu Xian, W., & Baragona, R. (2018). A cluster Analysis for Binary Data Using Genetic Algorithms. International Journal of Engineering & Technology, 7(4.30), 550-552. https://doi.org/10.14419/ijet.v7i4.30.28174
Received date: 2019-03-03
Accepted date: 2019-03-03
Published date: 2018-11-30