Classification of Imbalanced Malaria Disease Using Naïve Bayesian Algorithm
-
2018-03-18 https://doi.org/10.14419/ijet.v7i2.7.10978
Keywords: Imbalanced Data, Malaria, Naïve Bayesian, Weka, R.
Abstract
Malaria is rampant in semi-urban and non-urban areas, especially in resource-poor developing countries. In datasets for diseases such as malaria and dengue, negative cases (non-occurrence of the disease) typically outnumber positive cases (patients suffering from the disease). Building a model-based decision support system on such imbalanced datasets is a cause for concern, because the model must still predict the disease accurately. Classifying imbalanced malaria disease data is therefore a crucial task in the medical application domain: most conventional machine learning algorithms perform very poorly when deciding whether a patient is affected by malaria. In imbalanced data, the majority (unaffected) class samples dominate the minority (affected) class samples, which leads to class imbalance. To overcome the class imbalance problem, balancing the data samples is an effective solution that improves accuracy in classifying minority samples. The aim of this research is a comparative study of classifying imbalanced malaria disease data with the Naïve Bayesian classifier in two environments, Weka and the R language. We present a clinical descriptive study of 165 patients of different age groups, collected at medical wards in Narasaraopet from 2014 to 2017. The Synthetic Minority Oversampling Technique (SMOTE) was used to balance the class distribution, and we then performed a comparative study on the dataset using the Naïve Bayesian algorithm on both platforms. Of the balanced data, 70% was used to train the Naïve Bayesian algorithm and the remainder was used to test the model, in both the Weka and R programming environments.
Experimental results indicate that classification of the malaria disease data in the Weka environment achieved the highest accuracy, 88.5%, compared with 87.5% for the Naïve Bayesian algorithm in the R programming language. The impact of vector-borne diseases on medical applications is very high. Predicting diseases such as malaria is the need of the hour, and this is possible only with a suitable model for a given dataset. Hence, we developed a model based on the Naïve Bayesian algorithm for the current research.
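The pipeline described in the abstract (SMOTE balancing, a 70/30 train/test split, then Naïve Bayesian classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the study used Weka and R, whereas the sketch below uses plain Python for illustration. The toy two-feature data, the class sizes (80 negative, 20 positive), the simplified SMOTE variant, the Gaussian likelihood, and all function names (`smote`, `fit_gaussian_nb`, `predict`) are assumptions made here, not details from the paper.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Simplified SMOTE: interpolate between a random minority sample
    and one of its k nearest minority neighbours (illustrative only)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def fit_gaussian_nb(X, y):
    """Estimate the prior, per-feature means and variances for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9  # avoid zero variance
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for sample x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Hypothetical toy data: 80 negative and 20 positive cases, two numeric features.
rng = random.Random(42)
negatives = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]
positives = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(20)]

# Oversample the minority class until both classes are the same size.
positives += smote(positives, n_new=len(negatives) - len(positives), rng=rng)

X = negatives + positives
y = [0] * len(negatives) + [1] * len(positives)

# Shuffle, then use 70% for training and the remaining 30% for testing.
order = list(range(len(X)))
rng.shuffle(order)
cut = len(order) * 7 // 10
train_idx, test_idx = order[:cut], order[cut:]

model = fit_gaussian_nb([X[i] for i in train_idx], [y[i] for i in train_idx])
correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
accuracy = correct / len(test_idx)
print(f"test accuracy on toy data: {accuracy:.3f}")
```

Note that this sketch balances the data before splitting, as the abstract describes. A common alternative is to oversample only the training portion so that no synthetic samples appear in the test set; the accuracy printed here refers to the toy data only and says nothing about the reported 88.5% and 87.5% figures.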
-
How to Cite
Sajana, T., & R.Narasingarao, M. (2018). Classification of Imbalanced Malaria Disease Using Naïve Bayesian Algorithm. International Journal of Engineering & Technology, 7(2.7), 786-790. https://doi.org/10.14419/ijet.v7i2.7.10978
Received date: 2018-04-02
Accepted date: 2018-04-02
Published date: 2018-03-18