Correlation Feature Selection (CFS) and Probabilistic Neural Network (PNN) for Diabetes Disease Prediction


  • K Kalaiselvi
  • P Sujarani





Data mining, diabetes dataset, healthcare industry, Correlation Feature Selection (CFS), feature selection, Probabilistic Neural Network (PNN), machine learning repository and classifier.


The healthcare sector is a broad area with an abundance of patient information, and it generates enormous volumes of records every day. Although the sector is rich in information, it is poor in knowledge. Diabetes is considered a primary health issue worldwide. According to the WHO 2014 report, over 422 million people are affected by diabetes globally. To reduce the number of extensive investigations carried out on patients, data mining offers many mechanisms and strategies for diagnosing diabetes. The main objective of this proposal is to assemble a data-mining-based Diabetes Disease Prediction System that provides a detailed analysis of diabetes using a database of diabetic patients. The proposed work comprises two stages, feature selection and prediction, which are introduced to improve the results of diabetes disease prediction. First, Correlation Feature Selection (CFS) is used to identify the salient features in the diabetes repository. The identified features are then fed into a Probabilistic Neural Network (PNN) classifier. The PNN classifies the diabetic status of the patient, and its accuracy can be improved by using only the identified features. According to the category of data, the diabetes information is gathered from the machine learning repository. The results are compared with those of existing algorithms, namely the Back Propagation Neural Network (BPNN) and the Multilayer Perceptron Neural Network (MLPNN).
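The two-stage pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Hall's merit heuristic with absolute Pearson correlation (the thesis also covers other correlation measures), a greedy forward search, and a basic Gaussian-kernel PNN with a hypothetical smoothing parameter `sigma`. The function names and the synthetic data in `demo` are illustrative only and do not reproduce the diabetes dataset or any reported result.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS merit k*r_cf / sqrt(k + k(k-1)*r_ff), using |Pearson r| (an assumption)."""
    k = len(subset)
    # Mean feature-class correlation over the candidate subset.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean feature-feature correlation over all pairs in the subset.
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(X, y):
    """Greedy forward search: add the feature that raises merit most, stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        merit, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected

def pnn_predict(X_train, y_train, X_test, sigma=0.5):
    """PNN: Gaussian pattern layer, per-class averaging layer, argmax decision layer."""
    classes = np.unique(y_train)
    # Squared Euclidean distance between every test and training point.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2 * sigma ** 2))
    # Average kernel response per class (Parzen-window density estimate).
    scores = np.stack([K[:, y_train == c].mean(axis=1) for c in classes], axis=1)
    return classes[scores.argmax(axis=1)]

def demo(seed=0):
    """Synthetic illustration: one informative feature and two pure-noise features."""
    rng = np.random.default_rng(seed)
    n = 200
    y = rng.integers(0, 2, n)
    X = np.column_stack([y + 0.3 * rng.normal(size=n),  # informative
                         rng.normal(size=n),            # noise
                         rng.normal(size=n)])           # noise
    selected = cfs_forward_select(X[:150], y[:150])
    pred = pnn_predict(X[:150][:, selected], y[:150], X[150:][:, selected])
    return selected, float((pred == y[150:]).mean())

if __name__ == "__main__":
    print(demo())
```

In this sketch the feature subset is chosen on the training split only, so the held-out accuracy is not influenced by the selection step. The value of `sigma` would normally be tuned, for example by cross-validation.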




