Improved Ant Colony on Feature Selection and Weighted Ensemble to Neural Network Based Multimodal Disease Risk Prediction (WENN-MDRP) Classifier for Disease Prediction Over Big Data


  • Gakwaya Nkundimana Joel
  • S Manju Priya





Disease prediction, machine learning, ensemble, neural network, prediction accuracy.


As big data grows in the biomedical and healthcare communities, precise analysis of medical data aids early disease identification, patient care, and community services. However, the accuracy of the analysis decreases when the quality of the medical data is poor. As a result, selecting features from the dataset becomes an extremely significant task. Feature selection has proven effective in numerous applications by building simpler and more comprehensible models, improving learning performance, and preparing clean, understandable data. The proposed method addresses the difficulties of feature selection for big data analytics. An Improved Ant Colony Optimization based Feature Selection (IACO) algorithm is presented to resolve this issue. Missing values in the incomplete data are first reconstructed with the help of a latent factor model. Even after reconstruction, choosing the best features from structured and unstructured data is not easy. A novel technique, the Weighted Ensemble Based Neural Network for Multimodal Disease Risk Prediction (WENN-MDRP) algorithm, is therefore implemented to provide the best feature selection across structured as well as unstructured data. The proposed method provides improved prediction accuracy compared with conventional techniques. The presented classifiers are implemented in the MATLAB environment. The outcomes are evaluated in terms of recall, precision, accuracy, F-measure, and error rate.
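The five evaluation measures named above can be illustrated with a minimal sketch. It assumes a binary risk label (1 = high risk, 0 = low risk) and the standard confusion-matrix definitions; the function name `binary_metrics` and the sample labels are hypothetical, and Python is used here for illustration only, whereas the classifiers themselves are implemented in MATLAB.

```python
def binary_metrics(y_true, y_pred):
    """Compute standard confusion-matrix metrics for binary labels (0 or 1).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "error_rate": 1.0 - accuracy,  # fraction of misclassified samples
    }


# Made-up labels, used only to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Error rate is taken here as the complement of accuracy, which is the usual convention; a multi-class disease setting would need per-class or averaged variants of precision, recall, and F-measure.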




