Study and analysis of feature selection problems and impact of bias in machine learning disease prediction models

  Authors

    • Anil Kumar Prajapati, Institute of Computer Science, Vikram University, Ujjain (MP), India
    • Umesh Kumar Singh, Institute of Computer Science, Vikram University, Ujjain (MP)
    • Rekha Singh, School of Computer Science & Information Technology, DAVV, Indore (MP)
    • Arpita Shukla, JNS Govt. PG College, Shujalpur

  Keywords

    Classification; Health Care; Fuzzy Logic; Machine Learning.

  Abstract

    Machine learning (ML), a branch of artificial intelligence, is now applied in nearly every field, including medicine. In medical science, ML techniques aim to improve patient care by collecting and analyzing patient data and by designing intelligent tools and devices that detect disease from accumulated experience. ML detects patterns associated with specific diseases by analyzing large datasets of patient records, such as diabetes, blood pressure, and cholesterol readings, X-rays, MRI and CT scans, other imaging data, and genomic information. ML algorithms compute over the primary symptoms of a disease, and the disease is identified from these calculations. This requires a sufficiently large dataset and a sufficient set of features. What an ML model learns depends on the underlying features used to identify the problem at hand. The fairness of an ML algorithm depends on which symptoms are selected to determine a disease. Feature selection is therefore an important task: too few features can make the model underfit, and too many can make it overfit. Selecting the wrong features can introduce bias into the model, which can greatly affect its accuracy. If the bias of the model is not properly tuned, that is, if it is set too high or too low, the predictions fail to capture the underlying pattern. Diseases arise in different circumstances, and each disease has its own characteristics, so covering all the basic parameters of every disease is very difficult. If a basic attribute is missed, or an attribute unrelated to the disease is included, the results of the model may suffer. This paper analyzes the feature selection problem and the effect of bias using the Support Vector Machine (SVM) and Logistic Regression (LR) algorithms.
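
    The effect of feature selection described above can be illustrated with a minimal sketch. It is not the authors' experimental setup: the data are synthetic, the logistic regression is a small from-scratch implementation (no SVM is included), and all names (make_dataset, train_logistic, compare_feature_sets) and parameter values are assumptions chosen for illustration. The sketch trains the same model on three feature sets: one informative feature (too few), both informative features, and all features including many irrelevant ones (too many).

```python
import math
import random


def make_dataset(n_samples, n_noise, rng):
    """Synthetic 'patients': 2 informative features plus n_noise irrelevant ones."""
    rows, labels = [], []
    for _ in range(n_samples):
        y = rng.randint(0, 1)
        shift = 1.5 if y == 1 else -1.5
        informative = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append(informative + noise)
        labels.append(y)
    return rows, labels


def sigmoid(z):
    # Clamp z to avoid overflow in math.exp for extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, columns, epochs=200, lr=0.1):
    """Batch gradient descent on the log-loss, using only the given columns."""
    weights = [0.0] * len(columns)
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(columns)
        grad_b = 0.0
        for row, y in zip(rows, labels):
            z = bias + sum(w * row[c] for w, c in zip(weights, columns))
            err = sigmoid(z) - y
            for j, c in enumerate(columns):
                grad_w[j] += err * row[c]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(rows, labels, columns, weights, bias):
    """Fraction of rows whose predicted class (z > 0 means class 1) is correct."""
    correct = 0
    for row, y in zip(rows, labels):
        z = bias + sum(w * row[c] for w, c in zip(weights, columns))
        correct += int((z > 0.0) == (y == 1))
    return correct / len(rows)


def compare_feature_sets(seed=0, n_train=30, n_test=500, n_noise=40):
    """Train on three hypothetical feature selections and score each one."""
    rng = random.Random(seed)
    train_x, train_y = make_dataset(n_train, n_noise, rng)
    test_x, test_y = make_dataset(n_test, n_noise, rng)
    feature_sets = {
        "one_informative": [0],
        "both_informative": [0, 1],
        "all_features": list(range(2 + n_noise)),
    }
    results = {}
    for name, cols in feature_sets.items():
        w, b = train_logistic(train_x, train_y, cols)
        results[name] = {
            "train": accuracy(train_x, train_y, cols, w, b),
            "test": accuracy(test_x, test_y, cols, w, b),
        }
    return results


if __name__ == "__main__":
    for name, scores in compare_feature_sets().items():
        print(f"{name:>17}: train={scores['train']:.2f} test={scores['test']:.2f}")
```

    Comparing the train and test scores of each feature set gives a rough picture of the trade-off: a gap between training and test accuracy on the full feature set would point toward overfitting, while lower scores on the single-feature set would point toward underfitting. The exact numbers depend on the seed and the assumed sample sizes.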

  How to Cite

    Anil Kumar Prajapati, Umesh Kumar Singh, Rekha Singh, & Arpita Shukla. (2024). Study and analysis of feature selection problems and impact of bias in machine learning disease prediction models. International Journal of Engineering & Technology, 13(2), 182-188.