Classification Rule Generation for Cancer Prediction using Locality Sensitive Hashing Similarity Measure

 
 
 
  • Abstract


    This paper develops a healthcare decision support system that predicts the stage of cancer (benign or malignant) using a novel classifier based on Locality Sensitive Hashing (LSH). We propose a new classification rule generation scheme built on LSH. Applying LSH-based instance selection algorithms yields a minimal set of class-representative patterns, to which we apply discretization and manual classification rule generation. This gives a high chance of arriving at the best prediction. A confusion matrix is used to compare test results. The technique is applied to two datasets, Iris and Breast Cancer Wisconsin, and achieves better accuracy, specificity, sensitivity and precision than traditional classifiers. Manual diagnosis is time-consuming, proceeds by trial and error, and requires knowledge from medical specialists; the proposed system improves on the accuracy and speed of this manual procedure. A classification model concept is used.
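
    The pipeline summarised above can be illustrated with a minimal sketch. The abstract does not state the hash family, the selection rule, or any parameter values, so everything here is an assumption: the sketch uses Gaussian random projections (a common LSH choice for Euclidean distance), keeps the first instance of each class that lands in each hash bucket, and then computes the four reported measures from a binary confusion matrix. The names make_hash, select_instances, binary_metrics, n_hashes and width are hypothetical and are not taken from the paper.

```python
import math
import random


def make_hash(dim, width, rng):
    """Assumed hash: h(x) = floor((a.x + b) / width), with a drawn from N(0, 1)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(x):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((projection + b) / width)

    return h


def select_instances(X, y, n_hashes=4, width=1.0, seed=0):
    """Return indices of the first instance per (bucket, class) pair.

    Nearby points tend to share a bucket, so each bucket contributes
    at most one representative per class (an assumed selection rule).
    """
    rng = random.Random(seed)
    hashes = [make_hash(len(X[0]), width, rng) for _ in range(n_hashes)]
    seen = set()
    kept = []
    for i, (x, label) in enumerate(zip(X, y)):
        key = (tuple(h(x) for h in hashes), label)
        if key not in seen:
            seen.add(key)
            kept.append(i)
    return kept


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, specificity and precision from a 2x2 confusion matrix."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Report 0.0 when a measure is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }
```

    For a benign/malignant task, binary_metrics would be called with the malignant label as positive. The discretization and manual rule generation steps are not sketched because the abstract gives no detail on them.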


  • Keywords


    CBR (Case Based Reasoning); Discretization; Euclidean Distance Metric; Gaussian distribution; LSH (Locality Sensitive Hashing).



 


Article ID: 14075
 
DOI: 10.14419/ijet.v7i4.14075




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.