Implementation of Naive Bayes Classifier and Log Probabilistic for Book Classification Based on the Title

 
 
 
  • Abstract


    Books are an important medium for teaching in higher education. They are made available through a library or reading room, which enables students and teachers to find references for their teaching and learning activities. To make searching easier, each book is classified by category. In our institution, the Information Technology Major of State Polytechnic of Malang, the categories are specific to computer science topics. Every book entry needs to be classified accordingly, and to do so one must understand the major keywords of the book title. The problem is that not all librarians have such knowledge, and manually classifying hundreds or even thousands of books is exhausting work. This research focuses on automatic book classification based on the title using the Naive Bayes Classifier and Log Probabilistic. Log Probabilistic is used because the product of many probabilities can become too small to be represented in a floating-point variable. The algorithm is then implemented in a web application using PHP and a MySQL database. Evaluation using the Holdout method with 240 training data and 80 testing data resulted in an accuracy of 75%. We also tested the accuracy using K-fold Cross Validation, which resulted in an accuracy of 66.25%.
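
    The underflow problem and the log-based remedy described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' PHP implementation. The function names, the whitespace tokenizer, the add-one (Laplace) smoothing, and the four sample titles are all assumptions made for illustration; the source does not specify them.

```python
import math
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase a title and split on whitespace (a simplifying assumption)."""
    return title.lower().split()


def train(labeled_titles):
    """Count titles per category and word occurrences per category."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for title, category in labeled_titles:
        class_counts[category] += 1
        for word in tokenize(title):
            word_counts[category][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def log_scores(title, model):
    """Return log P(category) + sum of log P(word | category) per category."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for category, doc_count in class_counts.items():
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(title):
            # Add-one (Laplace) smoothing: an assumption, not confirmed by the source.
            numerator = word_counts[category][word] + 1
            denominator = total_words + len(vocabulary)
            # Adding logarithms replaces multiplying probabilities.
            score += math.log(numerator / denominator)
        scores[category] = score
    return scores


def classify(title, model):
    """Pick the category with the highest log score."""
    scores = log_scores(title, model)
    return max(scores, key=scores.get)


# Why logs are needed: a long product of small probabilities underflows
# to zero, while the equivalent sum of logarithms stays finite.
tiny_product = 1.0
log_sum = 0.0
for _ in range(400):
    tiny_product *= 1e-3
    log_sum += math.log(1e-3)

# Hypothetical training titles and categories, for illustration only.
training = [
    ("Introduction to Database Systems", "database"),
    ("SQL Database Design", "database"),
    ("Computer Networks and Protocols", "networking"),
    ("Wireless Networks Security", "networking"),
]
model = train(training)
predicted = classify("Database Design Basics", model)
print(predicted, tiny_product, log_sum)
```

    Because the logarithm is monotonic, the category with the largest log score is also the category with the largest probability, so comparing sums of logs gives the same decision as comparing the raw products would if they could be represented.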

     

     


     

  • Keywords


    Classification, Book, Naïve Bayes, Log Probabilistic, Machine Learning



 


Article ID: 26967
 
DOI: 10.14419/ijet.v7i4.44.26967




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.