Performance evaluation of massive data standardization using multicore CPU and GPU

  • Authors

    • Saad Ahmed Dheyab, Informatics Institute for Postgraduate Studies, ICCI, Iraq
    • Dr. Buthainah Fahran Abed
    • Dr. Mohammed Najm Abdullah
    2018-10-13
    https://doi.org/10.14419/ijet.v7i4.18058
  • Standardization, GPU, Massive Data, Preprocessing.
  • Standardization is one of the most important methods in the preprocessing phase of machine learning. It improves the quality of the results in terms of accuracy. Researchers have focused on developing these preprocessing methods to suit the diversity of data generated from different sources. In this paper, three standardization methods (z-score, min-max, log2) were applied to a massive dataset using three different processing approaches (single-core CPU, multicore CPU with OpenMP, and GPU), and their performance was evaluated. The results showed that the GPU approach was faster than the conventional CPU approaches.
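
The three transforms named in the abstract can be sketched as follows. This is a minimal single-threaded illustration, not the authors' implementation: the function names are hypothetical, the z-score is assumed to use the population standard deviation, and the log2 transform is assumed to be applied to strictly positive values, since the abstract does not give the exact definitions.

```python
import math


def z_score(values):
    """Rescale to zero mean and unit variance (population std assumed)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # All values are identical; map them to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input; map everything to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log2_transform(values):
    """Apply log base 2 element-wise (assumes every value is > 0)."""
    return [math.log2(v) for v in values]


sample = [1.0, 2.0, 4.0, 8.0]
z_values = z_score(sample)
mm_values = min_max(sample)
log_values = log2_transform(sample)
```

Once the summary statistics (mean, standard deviation, minimum, maximum) are known, each element is transformed independently of the others, which is what makes this kind of workload a natural fit for spreading across CPU cores or GPU threads.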

     

  • How to Cite

    Ahmed Dheyab, S., Buthainah Fahran Abed, D., & Mohammed Najm Abdullah, D. (2018). Performance evaluation of massive data standardization using multicore CPU and GPU. International Journal of Engineering & Technology, 7(4), 4702-4705. https://doi.org/10.14419/ijet.v7i4.18058