Securitizing big data characteristics used tall array and mapreduce

  • Authors

    • Wael Jum’ah Al_Zyadat Isra University
    • Faisal Y. Alzyoud Isra University
    • Aysh M. Alhroob Isra University
    • Venus Samawi Isra University
    2019-04-07
    https://doi.org/10.14419/ijet.v7i4.24404
  • Keywords: Big Data, MapReduce, Tall Array, Veracity, Volume.
  • Abstract

    Volume, velocity, variety, veracity, and value are the main characteristics of big data, and researchers take them into account in the classification process. This study focuses on two of these characteristics, volume and veracity, as major attributes, because the scale of the data and its accuracy proved to be issues under varying boundaries. The scenarios discussed use two methods suited to out-of-memory data: tall array and MapReduce. Tall array subdivides a data set into small chunks that individually fit in memory, while MapReduce uses parallelization and distribution through its mapper and reduce functions, respectively. The theoretical model and the experimental simulation show that the tall array method is more efficient than MapReduce according to F-Measure and arithmetic mean calculations: with the tall array method, veracity improves by 0.09 (F-Measure) and 0.15 (arithmetic mean), while volume improves by 0.06 and 0.13, respectively.
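
The two processing styles and the F-Measure named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`chunks`, `chunked_mean`, `mapper`, `reducer`, `mapreduce_mean`, `f_measure`), the chunk size, and the sample values are all hypothetical, and a plain in-memory list stands in for data that would not fit in memory. The F-Measure is assumed here to be the usual harmonic mean of precision and recall.

```python
from functools import reduce
from typing import Iterator, List, Tuple


def chunks(values: List[float], size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of at most `size` items (stand-in for in-memory chunks)."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def chunked_mean(values: List[float], size: int) -> float:
    """Chunk-at-a-time evaluation: only one chunk is inspected at any moment."""
    total, count = 0.0, 0
    for chunk in chunks(values, size):
        total += sum(chunk)
        count += len(chunk)
    return total / count


def mapper(chunk: List[float]) -> Tuple[float, int]:
    """Map step: emit a partial (sum, count) pair for one chunk."""
    return sum(chunk), len(chunk)


def reducer(a: Tuple[float, int], b: Tuple[float, int]) -> Tuple[float, int]:
    """Reduce step: merge two partial (sum, count) pairs."""
    return a[0] + b[0], a[1] + b[1]


def mapreduce_mean(values: List[float], size: int) -> float:
    """Map every chunk independently, then fold the partial results together."""
    total, count = reduce(reducer, map(mapper, chunks(values, size)))
    return total / count


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    data = [float(i) for i in range(1, 11)]  # hypothetical sample values
    print(chunked_mean(data, 3))
    print(mapreduce_mean(data, 3))
    print(f_measure(tp=8, fp=2, fn=2))
```

Both mean functions return the same value for the same input; they differ only in how the work is organized. In this sketch the map step could be handed to separate workers because each chunk is processed independently, which is the property the abstract attributes to MapReduce.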

     

  • How to Cite

    Al_Zyadat, W. J., Alzyoud, F. Y., Alhroob, A. M., & Samawi, V. (2019). Securitizing big data characteristics used tall array and mapreduce. International Journal of Engineering & Technology, 7(4), 5633-5639. https://doi.org/10.14419/ijet.v7i4.24404

    Received date: 2018-12-19

    Accepted date: 2019-01-04

    Published date: 2019-04-07